Sorting Algorithms: Selection, Bubble, and Tree Sort
Introduction to Sorting Algorithms
Conceptual Overview:
- Sorting algorithms are fundamental to computer science and a core skill expected of software developers.
- The primary goal is to take a set of unsorted data and organize it into a sorted order.
- Studying different sorting algorithms provides excellent practice for general problem-solving and algorithmic thinking.
- Rather than sticking with a naive or slow first solution, developers should dig deeper to find the most efficient approach for a given problem.
Factors Influencing Algorithm Choice:
- Input State: Whether the data is already mostly sorted, completely unsorted, or reverse-sorted impacts performance.
- Memory Restrictions: Some environments have limited memory (e.g., embedded systems) where sorting must be done "in-place."
- Output Requirements: Sometimes only a general sort is required, or specific elements are more important than others.
Big O Notation in Sorting:
- Big O measures how long an algorithm takes to run relative to the number of elements, denoted as .
- Common complexities include , , and sometimes .
- An complexity often occurs because the algorithm must compare every element against every other element.
Selection Sort
Definition: A naive, simple algorithm that sorts an array or list of elements by repeatedly finding the smallest remaining element.
Mechanism:
- The algorithm scans the entire list to find the smallest element.
- This element is placed in the first position (index 0).
- The algorithm then scans the remaining unsorted list to find the next smallest element.
- This element is placed in the second position (index 1).
- This process repeats until the list is fully sorted, with the range of the scan getting smaller each time.
Implementation Logic:
- Utilizes nested loops.
- The outer loop moves the boundary between the sorted and unsorted portions of the array.
- The inner loop searches for the smallest value in the unsorted portion.
- Smallest Index Tracker: Instead of storing the smallest value directly, it is often better to store the
smallest_indexand swap values after the inner loop finishes. - Swap Logic: A temporary variable is required to swap the current element with the found smallest element to prevent overwriting data.
Performance Metrics:
- Best Case:
- Average Case:
- Worst Case:
Use Cases and Advantages:
- Very easy to implement for quick solutions.
- Appropriate for small arrays (e.g., 10 elements) where the difference between and is negligible.
- In-place Sorting: It does not require extra memory or the allocation of a second array.
Bubble Sort
Definition: An algorithm that sorts a list by repeatedly stepping through the list, comparing adjacent elements and swapping them if they are in the wrong order.
Mechanism:
- Start at the first index and compare the current element with the one directly next to it.
- If the left element is greater than the right element (), swap them.
- Continue this until the end of the list; the largest element will "bubble up" to the final position.
- Repeat the process for the remaining unsorted portion of the list.
Implementation Variations:
- While Loop with Boolean Flag: A
sortedboolean can be used as a stopping condition. If a full pass is completed without any swaps, the list is sorted, and the algorithm can stop early. - Nested For Loops: Alternatively, two for loops can be used. The outer loop tracks the number of passes, and the inner loop handles comparisons.
- Optimization: Each pass places the next largest element in its correct final position, so the inner loop range can decrease by one () with each pass.
- While Loop with Boolean Flag: A
Performance Metrics:
- Best Case: (This occurrs when the list is already sorted and the algorithm uses a stopping condition).
- Average Case:
- Worst Case: (Occurs when the list is in reverse order).
Advantages:
- Effective for data sets that are "mostly sorted" or when only one or two elements are out of place.
- In-place sorting (no extra memory required).
- Highly readable and simple logic.
Tree Sort
Definition: A sorting algorithm that utilizes a Binary Search Tree (BST) structure to organize data, followed by an in-order traversal to retrieve elements in sorted order.
Mechanism:
- Create an empty Binary Search Tree.
- Iterate through the original array and insert every element into the BST.
- Perform an In-Order Traversal of the BST (Left Subtree → Current Node → Right Subtree).
- As nodes are visited during the traversal, place them back into the array or print them.
Performance Metrics:
- Average and Best Case: (Due to the logarithmic efficiency of BST insertions).
- Worst Case: (Occurs if data is entered in sorted order, creating a "linked list" style tree).
- Space Complexity: Worst case is extra space because a complete copy of the data is stored in the tree structure.
Optimizations:
- Using height-balanced trees like AVL Trees or Red-Black Trees can guarantee the complexity even in the worst case by using rotations to prevent the tree from becoming a linked list.
Considerations:
- Loading data from an array into a BST just to pull it back out into an array can be redundant if the data can simply remain in the BST for fast retrieval and manipulation.
Questions & Discussion
- Question: How many people have seen Selection Sort before?
- Answer: Most of the class raised their hands.
- Question: Why use Selection Sort if it is slow?
- Answer: It is useful for small data sets where speed is not critical, for making quick solutions without needing built-in libraries, or if memory is highly restricted since it is an in-place sort.
- Interaction (Selection Sort Implementation):
- The speaker encountered a Java error:
Cannot find smallest index. This was because the variable was initially namedsmallestbut referenced assmallest indexlater.
- The speaker encountered a Java error:
- Interaction (Bubble Sort Logic):
- The speaker initially placed an incrementer
i++inside the wrong loop. A student pointed out that incrementingiinside the for-loop would cause it to reach the length of the array too quickly, breaking the logic of the while-loop. - The speaker encountered a
java.lang.ArrayIndexOutOfBoundsException: Index 14 out of bounds for length 14. This was an "off-by-one" error caused by comparingjtoj+1at the very end of the array.
- The speaker initially placed an incrementer
- Question: Would the Big O notation for Tree Sort be different if it was an AVL tree?
- Answer: Yes, it would be faster in the worst case (guaranteed ) because rotations keep the tree balanced, preventing the "linked list" scenario.
- Algorithmic "Trick": To avoid the worst-case scenario for some algorithms, one can shuffle the data completely at random before sorting. This converts a potential worst-case scenario into an average-case scenario.
News & Administrative Updates
Schedule:
- Wednesday/Friday: Lectures on Quick Sort, Merge Sort, and Heap Sort.
- Next Monday: Public holiday (No class).
- Next Wednesday: Test preparation session.
- Next Friday: Test 2.
Test 2 Format Changes:
- The True/False section from last year is being removed because it was awkward and felt like a "coin flip" for students.
- The test will focus less on memorizing random small facts and more on applying algorithms.
- New Content: Expect diagram-based questions, such as traversing a graph or inserting elements into a heap (handling the "up" or "down" movement of elements).