Data Collections Algorithms in Java: Searching, Sorting, and Recursion
Searching
When you search a data collection, you’re trying to answer a simple question: “Where is the item I want (if it exists)?” In AP Computer Science A, the “collection” is typically an array (like int[]) or an ArrayList (like ArrayList<String>). Searching matters because so many real programs depend on it—finding a student record by ID, checking whether a username is taken, locating the next song in a playlist, or determining whether a value appears in sensor readings.
A key idea to internalize early: how you search depends on what you know about the data. If the data is unsorted, you generally can’t do better (conceptually) than checking items until you find what you need. If the data is sorted, you can leverage that structure to skip large portions.
Linear Search (Sequential Search)
Linear search checks elements one-by-one from the start of the collection until it finds the target or runs out of elements. It’s the “scan the whole list” strategy.
Why it matters: Linear search is simple, works on any list (sorted or not), and is easy to implement correctly. In AP CSA, you’ll often use it in situations where you don’t control the input order or where sorting first would be unnecessary overhead.
How it works (step-by-step):
- Start at index 0.
- Compare the current element to the target.
- If it matches, return the index (or true, depending on the method’s purpose).
- Otherwise, move to the next index.
- If you reach the end, the target isn’t present.
Linear search in action (array example):
public class SearchDemo {
/** @return index of target in arr, or -1 if not found */
public static int linearSearch(int[] arr, int target) {
for (int i = 0; i < arr.length; i++) {
if (arr[i] == target) {
return i;
}
}
return -1;
}
}
If arr = {4, 9, 2, 9} and target = 9, this returns 1 (the first match). That “first match” detail is important: a linear search typically finds the earliest occurrence unless you intentionally keep searching.
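If you do intentionally keep searching, you get a different method. Here is a sketch of a last-occurrence variant (the name lastIndexOf is illustrative, not from the original):

```java
public class LastMatchDemo {
    /** @return index of the LAST occurrence of target in arr, or -1 if not found */
    public static int lastIndexOf(int[] arr, int target) {
        int found = -1;
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == target) {
                found = i; // remember this match, but keep scanning
            }
        }
        return found;
    }
}
```

With arr = {4, 9, 2, 9} and target = 9, this returns 3 instead of 1, because the loop never stops early.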
ArrayList variant (objects and .equals):
When searching for objects like String, you usually want .equals, not ==.
import java.util.ArrayList;
public class SearchDemo2 {
public static int linearSearch(ArrayList<String> list, String target) {
for (int i = 0; i < list.size(); i++) {
if (list.get(i).equals(target)) {
return i;
}
}
return -1;
}
}
What goes wrong (common pitfalls):
- Using == instead of .equals for objects. == checks whether two references point to the same object, not whether two objects have the same content.
- Off-by-one errors in loop bounds (i <= arr.length will crash).
- Forgetting to return a “not found” value (-1 is a common convention for index-returning methods).
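When the caller only needs to know whether the target exists, a boolean-returning variant sidesteps the -1 convention entirely. A minimal sketch (the method name contains is an illustrative choice):

```java
import java.util.ArrayList;

public class ContainsDemo {
    /** @return true if target appears anywhere in list */
    public static boolean contains(ArrayList<String> list, String target) {
        for (String s : list) {
            if (s.equals(target)) { // .equals, never ==, for object content
                return true;
            }
        }
        return false; // reached the end without a match
    }
}
```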
Binary Search (Sorted Data Only)
Binary search is a searching strategy that only works when the data is sorted (in ascending order, typically). Instead of checking items one-by-one, it repeatedly checks the middle element and eliminates half the remaining search space each step.
Why it matters: If you have a sorted list, binary search is dramatically faster in principle than linear search on large inputs. More importantly for AP CSA, it tests whether you understand how sorted order enables algorithmic shortcuts, and it’s a classic code-tracing and reasoning topic.
How it works (step-by-step):
- Keep track of a low index (low) and a high index (high) bounding the portion you’re searching.
- Compute the middle index (mid).
- Compare the middle element to the target:
  - If equal, you found it.
  - If the middle is too small, discard the left half by moving low up.
  - If the middle is too large, discard the right half by moving high down.
- Repeat until low > high (meaning the target isn’t present).
Binary search in action (iterative array version):
public class BinarySearchDemo {
/** Precondition: arr is sorted in ascending order.
* @return index of target or -1 if not found
*/
public static int binarySearch(int[] arr, int target) {
int low = 0;
int high = arr.length - 1;
while (low <= high) {
int mid = (low + high) / 2;
if (arr[mid] == target) {
return mid;
} else if (arr[mid] < target) {
low = mid + 1;
} else {
high = mid - 1;
}
}
return -1;
}
}
Important nuance: Binary search may find an occurrence, not necessarily the first, when duplicates exist.
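If you specifically need the first occurrence, one common adaptation (a sketch, not required AP material) records each match and then keeps searching the left half:

```java
public class FirstOccurrenceDemo {
    /** Precondition: arr is sorted in ascending order.
     *  @return index of the FIRST occurrence of target, or -1 if not found
     */
    public static int firstIndexOf(int[] arr, int target) {
        int low = 0;
        int high = arr.length - 1;
        int result = -1;
        while (low <= high) {
            int mid = (low + high) / 2;
            if (arr[mid] == target) {
                result = mid;    // record the match...
                high = mid - 1;  // ...but keep looking to the left
            } else if (arr[mid] < target) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return result;
    }
}
```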
What goes wrong (common pitfalls):
- Forgetting the sorted-data requirement. Binary search on unsorted data can return wrong answers.
- Updating bounds incorrectly (for example, writing low = mid instead of low = mid + 1 can cause infinite loops).
- Mixing up < and > comparisons, or reversing the logic when the data is descending.
Exam Focus
- Typical question patterns:
- Trace a linear or binary search loop and report the returned value (index or boolean).
- Identify whether binary search is valid given how the array/list is described.
- Write or complete a method that searches for a target and returns an index or presence.
- Common mistakes:
- Using == for String comparison instead of .equals.
- Off-by-one errors in while (low <= high) and in updating low/high.
- Assuming binary search returns the first occurrence among duplicates.
Sorting
Sorting means rearranging elements into a particular order—usually ascending (smallest to largest) or alphabetical. Sorting is not just about neatness; it enables faster searching (binary search requires sorted data), easier duplicate detection, and more meaningful reporting (rankings, leaderboards, ordered logs).
In AP CSA, sorting is commonly taught through simple, comparison-based algorithms that you can implement with loops and array indexing. The emphasis is usually on understanding how the algorithm manipulates the array and being able to trace it accurately.
Comparing and Swapping (the basic move)
Most introductory sorts rely on the same primitive operations:
- Compare two elements to decide which should come first.
- Swap elements to move them toward their correct positions.
A standard swap pattern for an int[]:
public static void swap(int[] arr, int i, int j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
Mistakes here tend to be catastrophic (overwriting a value without saving it first), so it’s worth making this pattern automatic.
Selection Sort
Selection sort repeatedly selects the smallest remaining element and places it into its final position at the front of the array.
Why it matters: Selection sort is straightforward to reason about: after pass i, you know index i is correct. That makes it a common algorithm for code-tracing questions.
How it works (step-by-step):
- For each position i from 0 to n - 2:
  - Find the index of the smallest element in the range i through n - 1.
  - Swap that smallest element into position i.
Selection sort in action:
public class SelectionSortDemo {
public static void selectionSort(int[] arr) {
for (int i = 0; i < arr.length - 1; i++) {
int minIndex = i;
for (int j = i + 1; j < arr.length; j++) {
if (arr[j] < arr[minIndex]) {
minIndex = j;
}
}
int temp = arr[i];
arr[i] = arr[minIndex];
arr[minIndex] = temp;
}
}
}
If arr = {5, 2, 8, 1}, then:
- After i = 0: smallest is 1, swap into index 0 ⇒ {1, 2, 8, 5}
- After i = 1: smallest in the remaining range is 2 ⇒ unchanged
- After i = 2: smallest in the remaining range is 5 ⇒ {1, 2, 5, 8}
What goes wrong:
- Swapping every time you find a smaller element (that becomes a different algorithm and is easy to mess up). In selection sort, you typically swap once per outer loop.
- Letting j start at i instead of i + 1 (it works, but causes redundant comparisons and can confuse your tracing).
Insertion Sort
Insertion sort builds a sorted prefix of the array by repeatedly taking the next element and inserting it into the correct place among the already-sorted elements.
Why it matters: Insertion sort resembles how people sort playing cards in their hands. It’s also a great way to practice reasoning about nested loops where the inner loop moves backward.
How it works (step-by-step):
- Treat index 0 as a sorted “prefix” of length 1.
- For each next index i (from 1 to the end):
  - Store the element as key.
  - Shift larger elements in the prefix one position to the right.
  - Insert key into the hole that remains.
Insertion sort in action:
public class InsertionSortDemo {
public static void insertionSort(int[] arr) {
for (int i = 1; i < arr.length; i++) {
int key = arr[i];
int j = i - 1;
while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key;
}
}
}
For arr = {5, 2, 8, 1}:
- Insert 2 into {5} by shifting 5 right ⇒ {2, 5, 8, 1}
- Insert 8 into {2, 5} ⇒ stays {2, 5, 8, 1}
- Insert 1 into {2, 5, 8} by shifting all three right ⇒ {1, 2, 5, 8}
What goes wrong:
- Forgetting to write arr[j + 1] = key after the shifting loop.
- Using arr[j] >= key instead of arr[j] > key changes how ties behave (it can still sort correctly, but affects whether equal elements keep their original relative order).
- Letting j fall to -1 and then writing to arr[j] instead of arr[j + 1].
Sorting Objects: compareTo and Comparator
In AP CSA, you’ll often sort primitives in examples, but real collections store objects. For objects, “less than” isn’t built-in—you define ordering.
- Many Java classes (like String) implement Comparable, meaning they have a .compareTo method.
- .compareTo returns:
  - a negative number if this is “less than” the other,
  - 0 if the two are equal in ordering,
  - a positive number if this is “greater than” the other.
Example comparison:
String a = "apple";
String b = "banana";
System.out.println(a.compareTo(b)); // negative, because "apple" comes before "banana"
If you implement sorting on objects yourself, you’ll typically use compareTo instead of < and >.
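For example, here is insertion sort adapted to an ArrayList&lt;String&gt;, with compareTo standing in for > (a sketch; the class and method names are illustrative):

```java
import java.util.ArrayList;

public class StringSortDemo {
    /** Sorts list into ascending (alphabetical) order using insertion sort. */
    public static void insertionSort(ArrayList<String> list) {
        for (int i = 1; i < list.size(); i++) {
            String key = list.get(i);
            int j = i - 1;
            // Shift elements that come after key alphabetically one slot right
            while (j >= 0 && list.get(j).compareTo(key) > 0) {
                list.set(j + 1, list.get(j));
                j--;
            }
            list.set(j + 1, key); // insert key into the hole that remains
        }
    }
}
```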
Exam Focus
- Typical question patterns:
- Trace one or more passes of selection sort or insertion sort and show the resulting array.
- Identify what part of the array is guaranteed sorted after a loop iteration.
- Modify a sorting method (often by changing comparison direction) to sort descending.
- Common mistakes:
- Confusing which index represents the “current minimum” (selection sort) or the “key” (insertion sort).
- Swapping or shifting in the wrong place, especially inside nested loops.
- Using == to compare objects when sorting/searching instead of .equals/.compareTo.
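The “sort descending” modification mentioned above usually comes down to flipping one comparison. A sketch using selection sort (tracking the largest remaining element instead of the smallest):

```java
public class DescendingSortDemo {
    /** Sorts arr into descending (largest-to-smallest) order. */
    public static void selectionSortDescending(int[] arr) {
        for (int i = 0; i < arr.length - 1; i++) {
            int maxIndex = i;
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[j] > arr[maxIndex]) { // > instead of < flips the order
                    maxIndex = j;
                }
            }
            int temp = arr[i];
            arr[i] = arr[maxIndex];
            arr[maxIndex] = temp;
        }
    }
}
```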
Recursion
Recursion is a technique where a method solves a problem by calling itself on a smaller or simpler version of the same problem. Even though recursion can feel mysterious at first, it’s really about one powerful idea: “If I can solve a smaller case, I can build up to the full solution.”
Why it matters: Recursion is a natural fit for problems that have a self-similar structure—like processing nested data, exploring branching possibilities, or repeatedly subdividing a range (which connects directly to binary search and merge sort). It also strengthens your ability to trace program state carefully, which is a core AP CSA skill.
The Two Non-Negotiables: Base Case and Recursive Case
Every correct recursive method needs:
- A base case: a condition where the method stops calling itself.
- A recursive case: the part where the method calls itself with a smaller input, moving toward the base case.
If you don’t make progress toward the base case, you’ll end up with infinite recursion and likely a StackOverflowError.
Thinking About the Call Stack
Each method call gets its own “frame” on the call stack, storing parameters and local variables. With recursion, you build up many frames before returning back down.
A common misconception is that a recursive call “jumps back to the top and forgets everything.” It doesn’t. Each call remembers its own local state, and returns a value to its caller when it finishes.
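You can see this frame-by-frame memory directly. In this hypothetical example, each call logs a line before and after its recursive call; the “after” lines come out in reverse order because each frame remembered its own n:

```java
import java.util.ArrayList;

public class CallStackDemo {
    /** Logs "before" on the way down the call stack and "after" on the way back up. */
    public static void countDownUp(int n, ArrayList<String> log) {
        if (n == 0) {
            return; // base case: nothing to log
        }
        log.add("before: n = " + n);
        countDownUp(n - 1, log);
        log.add("after: n = " + n); // this frame's n was preserved on the stack
    }
}
```

For countDownUp(2, log), the log reads: before: n = 2, before: n = 1, after: n = 1, after: n = 2.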
Example 1: Factorial (classic structure)
Factorial is often used to teach recursion because it has a clean definition.
- Base case: 0! = 1
- Recursive case: n! = n * (n - 1)!
public class RecursionBasics {
/** Precondition: n >= 0 */
public static int factorial(int n) {
if (n == 0) { // base case
return 1;
}
return n * factorial(n - 1); // recursive case
}
}
If you trace factorial(3), you get:
- factorial(3) = 3 * factorial(2)
- factorial(2) = 2 * factorial(1)
- factorial(1) = 1 * factorial(0)
- factorial(0) = 1
Then the returns unwind: 1 ⇒ 1 ⇒ 2 ⇒ 6.
What goes wrong:
- Missing or incorrect base case (if (n == 1) only works if you never call it with 0).
- Changing n in the wrong direction (factorial(n + 1) diverges).
Example 2: Recursive Summation (seeing “smaller input”)
public static int sumToN(int n) {
if (n <= 0) {
return 0;
}
return n + sumToN(n - 1);
}
Here, the “smaller problem” is summing to n - 1. The base case is when n is non-positive.
When Recursion Is a Good Idea (and when it isn’t)
Recursion is great when:
- The problem naturally breaks into smaller subproblems (divide-and-conquer).
- The structure is nested (like a directory tree or nested parentheses).
Recursion can be a poor fit when:
- A simple loop is clearer and avoids deep call stacks.
- The recursion depth could be very large (risking stack overflow).
AP CSA often emphasizes writing clear, correct code over using recursion “just because.” If an iterative solution is simpler, it’s usually preferred unless the problem is explicitly about recursion.
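For comparison, here is the same summation from Example 2 written as a loop (often the clearer choice when the problem has no self-similar structure):

```java
public class IterativeSumDemo {
    /** @return the sum 1 + 2 + ... + n, or 0 if n <= 0 */
    public static int sumToN(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}
```

Both versions return the same values; the loop just trades the call stack for a single accumulator variable.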
Exam Focus
- Typical question patterns:
- Identify the base case and describe what smaller input each recursive call uses.
- Trace a recursive method and determine its return value.
- Fill in missing code for a recursive helper method that processes a range (indices).
- Common mistakes:
- No base case, or a base case that doesn’t actually stop all paths.
- Recursive call that doesn’t move toward the base case (no progress).
- Confusing what gets returned at each step (especially in methods that combine results).
Recursive Searching and Sorting
Recursion becomes especially powerful when paired with divide-and-conquer thinking: split a problem into smaller pieces, solve those pieces (often recursively), then combine the results. This approach naturally leads to recursive searching (binary search) and recursive sorting (merge sort is the classic example).
Recursive Binary Search
You can express binary search recursively by searching either the left half or the right half as a smaller subproblem.
What it is: A recursive method that searches within a bounded range (low to high) and narrows that range each call.
Why it matters: It reinforces the key binary-search invariant: “If the target is present, it must be inside this current range.” Recursion makes that invariant explicit through parameters.
How it works:
- Base case 1: the range is empty (low > high) ⇒ not found.
- Base case 2: the middle element equals the target ⇒ found.
- Recursive case: search the appropriate half.
public class RecursiveBinarySearch {
/** Precondition: arr is sorted in ascending order. */
public static int binarySearch(int[] arr, int target) {
return binarySearch(arr, target, 0, arr.length - 1);
}
private static int binarySearch(int[] arr, int target, int low, int high) {
if (low > high) {
return -1; // empty range
}
int mid = (low + high) / 2;
if (arr[mid] == target) {
return mid;
} else if (arr[mid] < target) {
return binarySearch(arr, target, mid + 1, high);
} else {
return binarySearch(arr, target, low, mid - 1);
}
}
}
What goes wrong:
- Using mid instead of mid + 1 or mid - 1, so the range doesn’t shrink (infinite recursion).
- Forgetting the base case low > high.
- Calling recursive binary search on an unsorted array.
Recursive Sorting with Merge Sort
Merge sort is a recursive sorting algorithm that splits the array in half, recursively sorts each half, and then merges the two sorted halves into one sorted array.
What it is: A divide-and-conquer sort with a clear recursive structure.
Why it matters: Merge sort is a canonical example of recursion applied to data collections. Even if you aren’t required to memorize every detail, understanding merge sort helps you:
- see how recursion can organize complex tasks,
- practice reasoning about helper methods and array ranges,
- understand the “split, solve, combine” pattern that shows up in many algorithms.
How it works (high-level):
- Split: Divide the array into left half and right half.
- Solve: Recursively sort each half.
- Combine: Merge the two sorted halves.
Base case: An array (or subarray range) of length 0 or 1 is already sorted.
Below is a merge sort implementation using helper methods and temporary arrays. It’s longer than selection/insertion sort because the merge step requires extra storage.
import java.util.Arrays;
public class MergeSortDemo {
public static void mergeSort(int[] arr) {
if (arr.length <= 1) {
return; // base case
}
int mid = arr.length / 2;
int[] left = Arrays.copyOfRange(arr, 0, mid);
int[] right = Arrays.copyOfRange(arr, mid, arr.length);
mergeSort(left);
mergeSort(right);
merge(arr, left, right);
}
// Merges sorted left and sorted right into result
private static void merge(int[] result, int[] left, int[] right) {
int i = 0; // left index
int j = 0; // right index
int k = 0; // result index
while (i < left.length && j < right.length) {
if (left[i] <= right[j]) {
result[k] = left[i];
i++;
} else {
result[k] = right[j];
j++;
}
k++;
}
while (i < left.length) {
result[k] = left[i];
i++;
k++;
}
while (j < right.length) {
result[k] = right[j];
j++;
k++;
}
}
}
Understanding the merge step (the core idea):
Merging works because both halves are already sorted. You keep two “pointers” (indices), compare the current smallest remaining element in each half, and move the smaller one into the result. This is similar in spirit to combining two already-sorted stacks of papers into one sorted stack.
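The two-pointer idea can be sketched on its own, outside of merge sort (a standalone illustration; the class name is ours):

```java
public class MergeDemo {
    /** Precondition: left and right are each sorted in ascending order.
     *  @return a new sorted array containing all elements of both
     */
    public static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0; // pointer into left
        int j = 0; // pointer into right
        int k = 0; // pointer into result
        while (i < left.length && j < right.length) {
            if (left[i] <= right[j]) {
                result[k++] = left[i++]; // smaller current element moves first
            } else {
                result[k++] = right[j++];
            }
        }
        // Copy whichever side still has leftovers
        while (i < left.length) { result[k++] = left[i++]; }
        while (j < right.length) { result[k++] = right[j++]; }
        return result;
    }
}
```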
What goes wrong:
- Forgetting the “copy leftovers” loops after the main merge loop (one side often has remaining elements).
- Mixing up indices (i, j, k) so values get overwritten or positions get skipped.
- Assuming merge sort can be done with only in-place swapping like selection sort; merge sort usually needs temporary arrays to merge cleanly.
Connecting the Dots: Why Recursion Fits These Algorithms
Recursive binary search and merge sort both rely on the same mental model:
- Define the problem on a range (or subarray).
- Make a decision that reduces the problem size.
- Stop when the range is trivially small (empty, one element).
Once you get comfortable with this model, many “recursive array” problems become less intimidating, because you always ask:
- What’s my base case?
- How do I reduce the problem?
- How do I combine results (if needed)?
Exam Focus
- Typical question patterns:
  - Trace recursive binary search and determine the returned index (pay attention to low, high, and mid).
  - Identify the base case and the smaller subproblem in a recursive range-processing method.
  - For merge sort–style questions, reason about what the merge step produces given two sorted halves.
- Common mistakes:
  - Recursive calls that don’t shrink the interval (low..high), causing infinite recursion.
  - Not understanding that merge requires two sorted inputs; trying to “merge” unsorted halves.
  - Dropping elements during merge by failing to copy the remaining tail of one half.