Searching
Finding Needles in Haystacks, Fast
Searching is a core task in programming. This module covers the essential linear search and dives deep into binary search, a powerful O(log n) algorithm for sorted data that can be applied in surprising ways.
1. Linear Search
The simplest search algorithm. It sequentially checks each element of the list until a match is found or the whole list has been searched. It's simple to implement but inefficient for large datasets.
Time: O(n) | Space: O(1)
def linear_search(arr, target):
for i, element in enumerate(arr):
if element == target:
return i # Return index of found element
return -1 # Return -1 if not found
2. Binary Search
Binary search is a highly efficient "divide and conquer" algorithm that works on sorted arrays. It repeatedly divides the search interval in half. If the value of the search key is less than the item in the middle of the interval, it narrows the interval to the lower half. Otherwise, it narrows it to the upper half.
Time: O(log n) | Space: O(1) (for iterative version)
Robust Implementation
Implementing binary search correctly can be tricky, with off-by-one errors being common. Here is a robust template:
def binary_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right:
mid = left + (right - left) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1 # Not found
3. Binary Search on the Answer
A powerful technique for optimization problems where you need to find the minimum or maximum value that satisfies a certain condition. Instead of searching the data, you search for the optimal value within a range of possible answers.
Monotonicity: The key is that the condition must be "monotonic". This means if a value 'x' is a valid solution, then all values on one side of 'x' are also valid solutions. This creates a sorted-like structure of booleans (e.g., F F F F T T T T) where you can binary search for the first 'T'.
Classic Problem: Capacity To Ship Packages Within D Days
We need to find the minimum shipping capacity that can ship all packages within D days. The range of possible answers is from the weight of the heaviest package to the sum of all package weights. We can binary search on this answer range. For a given capacity C
, we write a check(C)
function that greedily determines if it's possible to ship all packages within D days. This check is monotonic: if a capacity C
works, any capacity C' > C
will also work.
class Solution:
def shipWithinDays(self, weights: List[int], days: int) -> int:
def can_ship(capacity):
days_needed = 1
current_weight = 0
for w in weights:
if current_weight + w > capacity:
days_needed += 1
current_weight = w
else:
current_weight += w
return days_needed <= days
left, right = max(weights), sum(weights)
ans = right
while left <= right:
mid_capacity = left + (right - left) // 2
if can_ship(mid_capacity):
ans = mid_capacity
right = mid_capacity - 1
else:
left = mid_capacity + 1
return ans
Classic Problem: Sqrt(x)
Given a non-negative integer x
, compute and return the integer part of its square root. The range of possible answers is from 0 to x
. We can binary search for the largest integer mid
such that mid * mid <= x
. This is another perfect application of binary search on the answer.
class Solution:
def mySqrt(self, x: int) -> int:
if x < 2:
return x
left, right = 2, x // 2
ans = 0
while left <= right:
mid = left + (right - left) // 2
if mid * mid == x:
return mid
if mid * mid < x:
ans = mid
left = mid + 1
else:
right = mid - 1
return ans
Advanced Problem: Median of Two Sorted Arrays
This is a famous hard-level problem. Given two sorted arrays, find the median of the two sorted arrays. The key idea is to binary search for the correct partition in the smaller array. A partition divides an array into two halves. A correct partition is one where all elements in the left combined partition are less than or equal to all elements in the right combined partition. This elegant but complex solution achieves O(log(min(m, n))) time complexity.
class Solution:
def findMedianSortedArrays(self, nums1: List[int], nums2: List[int]) -> float:
A, B = nums1, nums2
total = len(A) + len(B)
half = total // 2
if len(B) < len(A):
A, B = B, A
l, r = 0, len(A) - 1
while True:
i = (l + r) // 2 # A's partition index
j = half - i - 2 # B's partition index
A_left = A[i] if i >= 0 else float('-inf')
A_right = A[i+1] if (i+1) < len(A) else float('inf')
B_left = B[j] if j >= 0 else float('-inf')
B_right = B[j+1] if (j+1) < len(B) else float('inf')
# partition is correct
if A_left <= B_right and B_left <= A_right:
if total % 2: # odd
return min(A_right, B_right)
# even
return (max(A_left, B_left) + min(A_right, B_right)) / 2
elif A_left > B_right:
r = i - 1
else:
l = i + 1
Interactive Binary Search
Visualize how binary search efficiently finds a target in a sorted array.
Enter a number and click search.