Arrays
The Most Fundamental Data Structure
Arrays are contiguous blocks of memory holding elements of the same type. This contiguous nature is their superpower, allowing for O(1) random access, but it's also their weakness, making insertions and deletions in the middle expensive.
0. Array Fundamentals
An array is a collection of items stored at contiguous memory locations. The idea is to store multiple items of the same type together. This makes it easy to calculate the position of each element by simply adding an offset to a base value, i.e., the memory location of the first element of the array (generally denoted by the name of the array).
Array in Memory
- Address 1000: Element 0
- Address 1004: Element 1
- Address 1008: Element 2
- Address 1012: Element 3
Logical View
We access this as my_array[2]
, and the computer knows to go to address 1000 + (2 * 4)
.
Static vs. Dynamic Arrays: Static arrays have a fixed size determined at compile time. Dynamic arrays (like C++'s std::vector
or Java's ArrayList
) can resize themselves when they run out of space, which involves allocating a new, larger block of memory and copying the old elements over.
# Python's list is a dynamic array
my_list = [10, 20, 30]
# Add to the end (amortized O(1))
my_list.append(40)
# Insert at index 1 (O(n))
my_list.insert(1, 15)
# my_list is now [10, 15, 20, 30, 40]
# Remove the last element (O(1))
my_list.pop()
# Remove element at index 1 (O(n))
my_list.pop(1)
# my_list is now [10, 20, 30]
1. Prefix Sum
A prefix sum array, prefix
, is a powerful technique where prefix[i]
stores the sum of all elements from the beginning of the original array up to index i-1
. After an O(n) preprocessing step to build this array, you can answer any range sum query (e.g., "what is the sum from index i
to j
?") in O(1) time using the formula prefix[j+1] - prefix[i]
.
# O(n) preprocessing
def build_prefix_sum(nums):
prefix = [0] * (len(nums) + 1)
for i in range(len(nums)):
prefix[i+1] = prefix[i] + nums[i]
return prefix
# O(1) query
def query_sum(prefix, i, j):
return prefix[j+1] - prefix[i]
# Example
nums = [1, 2, 3, 4, 5]
prefix_sum_array = build_prefix_sum(nums)
# Sum of elements from index 1 to 3 (2+3+4)
print(query_sum(prefix_sum_array, 1, 3)) # Output: 9
2. Sliding Window & Two Pointers
These are related patterns for processing arrays efficiently, often turning O(n²) brute-force solutions into O(n) ones. They involve using two pointers (left
and right
) to define a "window" or subarray.
- Two Pointers (Opposite Ends): Used on sorted arrays. Pointers start at the beginning and end, moving towards each other. Ideal for finding pairs that meet a condition, like in the "Two Sum II" problem.
- Sliding Window (Same Direction): Pointers start at the same place. The
right
pointer expands the window, and theleft
pointer shrinks it when a condition is violated. Perfect for finding the longest/shortest subarray that meets a certain criteria.
Classic Problem: Longest Substring Without Repeating Characters
This is the canonical sliding window problem. You expand the window with the right pointer and use a hash set to track characters. If you encounter a character already in the set, you shrink the window from the left until the duplicate is removed.
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
char_set = set()
left = 0
max_len = 0
for right in range(len(s)):
while s[right] in char_set:
char_set.remove(s[left])
left += 1
char_set.add(s[right])
max_len = max(max_len, right - left + 1)
return max_len
3. Kadane’s Algorithm
An elegant and efficient O(n) dynamic programming solution for the maximum subarray sum problem. It iterates through the array, keeping track of two values: max_so_far
(the global maximum sum found) and current_max
(the maximum sum ending at the current position). The key insight is that current_max
for the current element is either the element itself or the element plus the current_max
of the previous element. If current_max
ever becomes negative, we reset it to 0.
from typing import List
class Solution:
def maxSubArray(self, nums: List[int]) -> int:
max_so_far = -float('inf')
current_max = 0
for num in nums:
current_max += num
if current_max > max_so_far:
max_so_far = current_max
if current_max < 0:
current_max = 0
return max_so_far
4. Merge Intervals
A common problem type where you are given a collection of intervals and need to merge all overlapping ones.
Classic Problem: Merge Intervals
The standard approach is to first sort the intervals based on their start times. Then, iterate through the sorted intervals, merging the current interval with the previous one if they overlap.
from typing import List
class Solution:
def merge(self, intervals: List[List[int]]) -> List[List[int]]:
if not intervals:
return []
intervals.sort(key=lambda x: x[0])
merged = [intervals[0]]
for i in range(1, len(intervals)):
last_merged = merged[-1]
current_interval = intervals[i]
if current_interval[0] <= last_merged[1]:
last_merged[1] = max(last_merged[1], current_interval[1])
else:
merged.append(current_interval)
return merged
Interactive Array Simulation
Click the buttons to see array operations. For O(n) operations, affected elements will highlight before they shift.