Trie (Prefix Tree)

The Ultimate String Data Structure

A Trie, also known as a prefix tree or digital tree, is a tree-based data structure that proves to be very efficient for solving problems related to strings, especially for retrieval based on prefixes. Instead of storing entire keys in nodes, a node's position in the trie represents the key itself. This structure is the engine behind autocomplete and spell-checker features.

1. Anatomy of a Trie Node

Unlike a binary tree node, a Trie node doesn't store the character it represents. Instead, its position in the tree does. A typical Trie node contains:

  • Children Array/Map: An array of pointers (or a hash map) to its children nodes. For lowercase English letters, this is often an array of size 26. children['c'] would point to the node representing 'c'.
  • End of Word Flag: A boolean flag (e.g., isEndOfWord) to mark if the path from the root to this node forms a complete word.
class TrieNode:
    def __init__(self):
        self.children = {} # map character to TrieNode
        self.isEndOfWord = False

2. Core Operations

Let L be the length of the string being processed.

Insertion

To insert a word, we traverse the trie from the root, character by character. If a path for a character doesn't exist, we create a new node. After the last character, we set the final node's isEndOfWord flag to true.

Inserting "TEA" and "TED"

    (root)
      |
      T
      |
      E
     / \
    A*  D*
    

* denotes isEndOfWord = true

Time: O(L) | Space: O(L * alphabet_size) in worst case

Search

To search for a word, we traverse the trie. If we can follow the path for the entire word and the final node's isEndOfWord flag is true, the word exists.

Time: O(L)

Prefix Search (startsWith)

To check if any word starts with a given prefix, we traverse the trie. If we can successfully traverse the trie for the entire prefix, it means there is at least one word in the trie with that prefix.

Time: O(L)

3. Classic Problems

Implement Trie (Prefix Tree)

This problem is a direct implementation of the core Trie data structure and its three main methods: insert, search, and startsWith. It's the best way to solidify your understanding.

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        curr = self.root
        for c in word:
            if c not in curr.children:
                curr.children[c] = TrieNode()
            curr = curr.children[c]
        curr.isEndOfWord = true

    def search(self, word: str) -> bool:
        curr = self.root
        for c in word:
            if c not in curr.children:
                return False
            curr = curr.children[c]
        return curr.isEndOfWord

    def startsWith(self, prefix: str) -> bool:
        curr = self.root
        for c in prefix:
            if c not in curr.children:
                return False
            curr = curr.children[c]
        return True
Word Search II

Given an m x n grid of characters and a list of words, find all words on the grid. This is a famous hard problem that combines Trie and backtracking (DFS). First, you insert all words into a Trie. Then, you perform a DFS from every cell in the grid. As you traverse the grid, you also traverse the Trie. If you find a path in the grid that spells a word in the Trie (marked by isEndOfWord), you add it to your results. This is far more efficient than searching for each word individually.

class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        root = TrieNode()
        for w in words:
            curr = root
            for c in w:
                if c not in curr.children:
                    curr.children[c] = TrieNode()
                curr = curr.children[c]
            curr.isEndOfWord = True
            curr.word = w # Store the word in the node
            
        ROWS, COLS = len(board), len(board[0])
        res, visit = set(), set()
        
        def dfs(r, c, node):
            if (r < 0 or c < 0 or r == ROWS or c == COLS or
                (r, c) in visit or board[r][c] not in node.children):
                return
            
            visit.add((r, c))
            node = node.children[board[r][c]]
            if node.isEndOfWord:
                res.add(node.word)
            
            dfs(r + 1, c, node)
            dfs(r - 1, c, node)
            dfs(r, c + 1, node)
            dfs(r, c - 1, node)
            visit.remove((r, c))

        for r in range(ROWS):
            for c in range(COLS):
                dfs(r, c, root)
        
        return list(res)

Interactive Trie

Visualize a Trie data structure. Insert words and see how the tree is built. Search for words or prefixes.