Trie (Prefix Tree + Bitwise Trie)

Prefix-based string storage for autocomplete, spell checking, and XOR queries using bitwise tries.

What is a Trie?

A Trie (pronounced "try", from retrieval) is a tree-shaped data structure designed for efficient storage and retrieval of strings over a fixed alphabet. Unlike a binary search tree where each node stores a complete key, in a trie each node represents a single character, and the key is reconstructed by walking from the root down to a marked node.

Core properties:

Every path from the root to a marked node spells out a stored string.
Common prefixes share the same path -- this is the trie's superpower.
The root represents the empty string "".
The depth of a node equals the length of the prefix it represents.

Intuition. Imagine a phone book organized not alphabetically page-by-page, but character-by-character. To look up "apple", you first go to the "a" section, then within that to "ap", then "app", then "appl", then "apple". If many words share the prefix "app" (apply, application, apparatus...), they all share the same first three steps. A trie encodes exactly this structure.

Visual Structure

Storing: ["apple", "app", "apt", "bat", "bad"]

Observations:

"app" and "apple" share the path root -> a -> p -> p. The node at "app" is marked as end-of-word, and the path continues to "apple".
"apt" branches off from "ap" (the second 'p' goes one way, 't' goes another).
"bat" and "bad" share root -> b -> a, then diverge.

How many nodes? In the worst case (no shared prefixes), inserting N words of average length L creates O(N * L) nodes. With heavy prefix sharing, far fewer.

Node Structure

Fixed-size children array (lowercase English letters)

struct TrieNode {
    TrieNode* children[26];   // one slot per letter a-z
    bool isEnd;               // true if a complete word ends here
    int prefixCount;          // how many words pass through this node
    int endCount;             // how many words end exactly here (for duplicates)

    TrieNode() {
        for (int i = 0; i < 26; i++) children[i] = nullptr;
        isEnd = false;
        prefixCount = 0;
        endCount = 0;
    }
};

Why 26? For lowercase English letters 'a' through 'z', we map character c to index c - 'a'. If the alphabet is larger (uppercase + lowercase + digits), increase the array size or use a hash map (discussed later).

Operations with Full Visual Walkthroughs

Insert

Goal: Insert the word into the trie, creating new nodes as needed.

Algorithm:

Start at root.
For each character in the word, compute its index.
If the child at that index is nullptr, create a new node.
Move to the child, increment its prefixCount.
After the last character, mark isEnd = true.

Walkthrough: Inserting "apple" into an empty trie.

Step 0: Start at root
  (root)

Step 1: Process 'a' (index 0)
  No child at index 0 -> create node
  (root)
    \
     a  [prefixCount=1]

Step 2: Process 'p' (index 15)
  No child at index 15 of node 'a' -> create node
  (root)
    \
     a
      \
       p  [prefixCount=1]

Step 3: Process 'p' (index 15)
  No child at index 15 of node 'p' -> create node
  (root)
    \
     a
      \
       p
        \
         p  [prefixCount=1]

Step 4: Process 'l' (index 11)
  No child at index 11 -> create node
  (root)
    \
     a
      \
       p
        \
         p
          \
           l  [prefixCount=1]

Step 5: Process 'e' (index 4)
  No child at index 4 -> create node, mark isEnd=true
  (root)
    \
     a
      \
       p
        \
         p
          \
           l
            \
             e* [prefixCount=1, isEnd=true]

Now insert "app":

Follow existing path: root -> a -> p -> p
All three nodes already exist! Just increment prefixCount along the way.
At the second 'p', mark isEnd = true.

  (root)
    \
     a  [prefixCount=2]
      \
       p  [prefixCount=2]
        \
         p* [prefixCount=2, isEnd=true]  <-- newly marked
          \
           l  [prefixCount=1]
            \
             e* [prefixCount=1]

Key insight: "app" doesn't create any new nodes -- it reuses the existing prefix path.

Search

Goal: Return true if the exact word exists in the trie (not just as a prefix).

Algorithm:

Start at root.
For each character, try to follow the corresponding child.
If any child is nullptr, return false.
After processing all characters, return isEnd of the current node.

Walkthrough: Search for "app" -- FOUND

Trie state (contains "apple" and "app"):
  root -> a -> p -> p* -> l -> e*

  Step 1: 'a' -> follow child[0] -> node 'a'  (exists)
  Step 2: 'p' -> follow child[15] -> node 'p'  (exists)
  Step 3: 'p' -> follow child[15] -> node 'p*' (exists)
  End of word. Check isEnd: TRUE -> return true

Walkthrough: Search for "ap" -- NOT FOUND

  Step 1: 'a' -> follow child[0] -> node 'a'  (exists)
  Step 2: 'p' -> follow child[15] -> node 'p'  (exists)
  End of word. Check isEnd: FALSE -> return false

  "ap" exists as a PREFIX in the trie, but no word ends there.

Walkthrough: Search for "axe" -- NOT FOUND

  Step 1: 'a' -> follow child[0] -> node 'a'  (exists)
  Step 2: 'x' -> follow child[23] -> nullptr   -> return false immediately

StartsWith (Prefix Search)

Goal: Return true if ANY stored word starts with the given prefix.

Algorithm: Same as search, but skip the isEnd check at the end. If we can follow the entire prefix without hitting nullptr, return true.

Walkthrough: startsWith("ap") -> TRUE

  Step 1: 'a' -> node 'a'  (exists)
  Step 2: 'p' -> node 'p'  (exists)
  Reached end of prefix without hitting nullptr -> return true
  (Words "app" and "apple" both start with "ap")

Walkthrough: startsWith("ba") on trie with only "apple"/"app" -> FALSE

  Step 1: 'b' -> child[1] is nullptr -> return false

Delete

Goal: Remove a word from the trie. Must be careful not to remove nodes that serve as prefixes for other words.

Strategy:

First, verify the word exists (search for it).
Walk down the word. At each node, decrement prefixCount.
At the end node, set isEnd = false, decrement endCount.
On the way back up (using recursion), if a node's prefixCount drops to 0, delete it and set the parent's child pointer to nullptr.

Walkthrough: Delete "apple" from trie containing {"apple", "app"}

Before:
  root -> a(2) -> p(2) -> p*(2) -> l(1) -> e*(1)
  (numbers in parentheses are prefixCounts)

Processing 'a': prefixCount 2 -> 1  (still > 0, keep node)
Processing 'p': prefixCount 2 -> 1  (still > 0, keep node)
Processing 'p': prefixCount 2 -> 1  (still > 0, keep node, isEnd stays true)
Processing 'l': prefixCount 1 -> 0  (now 0! delete this node)
Processing 'e': prefixCount 1 -> 0  (now 0! delete this node)

After:
  root -> a(1) -> p(1) -> p*(1)

"app" is still intact. Only the "le" suffix was removed.

Walkthrough: Now delete "app"

Before:
  root -> a(1) -> p(1) -> p*(1)

Processing 'a': prefixCount 1 -> 0  (delete!)
Processing 'p': prefixCount 1 -> 0  (delete!)
Processing 'p': prefixCount 1 -> 0  (delete!, unmark isEnd)

After:
  root  (empty trie)

All nodes cleaned up since no words remain.

Complete Implementation

#include <bits/stdc++.h>
using namespace std;

class Trie {
    struct TrieNode {
        TrieNode* children[26];
        bool isEnd;
        int prefixCount;
        int endCount;  // how many times this exact word was inserted

        TrieNode() : isEnd(false), prefixCount(0), endCount(0) {
            memset(children, 0, sizeof(children));
        }
    };

    TrieNode* root;

    // Helper: recursively delete a word, cleaning up unused nodes
    bool deleteHelper(TrieNode* node, const string& word, int depth) {
        if (depth == (int)word.size()) {
            if (!node->isEnd) return false; // word not found
            node->endCount--;
            if (node->endCount == 0) node->isEnd = false;
            // If node has no children and no words end here, it can be deleted
            return !node->isEnd && isEmpty(node);
        }
        int idx = word[depth] - 'a';
        if (!node->children[idx]) return false; // word not found

        node->children[idx]->prefixCount--;
        bool shouldDelete = deleteHelper(node->children[idx], word, depth + 1);

        if (shouldDelete) {
            delete node->children[idx];
            node->children[idx] = nullptr;
            // Check if current node is also deletable
            return !node->isEnd && isEmpty(node);
        }
        return false;
    }

    bool isEmpty(TrieNode* node) {
        for (int i = 0; i < 26; i++)
            if (node->children[i]) return false;
        return true;
    }

public:
    Trie() { root = new TrieNode(); }

    // --- Insert a word ---
    void insert(const string& word) {
        TrieNode* curr = root;
        for (char c : word) {
            int idx = c - 'a';
            if (!curr->children[idx])
                curr->children[idx] = new TrieNode();
            curr = curr->children[idx];
            curr->prefixCount++;
        }
        curr->isEnd = true;
        curr->endCount++;
    }

    // --- Search for exact word ---
    bool search(const string& word) {
        TrieNode* curr = root;
        for (char c : word) {
            int idx = c - 'a';
            if (!curr->children[idx]) return false;
            curr = curr->children[idx];
        }
        return curr->isEnd;
    }

    // --- Check if any word starts with given prefix ---
    bool startsWith(const string& prefix) {
        TrieNode* curr = root;
        for (char c : prefix) {
            int idx = c - 'a';
            if (!curr->children[idx]) return false;
            curr = curr->children[idx];
        }
        return true;
    }

    // --- Count words with given prefix ---
    int countWordsWithPrefix(const string& prefix) {
        TrieNode* curr = root;
        for (char c : prefix) {
            int idx = c - 'a';
            if (!curr->children[idx]) return 0;
            curr = curr->children[idx];
        }
        return curr->prefixCount;
    }

    // --- Delete a word ---
    void erase(const string& word) {
        deleteHelper(root, word, 0);
    }
};

int main() {
    Trie trie;
    trie.insert("apple");
    trie.insert("app");
    trie.insert("apt");
    trie.insert("bat");
    trie.insert("bad");

    cout << trie.search("app") << endl;      // 1 (true)
    cout << trie.search("ap") << endl;       // 0 (false)
    cout << trie.startsWith("ap") << endl;   // 1 (true)
    cout << trie.countWordsWithPrefix("ap") << endl; // 3 (apple, app, apt)
    cout << trie.countWordsWithPrefix("ba") << endl; // 2 (bat, bad)

    trie.erase("apple");
    cout << trie.search("apple") << endl;    // 0 (false)
    cout << trie.search("app") << endl;      // 1 (true, still exists)
    return 0;
}

Memory Optimization: Using HashMap for Children

When the alphabet is very large (e.g., Unicode characters, or mixed case + digits + symbols), a fixed-size array wastes enormous memory. Use a hash map instead:

struct TrieNode {
    unordered_map<char, TrieNode*> children;
    bool isEnd = false;
    int prefixCount = 0;
};

Trade-offs:

In competitive programming, the fixed array is almost always preferred for speed.

Array-Based Trie (Memory Pool)

For competitive programming, avoid new/delete overhead with a flat array:

const int MAXNODES = 1000005; // pre-allocate
int trie[MAXNODES][26];
int cnt[MAXNODES];            // prefixCount
bool isEnd[MAXNODES];
int tot = 0;                  // next available node index

int newNode() {
    tot++;
    memset(trie[tot], 0, sizeof(trie[tot]));
    cnt[tot] = 0;
    isEnd[tot] = false;
    return tot;
}

void insert(const string& word) {
    int cur = 0; // root = 0
    for (char c : word) {
        int ch = c - 'a';
        if (!trie[cur][ch])
            trie[cur][ch] = newNode();
        cur = trie[cur][ch];
        cnt[cur]++;
    }
    isEnd[cur] = true;
}

bool search(const string& word) {
    int cur = 0;
    for (char c : word) {
        int ch = c - 'a';
        if (!trie[cur][ch]) return false;
        cur = trie[cur][ch];
    }
    return isEnd[cur];
}

This avoids dynamic allocation entirely and is much faster in practice.

Bitwise Trie (XOR Trie)

A powerful variant where we store integers bit-by-bit, from the most significant bit (MSB) down to the least significant bit (LSB). Each node has exactly 2 children (bit 0 or bit 1).

Why?

The classic application: given an array of integers, find two elements whose XOR is maximum.

Key insight: To maximize XOR, at each bit position (from MSB to LSB), we want to go in the opposite direction of the query number's bit. If the query bit is 1, we want to follow the 0 branch (so the XOR bit is 1), and vice versa.

Visual Example

Insert: 5 (101), 2 (010), 7 (111), 1 (001) -- using 3 bits for clarity:

Max XOR query for 2 (010):

Bit 2 = 0 -> want 1 -> go right to '1'
Bit 1 = 1 -> want 0 -> go left to '0'   (XOR bit = 1)
Bit 0 = 0 -> want 1 -> go right to '1'  (XOR bit = 1)
Reached 5 (101). XOR = 010 ^ 101 = 111 = 7. Maximum!

Implementation

class BitTrie {
    struct Node {
        int children[2];
        int count; // how many numbers pass through this node
        Node() : count(0) { children[0] = children[1] = -1; }
    };

    vector<Node> nodes;
    int BITS; // number of bits to consider (e.g., 30 for values up to ~10^9)

public:
    BitTrie(int bits = 30) : BITS(bits) {
        nodes.push_back(Node()); // root = node 0
    }

    void insert(int num) {
        int curr = 0;
        for (int i = BITS; i >= 0; i--) {
            int bit = (num >> i) & 1;
            if (nodes[curr].children[bit] == -1) {
                nodes[curr].children[bit] = nodes.size();
                nodes.push_back(Node());
            }
            curr = nodes[curr].children[bit];
            nodes[curr].count++;
        }
    }

    void erase(int num) {
        int curr = 0;
        for (int i = BITS; i >= 0; i--) {
            int bit = (num >> i) & 1;
            curr = nodes[curr].children[bit];
            nodes[curr].count--;
        }
    }

    // Find the maximum XOR achievable between `num` and any inserted number
    int maxXor(int num) {
        int curr = 0, result = 0;
        for (int i = BITS; i >= 0; i--) {
            int bit = (num >> i) & 1;
            int want = 1 - bit; // opposite bit maximizes XOR
            if (nodes[curr].children[want] != -1 &&
                nodes[nodes[curr].children[want]].count > 0) {
                result |= (1 << i);
                curr = nodes[curr].children[want];
            } else {
                curr = nodes[curr].children[bit];
            }
        }
        return result;
    }
};

Complexity Analysis

Where:

L = length of the word
P = length of the prefix
N = number of words
SIGMA = alphabet size (26 for lowercase English)

Space analysis in detail:

Each node uses SIGMA pointers + a few fields.
Maximum number of nodes = sum of lengths of all inserted words + 1 (for root).
In practice, prefix sharing significantly reduces node count.

For the Bitwise Trie:

Each number uses O(B) nodes where B = number of bits (typically 30 or 31).
Maximum nodes = O(N * B).

Classic Problems

1. Implement Trie (LeetCode 208)

Straightforward implementation of insert, search, startsWith.

class Trie {
    struct Node {
        Node* ch[26] = {};
        bool end = false;
    };
    Node* root = new Node();
public:
    void insert(string word) {
        Node* cur = root;
        for (char c : word) {
            if (!cur->ch[c-'a']) cur->ch[c-'a'] = new Node();
            cur = cur->ch[c-'a'];
        }
        cur->end = true;
    }
    bool search(string word) {
        Node* cur = root;
        for (char c : word) {
            if (!cur->ch[c-'a']) return false;
            cur = cur->ch[c-'a'];
        }
        return cur->end;
    }
    bool startsWith(string prefix) {
        Node* cur = root;
        for (char c : prefix) {
            if (!cur->ch[c-'a']) return false;
            cur = cur->ch[c-'a'];
        }
        return true;
    }
};

2. Word Search II (LeetCode 212) -- Trie + DFS Backtracking

Problem: Given a 2D board of characters and a list of words, find all words that can be formed by connecting adjacent cells (up/down/left/right), using each cell at most once per word.

Key insight: Build a trie from the word list. Then DFS on the board, following trie edges. This prunes the search: if the current path doesn't match any trie prefix, stop immediately.

class Solution {
    struct TrieNode {
        TrieNode* ch[26] = {};
        string* word = nullptr; // points to the actual word if this node is end
        int cnt = 0;           // number of words in subtree (for pruning)
    };

    TrieNode* root = new TrieNode();
    int dx[4] = {0, 0, 1, -1};
    int dy[4] = {1, -1, 0, 0};

    void addWord(TrieNode* node, string& w) {
        for (char c : w) {
            if (!node->ch[c-'a']) node->ch[c-'a'] = new TrieNode();
            node = node->ch[c-'a'];
            node->cnt++;
        }
        node->word = &w;
    }

    void decrement(TrieNode* node, string& w) {
        for (char c : w) {
            node = node->ch[c - 'a'];
            node->cnt--;
        }
    }

    void dfs(vector<vector<char>>& board, int i, int j, TrieNode* node,
             vector<string>& result) {
        char c = board[i][j];
        if (c == '#' || !node->ch[c - 'a'] || node->ch[c - 'a']->cnt == 0) return;

        node = node->ch[c - 'a'];
        if (node->word) {
            result.push_back(*node->word);
            // Decrement counts up the trie so the cnt==0 pruning triggers
            decrement(root, *node->word);
            node->word = nullptr;
        }

        board[i][j] = '#'; // mark visited
        for (int d = 0; d < 4; d++) {
            int ni = i + dx[d], nj = j + dy[d];
            if (ni >= 0 && ni < (int)board.size() &&
                nj >= 0 && nj < (int)board[0].size()) {
                dfs(board, ni, nj, node, result);
            }
        }
        board[i][j] = c; // unmark
    }

public:
    vector<string> findWords(vector<vector<char>>& board, vector<string>& words) {
        for (auto& w : words) addWord(root, w);
        vector<string> result;
        for (int i = 0; i < (int)board.size(); i++)
            for (int j = 0; j < (int)board[0].size(); j++)
                dfs(board, i, j, root, result);
        return result;
    }
};

3. Maximum XOR of Two Numbers in an Array (LeetCode 421)

Problem: Given an array of integers, find the maximum XOR of any two elements.

Approach: Insert all numbers into a bitwise trie. For each number, query the trie for the maximum XOR partner.

class Solution {
public:
    int findMaximumXOR(vector<int>& nums) {
        // Build bit trie
        int maxBit = 30;
        vector<array<int,2>> trie(1, {-1, -1}); // node 0 = root

        auto insert = [&](int num) {
            int cur = 0;
            for (int i = maxBit; i >= 0; i--) {
                int bit = (num >> i) & 1;
                if (trie[cur][bit] == -1) {
                    trie[cur][bit] = trie.size();
                    trie.push_back({-1, -1});
                }
                cur = trie[cur][bit];
            }
        };

        auto query = [&](int num) -> int {
            int cur = 0, res = 0;
            for (int i = maxBit; i >= 0; i--) {
                int bit = (num >> i) & 1;
                int want = 1 - bit;
                if (trie[cur][want] != -1) {
                    res |= (1 << i);
                    cur = trie[cur][want];
                } else {
                    cur = trie[cur][bit];
                }
            }
            return res;
        };

        for (int x : nums) insert(x);

        int ans = 0;
        for (int x : nums) ans = max(ans, query(x));
        return ans;
    }
};

Time: O(N * B) where B = 31 bits. Space: O(N * B).

4. Design Add and Search Words Data Structure (LeetCode 211) -- Trie with Wildcard

Problem: Support addWord(word) and search(word) where word may contain . which matches any single character.

Approach: Standard trie for addWord. For search, when we encounter ., try all 26 children recursively.

class WordDictionary {
    struct Node {
        Node* ch[26] = {};
        bool end = false;
    };
    Node* root = new Node();

    bool dfs(Node* node, const string& word, int i) {
        if (i == (int)word.size()) return node->end;
        if (word[i] == '.') {
            for (int c = 0; c < 26; c++)
                if (node->ch[c] && dfs(node->ch[c], word, i + 1))
                    return true;
            return false;
        }
        int c = word[i] - 'a';
        return node->ch[c] && dfs(node->ch[c], word, i + 1);
    }

public:
    void addWord(string word) {
        Node* cur = root;
        for (char c : word) {
            if (!cur->ch[c-'a']) cur->ch[c-'a'] = new Node();
            cur = cur->ch[c-'a'];
        }
        cur->end = true;
    }

    bool search(string word) {
        return dfs(root, word, 0);
    }
};

Time: addWord is O(L). search is O(L) without wildcards, but O(26^D * L) in the worst case where D is the number of . characters (exponential, but pruned heavily in practice).

5. Replace Words (LeetCode 648)

Problem: Given a dictionary of root words and a sentence, replace every word in the sentence with the shortest root that is a prefix of it.

Approach: Build a trie from the dictionary. For each word in the sentence, walk the trie; as soon as we hit an isEnd node, that's the shortest root.

class Solution {
public:
    string replaceWords(vector<string>& dictionary, string sentence) {
        // Build trie
        struct Node {
            Node* ch[26] = {};
            bool end = false;
        };
        Node* root = new Node();
        for (auto& w : dictionary) {
            Node* cur = root;
            for (char c : w) {
                if (!cur->ch[c-'a']) cur->ch[c-'a'] = new Node();
                cur = cur->ch[c-'a'];
            }
            cur->end = true;
        }

        // Process sentence
        istringstream iss(sentence);
        string word, result;
        bool first = true;
        while (iss >> word) {
            if (!first) result += ' ';
            first = false;
            // Find shortest prefix in trie
            Node* cur = root;
            bool found = false;
            for (int i = 0; i < (int)word.size(); i++) {
                int c = word[i] - 'a';
                if (!cur->ch[c]) break;
                cur = cur->ch[c];
                if (cur->end) {
                    result += word.substr(0, i + 1);
                    found = true;
                    break;
                }
            }
            if (!found) result += word;
        }
        return result;
    }
};

6. Longest Word in Dictionary (LeetCode 720)

Problem: Find the longest word in the dictionary such that every prefix of the word is also in the dictionary. If tie, return lexicographically smallest.

Approach: Insert all words into trie. BFS/DFS from root, only following children that are marked isEnd. The deepest such path is the answer.

class Solution {
public:
    string longestWord(vector<string>& words) {
        struct Node {
            Node* ch[26] = {};
            bool end = false;
            string word = "";
        };
        Node* root = new Node();
        for (auto& w : words) {
            Node* cur = root;
            for (char c : w) {
                if (!cur->ch[c-'a']) cur->ch[c-'a'] = new Node();
                cur = cur->ch[c-'a'];
            }
            cur->end = true;
            cur->word = w;
        }

        string ans = "";
        // BFS: only visit children marked as isEnd
        queue<Node*> q;
        q.push(root);
        while (!q.empty()) {
            Node* node = q.front(); q.pop();
            for (int i = 25; i >= 0; i--) { // reverse order for lex smallest
                if (node->ch[i] && node->ch[i]->end) {
                    if ((int)node->ch[i]->word.size() > (int)ans.size())
                        ans = node->ch[i]->word;
                    q.push(node->ch[i]);
                }
            }
        }
        return ans;
    }
};

Advanced Applications

Counting Distinct Substrings

Insert all suffixes of a string into the trie. Each node in the trie corresponds to a distinct substring. Count of distinct substrings = total number of nodes (excluding root).

For a string of length n, this takes O(n^2) time and space. For better complexity, use a suffix array + LCP array or suffix automaton.

Autocomplete System

Augment each trie node with a priority queue or frequency map of the top-k words passing through it. On each character typed, traverse the trie and return suggestions.

Compressed Trie (Patricia Trie / Radix Tree)

Compress chains of single-child nodes into one edge labeled with a string.

This reduces memory significantly for sparse tries.

When to Use a Trie

Use Case	Why Trie?
Prefix-based queries (autocomplete, spell-check)	O(L) prefix lookup, shares common prefixes
Word dictionary with prefix lookup	Better than hash set when prefix operations needed
XOR-related problems	Bitwise trie enables greedy bit-by-bit optimization
Counting distinct substrings	Each node = one distinct substring
Longest common prefix of multiple strings	LCA in the trie
Word break / dictionary matching	DFS on trie during string scan

When NOT to use a trie:

Simple "does this exact word exist?" queries -> use unordered_set (simpler, often faster).
When memory is extremely tight -> trie uses much more memory than a hash set.
When the alphabet size is huge and prefix operations aren't needed.

Summary

The trie is a fundamental structure that trades space for prefix-query efficiency. Its key strength is that operations on a string of length L take O(L) time regardless of how many strings are stored -- they don't even need to look at other strings. The bitwise variant transforms integer XOR problems into tree traversals. Mastering the trie unlocks an entire class of string and bit-manipulation problems.