Data Structure Internals

Disjoint Set Union (Union-Find)

Disjoint Set Union (Union-Find)

The Problem

We have N elements, numbered 0 to N-1, each initially in its own singleton set. We need to support two operations efficiently:

  1. Union(a, b): Merge the set containing a with the set containing b.
  2. Find(a): Determine which set a belongs to (return a "representative" or "root" element).

Two elements are in the same set if and only if Find(a) == Find(b).

Why is this useful? Anytime you need to track connected components that grow over time -- network connectivity, Kruskal's MST, equivalence classes, image segmentation, and dozens of competitive programming problems.


Real-World Analogy

Imagine a school where nobody knows each other on day one. Each student is their own "group leader." When student A befriends student B, their entire friend groups merge, and one person becomes the leader of the combined group.

  • Find(A) = "Who is the leader of A's friend group?"
  • Union(A, B) = "A and B just became friends -- merge their groups."
  • connected(A, B) = "Are A and B in the same friend group?" (i.e., Find(A) == Find(B))

The Forest Representation

We model the sets as a forest (collection of trees). Each element has a parent pointer. The root of each tree is the representative (leader) of that set, and the root's parent points to itself: parent[root] = root.

Initially (N=6): Each element is its own tree.

012 345

parent: [0, 1, 2, 3, 4, 5] (everyone is their own parent)


Naive Implementation

CPP
int parent[MAXN];

void init(int n) {
    for (int i = 0; i < n; i++)
        parent[i] = i;   // each element is its own root
}

int find(int x) {
    while (parent[x] != x)
        x = parent[x];   // walk up to root
    return x;
}

void unite(int a, int b) {
    int rootA = find(a);
    int rootB = find(b);
    if (rootA != rootB)
        parent[rootB] = rootA;  // attach B's tree under A's root
}

bool connected(int a, int b) {
    return find(a) == find(b);
}

Walkthrough

Initially: parent: [0, 1, 2, 3, 4, 5]

Union(0, 1): find(0)=0, find(1)=1, parent[1]=0

01 2345

parent: [0, 0, 2, 3, 4, 5]

Union(2, 3): find(2)=2, find(3)=3, parent[3]=2

01 23 45

parent: [0, 0, 2, 2, 4, 5]

Union(0, 2): find(0)=0, find(2)=2, parent[2]=0

0123 45

parent: [0, 0, 0, 2, 4, 5]

Union(4, 5): find(4)=4, find(5)=5, parent[5]=4

0123 45

parent: [0, 0, 0, 2, 4, 4]

Union(3, 5): find(3) = 3->2->0 = 0, find(5) = 5->4 = 4, parent[4]=0

0 124 35

parent: [0, 0, 0, 2, 0, 4]

connected(1, 5)? find(1)=0, find(5)=5->4->0=0. YES!

The Problem with Naive Implementation

The tree can become a degenerate chain (a linked list):

Union(0,1), Union(1,2), Union(2,3), Union(3,4) -- if we always attach the second tree under the first:

012 34

Find(4) takes 4 steps! In general, Find is O(n) in the worst case.

This is terrible. We need optimizations.


Optimization 1: Union by Rank (or Size)

Key Idea: When merging two trees, always attach the shorter (or smaller) tree under the taller (or larger) tree. This keeps trees balanced.

Union by Rank

rank[x] is an upper bound on the height of the subtree rooted at x.

CPP
int parent[MAXN], rnk[MAXN];

void init(int n) {
    for (int i = 0; i < n; i++) {
        parent[i] = i;
        rnk[i] = 0;     // initial height = 0 (single node)
    }
}

int find(int x) {
    while (x != parent[x])
        x = parent[x];
    return x;
}

void unite(int a, int b) {
    a = find(a);
    b = find(b);
    if (a == b) return;

    // Attach smaller-rank tree under larger-rank tree
    if (rnk[a] < rnk[b]) swap(a, b);
    parent[b] = a;
    if (rnk[a] == rnk[b]) rnk[a]++;
}

Why Does This Work?

Claim: With union by rank, the tree height is at most log2(n).

Proof sketch: A tree of rank r has at least 2^r nodes. Why?

  • A tree's rank only increases when two trees of the same rank are merged.
  • Rank 0: at least 1 = 2^0 node.
  • Rank 1: merging two rank-0 trees, at least 2 = 2^1 nodes.
  • Rank r: merging two rank-(r-1) trees, at least 2 * 2^(r-1) = 2^r nodes.

Since a tree with n nodes has rank at most log2(n), Find is O(log n).

Visual Example

Union(0,1), Union(2,3), Union(4,5), Union(6,7) -- all rank 1:

01 23 45 67

Union(0,2): rank[0]=1, rank[2]=1, equal -- attach 2 under 0, rank[0]++ = 2

0123 45 67

Union(4,6): similarly

0123 4567

Union(0,4): rank[0]=2, rank[4]=2, equal -- attach 4 under 0, rank[0]++ = 3

0 124 356 7

Height = 3 = log2(8). With naive union, height could have been 7.


Optimization 2: Path Compression

Key Idea: During Find(x), every node we visit on the path from x to the root is "compressed" -- we make it point directly to the root. Future queries on these nodes become O(1).

Before Find(5) 012 345 After Find(5) 0 1235 4

Every node on the path 5->3->2->1->0 now points directly to 0. Node 4 remains a child of 3 (it was not on the find path).

CPP
int find(int x) {
    if (x != parent[x])
        parent[x] = find(parent[x]);   // recursive path compression
    return parent[x];
}

Iterative version with path compression (two-pass):

CPP
int find(int x) {
    int root = x;
    while (root != parent[root])    // Pass 1: find the root
        root = parent[root];
    while (x != root) {             // Pass 2: compress the path
        int next = parent[x];
        parent[x] = root;
        x = next;
    }
    return root;
}

Path splitting (one-pass alternative, practical and fast):

CPP
int find(int x) {
    while (x != parent[x]) {
        parent[x] = parent[parent[x]];  // point to grandparent (path splitting)
        x = parent[x];
    }
    return x;
}

Path splitting halves the path length on every call. It is simpler than full path compression and almost as effective in practice.


Both Optimizations Together: Nearly O(1)

When we combine union by rank with path compression, each operation runs in O(alpha(n)) amortized time, where alpha is the inverse Ackermann function.

The Complete Implementation

CPP
struct DSU {
    vector<int> parent, rank_;

    DSU(int n) : parent(n), rank_(n, 0) {
        iota(parent.begin(), parent.end(), 0);  // parent[i] = i
    }

    int find(int x) {
        if (x != parent[x])
            parent[x] = find(parent[x]);  // path compression
        return parent[x];
    }

    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;   // already in the same set

        // Union by rank
        if (rank_[a] < rank_[b]) swap(a, b);
        parent[b] = a;
        if (rank_[a] == rank_[b]) rank_[a]++;
        return true;   // merge happened
    }

    bool connected(int a, int b) {
        return find(a) == find(b);
    }
};

What is alpha(n)?

The Ackermann function A(m, n) grows astronomically fast:

A(0, n) = n + 1
A(1, n) = n + 2
A(2, n) = 2n + 3
A(3, n) = 2^(n+3) - 3
A(4, n) = 2^2^2^...^2 (tower of n+3 twos) - 3
A(5, n) = ... (incomprehensibly large)

The inverse Ackermann function alpha(n) is the smallest m such that A(m, m) >= n:

alpha(1)         = 0
alpha(4)         = 1
alpha(7)         = 2
alpha(2047)      = 3
alpha(2^2047)    = 4       (already larger than atoms in the universe: ~10^80)
alpha(A(5,5))    = 5       (a number so large it has no physical meaning)

For any input size you will ever encounter, alpha(n) <= 4. So DSU operations are effectively O(1).

Proof Sketch (Tarjan 1975)

The amortized analysis uses a potential function based on the rank of each node. The key insight is:

  1. Union by rank ensures rank[x] <= log(n).
  2. Path compression flattens trees, reducing future Find costs.
  3. The combined effect means the total cost of m operations on n elements is O(m * alpha(n)).

The full proof involves defining "blocks" based on the Ackermann function and showing that the total work done across all blocks is bounded. It is one of the most beautiful results in amortized analysis.


Weighted DSU (Maintaining Extra Information)

Often we need to track additional information about each component, such as its size, the sum of values, or the min/max element.

DSU with Component Size

CPP
struct WeightedDSU {
    vector<int> parent, sz;
    int components;

    WeightedDSU(int n) : parent(n), sz(n, 1), components(n) {
        iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        if (x != parent[x])
            parent[x] = find(parent[x]);
        return parent[x];
    }

    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;

        // Union by size (attach smaller under larger)
        if (sz[a] < sz[b]) swap(a, b);
        parent[b] = a;
        sz[a] += sz[b];
        components--;
        return true;
    }

    int getSize(int x) { return sz[find(x)]; }
    int getComponents() { return components; }
    bool connected(int a, int b) { return find(a) == find(b); }
};

Note: When using union by size instead of union by rank, path compression still gives O(alpha(n)) amortized per operation. Union by size has the added benefit of sz[root] always being exactly the component size (no separate tracking needed).

DSU with Component Min/Max

CPP
struct MinMaxDSU {
    vector<int> parent, sz, mn, mx;

    MinMaxDSU(int n) : parent(n), sz(n, 1), mn(n), mx(n) {
        iota(parent.begin(), parent.end(), 0);
        iota(mn.begin(), mn.end(), 0);
        iota(mx.begin(), mx.end(), 0);
    }

    int find(int x) {
        if (x != parent[x])
            parent[x] = find(parent[x]);
        return parent[x];
    }

    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) swap(a, b);
        parent[b] = a;
        sz[a] += sz[b];
        mn[a] = min(mn[a], mn[b]);
        mx[a] = max(mx[a], mx[b]);
        return true;
    }

    int getMin(int x) { return mn[find(x)]; }
    int getMax(int x) { return mx[find(x)]; }
};

DSU with Weighted Edges (Distance to Root)

Sometimes we need to track the relative relationship between elements, not just whether they are in the same set. For example:

  • "A is 3 units heavier than B"
  • "A is on the opposite team from B"

We maintain dist[x] = distance/weight from x to its parent. Path compression must update this.

CPP
struct WeightedEdgeDSU {
    vector<int> parent, rank_;
    vector<long long> dist;  // dist[x] = weight from x to parent[x]

    WeightedEdgeDSU(int n) : parent(n), rank_(n, 0), dist(n, 0) {
        iota(parent.begin(), parent.end(), 0);
    }

    // Returns root, and sets dist[x] = distance from x to root
    int find(int x) {
        if (x == parent[x]) return x;
        int root = find(parent[x]);
        dist[x] += dist[parent[x]];  // accumulate distance
        parent[x] = root;             // path compression
        return root;
    }

    // Unite a and b, given that weight(a) - weight(b) = w
    // i.e., "a is w units more than b"
    bool unite(int a, int b, long long w) {
        int ra = find(a), rb = find(b);
        if (ra == rb) {
            // Check consistency: dist[a] - dist[b] should == w
            return (dist[a] - dist[b] == w);
        }
        // We need: dist_new[rb] such that dist[a] + dist_new[rb->ra] ... works
        // After find: dist[a] = dist from a to ra, dist[b] = dist from b to rb
        // We want: dist[a] - dist[b] - dist_new[rb] = w (if rb goes under ra)
        // So: dist_new[rb] = dist[a] - dist[b] - w
        w = dist[a] - dist[b] - w;  // adjust w to be the edge weight rb -> ra

        if (rank_[ra] < rank_[rb]) { swap(ra, rb); w = -w; }
        parent[rb] = ra;
        dist[rb] = w;
        if (rank_[ra] == rank_[rb]) rank_[ra]++;
        return true;
    }

    // Query: weight(a) - weight(b), returns {valid, difference}
    pair<bool, long long> query(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra != rb) return {false, 0};  // different components
        return {true, dist[a] - dist[b]};
    }
};

Classic application: "Is the relationship consistent?" problems, food chain problems (POJ 1182), and potential-function based grouping.


Classic Applications

1. Kruskal's Minimum Spanning Tree Algorithm

Problem: Given a weighted undirected graph, find a spanning tree with minimum total edge weight.

Algorithm:

  1. Sort all edges by weight.
  2. For each edge (u, v, w) in sorted order: if u and v are in different components, add this edge to the MST and merge their components.
  3. Stop when we have n-1 edges.

Why DSU? We need to quickly check "are u and v already connected?" and merge their components. DSU does both in O(alpha(n)).

CPP
struct Edge {
    int u, v, w;
    bool operator<(const Edge& o) const { return w < o.w; }
};

// Returns MST weight, or -1 if graph is disconnected
long long kruskal(int n, vector<Edge>& edges) {
    sort(edges.begin(), edges.end());
    DSU dsu(n);
    long long mstWeight = 0;
    int edgesUsed = 0;

    for (auto& [u, v, w] : edges) {
        if (dsu.unite(u, v)) {
            mstWeight += w;
            edgesUsed++;
            if (edgesUsed == n - 1) break;  // MST complete
        }
    }

    return (edgesUsed == n - 1) ? mstWeight : -1;
}

Complexity: O(E log E) for sorting + O(E * alpha(V)) for DSU operations = O(E log E) total.

Walkthrough:

Graph:                    Edges sorted by weight:
   0 ---4--- 1           (0,2,1), (1,3,2), (2,3,3), (0,1,4), (1,2,5), (0,3,6)
   |  \      |
   1   6     2           Process:
   |    \    |           (0,2,1): unite(0,2) -> YES, mst += 1
   2 ---3--- 3           (1,3,2): unite(1,3) -> YES, mst += 2
    \       /            (2,3,3): unite(2,3) -> find(2)=0, find(3)=1, YES, mst += 3
     5                   (0,1,4): unite(0,1) -> find(0)=0, find(1)=0, SAME SET, skip

                          MST weight = 1 + 2 + 3 = 6, edges used = 3 = n-1.  Done!

MST edges:
   0 ------- 1
   |         |
   1         2
   |         |
   2 ---3--- 3

2. Detecting Cycles in an Undirected Graph

Problem: Given an undirected graph, determine if it contains a cycle.

Intuition: Process edges one by one. If an edge connects two nodes already in the same component, it creates a cycle.

CPP
bool hasCycle(int n, vector<pair<int,int>>& edges) {
    DSU dsu(n);
    for (auto& [u, v] : edges) {
        if (!dsu.unite(u, v)) {
            // u and v are already connected -- this edge creates a cycle!
            return true;
        }
    }
    return false;
}

Why this works:

Edges: (0,1), (1,2), (2,0)

Process (0,1): unite(0,1) -> OK, components: {0,1} {2}
Process (1,2): unite(1,2) -> OK, components: {0,1,2}
Process (2,0): unite(2,0) -> find(2)=0, find(0)=0 -> SAME!  CYCLE!

The cycle is 0 -> 1 -> 2 -> 0.

3. Number of Connected Components

CPP
int countComponents(int n, vector<pair<int,int>>& edges) {
    WeightedDSU dsu(n);
    for (auto& [u, v] : edges)
        dsu.unite(u, v);
    return dsu.getComponents();
}

4. Accounts Merge (LeetCode 721)

Problem: Given a list of accounts where each account has a name and a list of emails, merge accounts that share at least one email.

Intuition: Each email is a node. If two emails appear in the same account, union them. At the end, group all emails by their root.

CPP
vector<vector<string>> accountsMerge(vector<vector<string>>& accounts) {
    unordered_map<string, int> emailToId;
    unordered_map<string, string> emailToName;
    int id = 0;

    // Assign an ID to each unique email
    for (auto& acc : accounts) {
        string name = acc[0];
        for (int i = 1; i < acc.size(); i++) {
            if (!emailToId.count(acc[i])) {
                emailToId[acc[i]] = id++;
            }
            emailToName[acc[i]] = name;
        }
    }

    // Union emails within each account
    DSU dsu(id);
    for (auto& acc : accounts) {
        for (int i = 2; i < acc.size(); i++) {
            dsu.unite(emailToId[acc[1]], emailToId[acc[i]]);
        }
    }

    // Group emails by root
    unordered_map<int, vector<string>> groups;
    for (auto& [email, eid] : emailToId) {
        groups[dsu.find(eid)].push_back(email);
    }

    // Build result
    vector<vector<string>> result;
    for (auto& [root, emails] : groups) {
        sort(emails.begin(), emails.end());
        string name = emailToName[emails[0]];
        emails.insert(emails.begin(), name);
        result.push_back(emails);
    }
    return result;
}

5. Redundant Connection (LeetCode 684)

Problem: Given a graph that started as a tree with n nodes, one extra edge was added. Find and return that edge.

Intuition: Process edges one by one. The first edge that connects two already-connected nodes is the redundant one (it creates the cycle).

CPP
vector<int> findRedundantConnection(vector<vector<int>>& edges) {
    int n = edges.size();
    DSU dsu(n + 1);   // nodes are 1-indexed

    for (auto& e : edges) {
        if (!dsu.unite(e[0], e[1])) {
            return e;  // this edge creates a cycle
        }
    }
    return {};  // shouldn't reach here
}

6. Number of Islands II (LeetCode 305)

Problem: Given a grid, positions are turned into land one at a time. After each addition, return the number of islands.

CPP
vector<int> numIslands2(int m, int n, vector<vector<int>>& positions) {
    WeightedDSU dsu(m * n);
    vector<vector<bool>> grid(m, vector<bool>(n, false));
    int islands = 0;
    vector<int> result;
    int dx[] = {0, 0, 1, -1};
    int dy[] = {1, -1, 0, 0};

    for (auto& pos : positions) {
        int r = pos[0], c = pos[1];
        if (grid[r][c]) {
            result.push_back(islands);
            continue;
        }
        grid[r][c] = true;
        islands++;
        int id = r * n + c;

        // Try to merge with adjacent land cells
        for (int d = 0; d < 4; d++) {
            int nr = r + dx[d], nc = c + dy[d];
            if (nr >= 0 && nr < m && nc >= 0 && nc < n && grid[nr][nc]) {
                int nid = nr * n + nc;
                if (dsu.unite(id, nid)) {
                    islands--;   // two islands merged into one
                }
            }
        }
        result.push_back(islands);
    }
    return result;
}

When to Use DSU vs BFS/DFS

Use DSU when Use BFS/DFS when Only need connectivity Need the actual path Edges added dynamically Full graph is known upfront "Are A and B connected?" "What is the shortest path?" Need to merge components Need to explore / traverse Undirected graphs Directed or undirected Online algorithm Offline is fine No need for distance Need BFS distance or DFS order

Rule of thumb: If the problem says "merge," "connect," "union," "same group," or "connected component" and you don't need paths or distances, think DSU first.


Advanced: DSU with Rollback (Offline)

Sometimes we need to undo union operations -- for example, in divide-and-conquer on queries, or when processing time intervals.

Key constraint: Path compression is NOT compatible with rollback (it modifies parent pointers irreversibly in a way that's hard to undo). So we use union by rank only (no path compression), which gives O(log n) per Find.

We maintain a stack of changes and can "undo" each union by restoring the previous parent and rank values.

CPP
struct DSURollback {
    vector<int> parent, rank_;
    vector<pair<int,int>> history;  // stack of (node, oldParent) changes

    DSURollback(int n) : parent(n), rank_(n, 0) {
        iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        while (x != parent[x])
            x = parent[x];      // NO path compression!
        return x;
    }

    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;

        if (rank_[a] < rank_[b]) swap(a, b);

        // Save state BEFORE modification
        history.push_back({b, parent[b]});
        history.push_back({~a, rank_[a]});  // ~a to distinguish rank changes

        parent[b] = a;
        if (rank_[a] == rank_[b]) rank_[a]++;
        return true;
    }

    int save() {
        return history.size();  // save current stack position
    }

    void rollback(int checkpoint) {
        while ((int)history.size() > checkpoint) {
            auto [node, val] = history.back();
            history.pop_back();
            if (node < 0) {
                rank_[~node] = val;   // restore rank
            } else {
                parent[node] = val;   // restore parent
            }
        }
    }

    bool connected(int a, int b) {
        return find(a) == find(b);
    }
};

// Usage:
// int cp = dsu.save();
// dsu.unite(a, b);
// dsu.unite(c, d);
// ... use the dsu ...
// dsu.rollback(cp);  // undo both unions

Applications:

  • Offline dynamic connectivity
  • Divide and conquer on queries (CDQ divide and conquer)
  • Problems where edges are added and removed but the timeline can be processed offline

Complexity: O(log n) per Find (since no path compression), O(1) per rollback step.


Advanced: DSU on Trees (Small-to-Large Merging)

This is a technique for solving tree problems where each node has a set of values, and we need to merge sets when moving from children to parents.

Key insight: Always merge the smaller set into the larger set. This ensures that each element is moved at most O(log n) times (same idea as union by size).

CPP
// Problem: For each subtree, find the most frequent value.
// dsu_on_tree template:

vector<int> color;           // color of each node
vector<vector<int>> adj;     // adjacency list
vector<int> subtreeSize;     // size of subtree rooted at each node
int cnt[MAXVAL];             // frequency of each color
int maxFreq;                 // current max frequency
vector<int> answer;          // answer for each node

void dfs(int v, int p, bool keep) {
    // Find the heavy child (largest subtree)
    int heavy = -1;
    for (int u : adj[v]) {
        if (u == p) continue;
        if (heavy == -1 || subtreeSize[u] > subtreeSize[heavy])
            heavy = u;
    }

    // Process light children first (and remove their contributions)
    for (int u : adj[v]) {
        if (u == p || u == heavy) continue;
        dfs(u, v, false);  // process and remove
    }

    // Process heavy child (keep its contribution)
    if (heavy != -1) dfs(heavy, v, true);

    // Add current node and light children's subtrees
    auto add = [&](int node) {
        cnt[color[node]]++;
        maxFreq = max(maxFreq, cnt[color[node]]);
    };

    add(v);
    for (int u : adj[v]) {
        if (u == p || u == heavy) continue;
        // Add all nodes in light child's subtree
        // (walk subtree with a DFS and call add for each node)
    }

    answer[v] = maxFreq;

    if (!keep) {
        // Remove all contributions (walk subtree and decrement cnt)
        maxFreq = 0;
    }
}

Complexity: O(n log n) total -- each node is added/removed O(log n) times.


Complexity Summary

Implementation Find Unite Naive (no optimization) O(n) O(n) Union by rank only O(log n) O(log n) Path compression only O(log n) am. O(log n) am. Both (rank + compression) O(a(n)) am. O(a(n)) am. With rollback (rank only) O(log n) O(log n) am. = amortized, a(n) = inverse Ackermann function (practically constant, <= 4)

Space: O(n) for the parent and rank/size arrays.


Common Mistakes and Pitfalls

1. Forgetting to Find the Root Before Uniting

CPP
// WRONG
void unite(int a, int b) {
    parent[b] = a;  // should be parent[find(b)] = find(a)!
}

// CORRECT
void unite(int a, int b) {
    a = find(a);
    b = find(b);
    if (a != b) parent[b] = a;
}

2. Using 1-indexed Nodes with 0-indexed DSU

CPP
// If nodes are 1-indexed, make DSU of size n+1
DSU dsu(n + 1);

3. Not Checking if Already Connected

CPP
// If you don't check, you might count a "new" edge that isn't actually new
if (dsu.unite(u, v)) {
    edgesAdded++;  // only count if a real merge happened
}

4. Modifying Rank with Path Compression

After path compression, rank[x] is no longer the exact height -- it's an upper bound. That's fine for correctness, but don't interpret it as the actual tree height.


Practice Problems

ProblemDifficultyKey Idea
Kruskal's MSTEasySort edges + DSU
Number of Connected ComponentsEasyBasic DSU
Redundant Connection (LC 684)MediumFirst edge creating a cycle
Accounts Merge (LC 721)MediumUnion emails, group by root
Number of Islands II (LC 305)HardDynamic connectivity on grid
Swim in Rising Water (LC 778)HardBinary search + DSU (or Kruskal-like)
Largest Component by Common FactorHardUnion numbers sharing prime factors
Satisfiability of Equality (LC 990)MediumUnion ==, then check !=
Making a Large Island (LC 827)HardDSU + enumerate flipping each 0
Minimize Malware Spread (LC 924)HardDSU + counting initial infections

Summary

Disjoint Set Union Two operations: Find(x) and Unite(a, b) Naive: O(n) per operation (chain degeneracy) + Union by Rank: O(log n) per operation (balanced trees) + Path Compression: O(alpha(n)) amortized (practically O(1)) Applications Kruskal's MST Cycle detection in undirected graphs Dynamic connectivity (online) Connected components counting Equivalence class problems Variants Weighted DSU (track component size / min / max / sum) Weighted edges DSU (track relative distances) DSU with rollback (for offline undo) DSU on trees (small-to-large merging)