Graph Algorithms

Strongly Connected Components (Tarjan, Kosaraju, 2-SAT)

Table of Contents

SCCs appear in dependency resolution, circuit analysis, and 2-SAT solvers. This guide covers Kosaraju's and Tarjan's algorithms with deep intuition, the condensation graph, and 2-SAT.



What is a Strongly Connected Component?

A Strongly Connected Component of a directed graph is a maximal set of vertices such that there is a path from every vertex in the set to every other vertex in the set.

Key properties:

  • Directed graphs only -- the concept is meaningless for undirected graphs (where connected components suffice).
  • Maximal -- you cannot add any more vertices and still have every pair mutually reachable.
  • Every vertex belongs to exactly one SCC (even isolated vertices form their own singleton SCC).
  • A single vertex with no self-loop is trivially an SCC of size 1.

Example Graph

Consider this directed graph with 8 vertices:

SCC 1 SCC 2 SCC 3 0 1 2 3 4 5 6 7 (mutual) cross cross cross

SCCs identified:

  • SCC 1: {0, 1, 2} -- cycle: 0 -> 1 -> 2 -> 0
  • SCC 2: {3, 4} -- cycle: 3 -> 4 -> 3
  • SCC 3: {5, 6, 7} -- cycle: 5 -> 6 -> 7 -> 5

Cross-SCC edges go from SCC 1 to SCC 2 (1->3), from SCC 1 to SCC 3 (2->5), and from SCC 2 to SCC 3 (4->6). These cross edges are one-directional -- if there were a path back, the SCCs would merge.

Why SCCs Matter

  1. Condensation: Collapsing each SCC into a single node produces a DAG. Any problem on a general directed graph can be decomposed into a DAG problem + within-SCC reasoning.
  2. Reachability: Within an SCC, every node can reach every other node. Across SCCs, reachability follows the condensation DAG.
  3. 2-SAT: The satisfiability of 2-SAT formulas is determined entirely by the SCC structure of the implication graph.

Kosaraju's Algorithm

Kosaraju's algorithm finds all SCCs using two DFS passes and the transpose graph. It is conceptually elegant and easy to implement.

The Core Idea

  1. First DFS: Run DFS on the original graph. Record the finish order (the order in which DFS completes a node -- i.e., after all descendants are processed).
  2. Reverse the graph: Build the transpose graph (reverse all edge directions).
  3. Second DFS: Process nodes in reverse finish order on the transpose graph. Each DFS tree in this pass is one SCC.

Why It Works -- Deep Intuition

This is the key insight most people miss. Let us build the reasoning step by step.

Observation 1: Finish times and reachability. In DFS, if node u can reach node v, and they are in different DFS trees or u is an ancestor of v, then u finishes after v. The finish time captures a "topological-ish" ordering.

Observation 2: The transpose preserves SCCs. If u and v are in the same SCC in graph G, they are also in the same SCC in the transpose G^T (just reverse all paths).

Observation 3: Cross-SCC edges flip direction. If SCC_A has an edge to SCC_B in G, then in G^T, SCC_B has an edge to SCC_A. The condensation DAG is reversed.

Observation 4: The second DFS peels off SCCs in topological order. The node with the latest finish time in the first DFS belongs to a "source SCC" in the condensation DAG (an SCC with no incoming cross-SCC edges). In the transpose graph, this SCC becomes a "sink" -- it has no outgoing cross-SCC edges. So when we DFS from this node in G^T, we can only reach nodes within this same SCC. We mark them, remove the SCC, and repeat with the next unvisited node in reverse finish order.

Summary of why Kosaraju works:

1. First DFS gives finish order. The last-finished node is in a "source SCC"
   of the condensation DAG.

2. In the transpose graph, that source SCC becomes a sink SCC.
   DFS from any node in a sink SCC cannot escape that SCC.

3. So second DFS (in reverse finish order, on transpose) correctly
   identifies one SCC per DFS tree.

4. After removing that SCC, the next node in reverse finish order belongs
   to the next source SCC in the remaining condensation DAG. Repeat.

Step-by-Step Walkthrough

Using the same 8-node example graph from above:

Edges: 0->1, 1->2, 2->0, 1->3, 3->4, 4->3, 2->5, 4->6, 5->6, 6->7, 7->5

Adjacency list:
0: [1]
1: [2, 3]
2: [0, 5]
3: [4]
4: [3, 6]
5: [6]
6: [7]
7: [5]

Pass 1: DFS on original graph (recording finish order)

Start DFS from node 0:
  Visit 0 -> Visit 1 -> Visit 2 -> Visit 0 (already visited)
                                  -> Visit 5 -> Visit 6 -> Visit 7 -> Visit 5 (visited)
                                      Finish 7 (order pos 0)
                                    Finish 6 (order pos 1)
                                  Finish 5 (order pos 2)
                                Finish 2 (order pos 3)
                              -> Visit 3 -> Visit 4 -> Visit 3 (visited)
                                                      -> Visit 6 (visited)
                                    Finish 4 (order pos 4)
                                  Finish 3 (order pos 5)
                                Finish 1 (order pos 6)
                              Finish 0 (order pos 7)

Finish order (first to last): [7, 6, 5, 2, 4, 3, 1, 0]
Reverse finish order:         [0, 1, 3, 4, 2, 5, 6, 7]

Build transpose graph:

Transpose adjacency list (reverse all edges):
0: [2]
1: [0]
2: [1]
3: [1, 4]
4: [3]
5: [2, 7]
6: [5, 4]
7: [6]

Pass 2: DFS on transpose in reverse finish order

Process node 0 (unvisited):
  DFS: 0 -> 2 -> 1 -> 0 (visited)
  SCC 1 = {0, 2, 1} = {0, 1, 2}

Process node 3 (unvisited):
  DFS: 3 -> 4 -> 3 (visited)
       (1 already visited)
  SCC 2 = {3, 4}

Process node 5 (unvisited):
  DFS: 5 -> 7 -> 6 -> 5 (visited)
       (2 already visited, 4 already visited)
  SCC 3 = {5, 7, 6} = {5, 6, 7}

Result: {0,1,2}, {3,4}, {5,6,7}  -- Correct!

Implementation

#include <bits/stdc++.h>
using namespace std;

class KosarajuSCC {
    int n;
    vector<vector<int>> adj, radj;
    vector<bool> visited;
    vector<int> order;        // finish order
    vector<int> comp;         // comp[v] = SCC id of vertex v

    void dfs1(int v) {
        visited[v] = true;
        for (int u : adj[v]) {
            if (!visited[u])
                dfs1(u);
        }
        order.push_back(v);   // record finish time
    }

    void dfs2(int v, int c) {
        comp[v] = c;
        for (int u : radj[v]) {
            if (comp[u] == -1)
                dfs2(u, c);
        }
    }

public:
    // Returns number of SCCs. comp[v] gives the SCC id for vertex v.
    // SCC ids are in reverse topological order of the condensation DAG:
    //   if SCC a has an edge to SCC b, then a's id < b's id.
    int findSCCs(int numNodes, vector<vector<int>>& graph) {
        n = numNodes;
        adj = graph;
        radj.assign(n, {});
        visited.assign(n, false);
        comp.assign(n, -1);
        order.clear();

        // Build transpose graph
        for (int v = 0; v < n; v++)
            for (int u : adj[v])
                radj[u].push_back(v);

        // Pass 1: DFS on original graph
        for (int v = 0; v < n; v++)
            if (!visited[v])
                dfs1(v);

        // Pass 2: DFS on transpose in reverse finish order
        int numSCCs = 0;
        for (int i = n - 1; i >= 0; i--) {
            int v = order[i];
            if (comp[v] == -1) {
                dfs2(v, numSCCs);
                numSCCs++;
            }
        }
        return numSCCs;
    }

    vector<int>& getComponents() { return comp; }

    // Get list of vertices in each SCC
    vector<vector<int>> getSCCLists() {
        int numSCCs = *max_element(comp.begin(), comp.end()) + 1;
        vector<vector<int>> sccs(numSCCs);
        for (int v = 0; v < n; v++)
            sccs[comp[v]].push_back(v);
        return sccs;
    }
};

int main() {
    int n = 8;
    vector<vector<int>> adj(n);
    adj[0] = {1};
    adj[1] = {2, 3};
    adj[2] = {0, 5};
    adj[3] = {4};
    adj[4] = {3, 6};
    adj[5] = {6};
    adj[6] = {7};
    adj[7] = {5};

    KosarajuSCC scc;
    int numSCCs = scc.findSCCs(n, adj);

    cout << "Number of SCCs: " << numSCCs << "\n";
    auto sccs = scc.getSCCLists();
    for (int i = 0; i < numSCCs; i++) {
        cout << "SCC " << i << ": ";
        for (int v : sccs[i]) cout << v << " ";
        cout << "\n";
    }
    return 0;
}

Complexity

  • Time: O(V + E) -- two DFS passes, each O(V + E), plus O(E) to build the transpose graph.
  • Space: O(V + E) -- for the transpose graph, visited array, order stack, and component labels.

Tarjan's Algorithm

Tarjan's algorithm finds all SCCs in a single DFS pass using the concept of low-link values and an explicit stack. It is more subtle than Kosaraju's but more efficient in practice (one pass, no transpose graph).

Key Concepts

Discovery time (disc[v]): The order in which DFS first visits node v.

Low-link value (low[v]): The smallest discovery time reachable from v through DFS tree edges and at most one back edge. Formally, low[v] = min discovery time of any node reachable from the subtree rooted at v that is currently on the stack.

On-stack tracking (onStack[v]): Whether node v is currently on the DFS stack. This is critical -- we only consider back edges to nodes that are still on the stack (i.e., nodes whose SCC has not yet been finalized).

Why It Works -- Deep Intuition

Core insight: A node v is the root of an SCC if and only if low[v] == disc[v].

Why? If low[v] == disc[v], it means no node in v's subtree can reach any node discovered before v (that is still on the stack). This means v and all nodes above it on the stack (up to v) form a self-contained strongly connected component -- they can all reach each other through paths within this group, but none of them can reach "upward" past v in the DFS tree.

The algorithm invariant:

1. Maintain a stack of nodes in the current DFS path and unassigned SCCs.

2. When we finish processing node v:
   - If low[v] == disc[v], then v is the root of an SCC.
     Pop everything from the stack down to and including v.
     These popped nodes form one SCC.
   - If low[v] < disc[v], then v can reach an ancestor.
     v is NOT the root. Leave it on the stack.

3. low[v] is updated by:
   - Tree edges: low[v] = min(low[v], low[child])
   - Back/cross edges to on-stack nodes: low[v] = min(low[v], disc[neighbor])
   - Edges to already-assigned nodes (not on stack): ignored

Why ignore nodes not on the stack? If a neighbor w has already been popped (assigned to an SCC), then w's SCC is fully resolved. Taking w's discovery time would incorrectly suggest we can reach ancestors through w, but we cannot -- w's SCC is a separate, finalized component.

Step-by-Step Walkthrough

Using the same 8-node graph:

Edges: 0->1, 1->2, 2->0, 1->3, 3->4, 4->3, 2->5, 4->6, 5->6, 6->7, 7->5

Adjacency list:
0: [1]
1: [2, 3]
2: [0, 5]
3: [4]
4: [3, 6]
5: [6]
6: [7]
7: [5]
DFS starts at node 0. Timer = 0.

Visit 0: disc[0]=0, low[0]=0, stack=[0]
  Visit 1: disc[1]=1, low[1]=1, stack=[0,1]
    Visit 2: disc[2]=2, low[2]=2, stack=[0,1,2]
      Edge 2->0: 0 is on stack, low[2] = min(2, disc[0]) = min(2,0) = 0
      Visit 5: disc[5]=3, low[5]=3, stack=[0,1,2,5]
        Visit 6: disc[6]=4, low[6]=4, stack=[0,1,2,5,6]
          Visit 7: disc[7]=5, low[7]=5, stack=[0,1,2,5,6,7]
            Edge 7->5: 5 is on stack, low[7] = min(5, disc[5]) = min(5,3) = 3
          Back to 7: low[7]=3 != disc[7]=5, not root. Return.
        Back to 6: low[6] = min(4, low[7]) = min(4,3) = 3
        Back to 6: low[6]=3 != disc[6]=4, not root. Return.
      Back to 5: low[5] = min(3, low[6]) = min(3,3) = 3
      Back to 5: low[5]=3 == disc[5]=3 => ROOT! Pop until 5:
        Pop 7 -> SCC
        Pop 6 -> SCC
        Pop 5 -> SCC
        SCC = {5, 6, 7}
        Stack now: [0,1,2]
    Back to 2: low[2] = min(0, --) = 0 (low was already 0 from edge 2->0)
    Back to 2: low[2]=0 != disc[2]=2, not root. Return.
  Back to 1: low[1] = min(1, low[2]) = min(1,0) = 0
    Visit 3: disc[3]=6, low[3]=6, stack=[0,1,3]
      Visit 4: disc[4]=7, low[4]=7, stack=[0,1,3,4]
        Edge 4->3: 3 is on stack, low[4] = min(7, disc[3]) = min(7,6) = 6
        Edge 4->6: 6 is NOT on stack (already assigned), skip.
      Back to 4: low[4]=6 != disc[4]=7, not root. Return.
    Back to 3: low[3] = min(6, low[4]) = min(6,6) = 6
    Back to 3: low[3]=6 == disc[3]=6 => ROOT! Pop until 3:
      Pop 4 -> SCC
      Pop 3 -> SCC
      SCC = {3, 4}
      Stack now: [0,1,2]
  Back to 1: low[1] = min(0, --) = 0 (unchanged)
  Back to 1: low[1]=0 != disc[1]=1, not root. Return.
Back to 0: low[0] = min(0, low[1]) = min(0,0) = 0
Back to 0: low[0]=0 == disc[0]=0 => ROOT! Pop until 0:
  Pop 2 -> SCC
  Pop 1 -> SCC
  Pop 0 -> SCC
  SCC = {0, 1, 2}

Final SCCs: {5,6,7}, {3,4}, {0,1,2}  -- Correct!

Detailed stack trace:

Visit 0:  disc=0 low=0  stack=[0]
Visit 1:  disc=1 low=1  stack=[0,1]
Visit 2:  disc=2 low=2  stack=[0,1,2]
  2->0: on stack, low[2]=min(2,0)=0
Visit 5:  disc=3 low=3  stack=[0,1,2,5]
Visit 6:  disc=4 low=4  stack=[0,1,2,5,6]
Visit 7:  disc=5 low=5  stack=[0,1,2,5,6,7]
  7->5: on stack, low[7]=min(5,3)=3
  Done 7: low=3 != disc=5, not root
  Back at 6: low[6]=min(4,3)=3
  Done 6: low=3 != disc=4, not root
  Back at 5: low[5]=min(3,3)=3
  Done 5: low=3 == disc=3 => ROOT!
    Pop: 7, 6, 5  => SCC = {5, 6, 7}
    stack=[0,1,2]

  Back at 2: (5 already handled, low[2] stays 0)
  Done 2: low=0 != disc=2, not root
  Back at 1: low[1]=min(1,0)=0
Visit 3:  disc=6 low=6  stack=[0,1,2,3]
Visit 4:  disc=7 low=7  stack=[0,1,2,3,4]
  4->3: on stack, low[4]=min(7,6)=6
  4->6: NOT on stack (already in SCC), skip
  Done 4: low=6 != disc=7, not root
  Back at 3: low[3]=min(6,6)=6
  Done 3: low=6 == disc=6 => ROOT!
    Pop: 4, 3  => SCC = {3, 4}
    stack=[0,1,2]

  Back at 1: low[1] stays 0
  Done 1: low=0 != disc=1, not root
  Back at 0: low[0]=min(0,0)=0
  Done 0: low=0 == disc=0 => ROOT!
    Pop: 2, 1, 0  => SCC = {0, 1, 2}
    stack=[]

Final SCCs: {5,6,7}, {3,4}, {0,1,2}  -- Correct!

Implementation

#include <bits/stdc++.h>
using namespace std;

class TarjanSCC {
    int n, timer, numSCCs;
    vector<vector<int>> adj;
    vector<int> disc, low, comp;
    vector<bool> onStack;
    stack<int> st;

    void dfs(int v) {
        disc[v] = low[v] = timer++;
        st.push(v);
        onStack[v] = true;

        for (int u : adj[v]) {
            if (disc[u] == -1) {
                // Tree edge
                dfs(u);
                low[v] = min(low[v], low[u]);
            } else if (onStack[u]) {
                // Back edge or cross edge to node in current path
                low[v] = min(low[v], disc[u]);
            }
            // If u is already assigned to an SCC (not on stack), ignore
        }

        // If v is root of an SCC
        if (low[v] == disc[v]) {
            while (true) {
                int u = st.top();
                st.pop();
                onStack[u] = false;
                comp[u] = numSCCs;
                if (u == v) break;
            }
            numSCCs++;
        }
    }

public:
    // Returns number of SCCs. comp[v] gives SCC id of vertex v.
    // SCC ids are in reverse topological order of the condensation DAG.
    int findSCCs(int numNodes, vector<vector<int>>& graph) {
        n = numNodes;
        adj = graph;
        timer = 0;
        numSCCs = 0;
        disc.assign(n, -1);
        low.assign(n, -1);
        comp.assign(n, -1);
        onStack.assign(n, false);

        for (int v = 0; v < n; v++)
            if (disc[v] == -1)
                dfs(v);

        return numSCCs;
    }

    vector<int>& getComponents() { return comp; }

    vector<vector<int>> getSCCLists() {
        vector<vector<int>> sccs(numSCCs);
        for (int v = 0; v < n; v++)
            sccs[comp[v]].push_back(v);
        return sccs;
    }
};

int main() {
    int n = 8;
    vector<vector<int>> adj(n);
    adj[0] = {1};
    adj[1] = {2, 3};
    adj[2] = {0, 5};
    adj[3] = {4};
    adj[4] = {3, 6};
    adj[5] = {6};
    adj[6] = {7};
    adj[7] = {5};

    TarjanSCC scc;
    int numSCCs = scc.findSCCs(n, adj);

    cout << "Number of SCCs: " << numSCCs << "\n";
    auto sccs = scc.getSCCLists();
    for (int i = 0; i < numSCCs; i++) {
        cout << "SCC " << i << ": ";
        for (int v : sccs[i]) cout << v << " ";
        cout << "\n";
    }
    return 0;
}

Complexity

  • Time: O(V + E) -- single DFS pass, each edge and vertex processed once.
  • Space: O(V) -- for disc, low, comp arrays, on-stack flags, and the explicit stack.

Subtle Point: disc[u] vs low[u] in Back Edges

When we encounter a back edge to an on-stack node u, we write:

low[v] = min(low[v], disc[u]);   // Tarjan's original formulation

Some implementations use low[u] instead:

low[v] = min(low[v], low[u]);    // Also correct for finding SCCs

Both correctly identify SCC roots, but disc[u] is the standard Tarjan's formulation. Using low[u] can give different low-link values but still yields the same SCCs because the root condition low[v] == disc[v] works either way.


Comparison: Kosaraju vs Tarjan

Aspect Kosaraju Tarjan DFS passes Two (original + transpose) One Extra graph Needs transpose graph No extra graph Time complexity O(V + E) O(V + E) Space complexity O(V + E) for transpose O(V) (no extra graph) Ease of understanding Easier to grasp intuitively Requires low-link reasoning Ease of coding Slightly simpler More bookkeeping SCC order Topological order of DAG Reverse topological order Also finds bridges/ articulation points? No Yes (with minor modifications)

When to Use Which

  • Kosaraju's: When you want simplicity and clarity, or when you also need the transpose graph for other purposes. Great for interviews when you need to explain your approach clearly.
  • Tarjan's: When memory matters (no transpose graph), when you need a single-pass solution, or when you also want to find bridges/articulation points. Preferred in competitive programming.
  • Both give SCCs in (reverse) topological order of the condensation DAG, which is useful for building the condensation.

Condensation Graph

The condensation of a directed graph is the DAG formed by contracting each SCC into a single node. Cross-SCC edges become edges in the condensation.

Why It Matters

The condensation transforms any directed graph into a DAG. This unlocks all DAG techniques:

  • Topological sort
  • DP on DAGs (longest path, counting paths, etc.)
  • Reachability queries

Visual: From Graph to Condensation

Original Graph SCC 0 0 1 2 SCC 1 3 4 SCC 2 5 6 7 --> condense Condensation DAG C0 {0,1,2} C1 {3,4} C2 {5,6,7}

Building the Condensation

#include <bits/stdc++.h>
using namespace std;

// After running Kosaraju or Tarjan to get comp[v] for each vertex
// and numSCCs total components:
struct Condensation {
    int numNodes;                       // number of super-nodes (SCCs)
    vector<vector<int>> adj;            // adjacency list of the DAG
    vector<int> sccSize;                // number of vertices in each SCC
    vector<vector<int>> sccMembers;     // vertices in each SCC

    void build(int n, vector<vector<int>>& originalAdj,
               vector<int>& comp, int numSCCs) {
        numNodes = numSCCs;
        adj.assign(numSCCs, {});
        sccSize.assign(numSCCs, 0);
        sccMembers.assign(numSCCs, {});

        // Record SCC membership
        for (int v = 0; v < n; v++) {
            sccSize[comp[v]]++;
            sccMembers[comp[v]].push_back(v);
        }

        // Build DAG edges (deduplicate with a set)
        set<pair<int,int>> dagEdges;
        for (int v = 0; v < n; v++) {
            for (int u : originalAdj[v]) {
                if (comp[v] != comp[u]) {
                    dagEdges.insert({comp[v], comp[u]});
                }
            }
        }

        for (auto [a, b] : dagEdges) {
            adj[a].push_back(b);
        }
    }
};

Application: DP on Condensation DAG

A common pattern: assign values to nodes, collapse SCCs (all nodes in an SCC share their combined value), then run DP on the DAG.

Example: Find the maximum sum of node values you can collect starting from any node and following directed edges.

// After building condensation with sccSize[] and DAG adj[]:
// Suppose each original node has value val[v].
// sccVal[c] = sum of val[v] for all v in SCC c.

vector<long long> sccVal(numSCCs, 0);
for (int v = 0; v < n; v++)
    sccVal[comp[v]] += val[v];

// Topological sort the condensation DAG
vector<int> indeg(numSCCs, 0);
for (int c = 0; c < numSCCs; c++)
    for (int d : dagAdj[c])
        indeg[d]++;

queue<int> q;
for (int c = 0; c < numSCCs; c++)
    if (indeg[c] == 0)
        q.push(c);

vector<long long> dp(numSCCs);
for (int c = 0; c < numSCCs; c++)
    dp[c] = sccVal[c];

// Process in topological order
vector<int> topoOrder;
while (!q.empty()) {
    int c = q.front(); q.pop();
    topoOrder.push_back(c);
    for (int d : dagAdj[c]) {
        dp[d] = max(dp[d], dp[c] + sccVal[d]);
        if (--indeg[d] == 0)
            q.push(d);
    }
}

long long answer = *max_element(dp.begin(), dp.end());

Complexity

  • Building condensation: O(V + E)
  • DP on condensation: O(V_scc + E_scc) where V_scc and E_scc are vertices and edges of the condensation DAG.

2-SAT (Boolean Satisfiability)

2-SAT is a special case of the Boolean satisfiability problem where each clause has exactly 2 literals. It can be solved in polynomial time using SCCs -- a beautiful connection between logic and graph theory.

The Problem

Given n boolean variables x_1, x_2, ..., x_n and m clauses, each of the form (a OR b) where a, b are literals (x_i or NOT x_i), determine if there is an assignment of true/false to all variables that satisfies all clauses simultaneously.

The Implication Graph

Every clause (a OR b) is equivalent to two implications:

(a OR b)  is equivalent to  (NOT a => b) AND (NOT b => a)

Examples:
  (x1 OR x2)       =>  (!x1 => x2) AND (!x2 => x1)
  (x1 OR !x3)      =>  (!x1 => !x3) AND (x3 => x1)
  (!x2 OR !x3)     =>  (x2 => !x3) AND (x3 => !x2)

We build a directed graph (the implication graph) with 2n nodes: for each variable x_i, we have nodes x_i (true) and !x_i (false). For each clause, we add the two implication edges.

Implication Graph for (x1 OR x2) AND (x1 OR !x3) AND (!x2 OR !x3) x1 !x1 x2 !x2 x3 !x3 !x1=>x2 !x2=>x1 !x1=>!x3 x3=>x1 x2=>!x3 x3=>!x2

The SCC Connection

Theorem: A 2-SAT formula is satisfiable if and only if no variable x_i and its negation !x_i are in the same SCC.

Why? If x_i and !x_i are in the same SCC, there is a path from x_i to !x_i AND from !x_i to x_i. This means x_i => !x_i and !x_i => x_i, which is a contradiction (x_i must be both true and false).

Extracting the Assignment

Once we know the formula is satisfiable, we extract the assignment using the topological order of the condensation:

Rule: For each variable x_i, if comp[x_i] > comp[!x_i] (in Kosaraju's topological numbering, where higher id = later in topological order), set x_i = true. Otherwise, set x_i = false. (If using Tarjan's, the comparison is reversed since Tarjan assigns ids in reverse topological order.)

Intuition: In the condensation DAG, implications flow forward. If !x_i's SCC comes before x_i's SCC in topological order, then the chain of implications leads from "false" to "true", meaning we should assign x_i = true (the implications push us toward truth).

Complete 2-SAT Solver

#include <bits/stdc++.h>
using namespace std;

class TwoSAT {
    int n;                          // number of variables (0-indexed)
    vector<vector<int>> adj, radj;
    vector<int> order, comp;
    vector<bool> visited;

    void dfs1(int v) {
        visited[v] = true;
        for (int u : adj[v])
            if (!visited[u])
                dfs1(u);
        order.push_back(v);
    }

    void dfs2(int v, int c) {
        comp[v] = c;
        for (int u : radj[v])
            if (comp[u] == -1)
                dfs2(u, c);
    }

public:
    // n = number of boolean variables (0-indexed: x_0, x_1, ..., x_{n-1})
    // Node encoding: variable x_i -> node 2*i (true), node 2*i+1 (false/negation)
    TwoSAT(int n) : n(n), adj(2 * n), radj(2 * n) {}

    // Helper: given a literal, return its graph node index.
    // Literal i (positive) -> 2*i, literal ~i (negative) -> 2*i+1
    // In practice, use addClause with the raw variable index and a sign.

    // Add clause (a OR b).
    // a is encoded as: variable index + sign. Pass (var, true) for x_var,
    // (var, false) for !x_var.
    void addClause(int u, bool nu, int v, bool nv) {
        // (u=nu OR v=nv)
        // !nu => nv  AND  !nv => nu
        // node for (u, nu): 2*u + (nu ? 0 : 1)
        // node for (u, !nu): 2*u + (nu ? 1 : 0)
        int a = 2 * u + (nu ? 0 : 1);   // node for literal u=nu
        int na = 2 * u + (nu ? 1 : 0);  // node for literal u=!nu
        int b = 2 * v + (nv ? 0 : 1);   // node for literal v=nv
        int nb = 2 * v + (nv ? 1 : 0);  // node for literal v=!nv

        // Implication: !a => b  (if a is false, b must be true)
        adj[na].push_back(b);
        radj[b].push_back(na);

        // Implication: !b => a  (if b is false, a must be true)
        adj[nb].push_back(a);
        radj[a].push_back(nb);
    }

    // Add constraint: variable u must be val (true/false).
    // This is equivalent to the clause (u=val OR u=val).
    void setTrue(int u, bool val) {
        addClause(u, val, u, val);
    }

    // Solve: returns true if satisfiable, fills 'result' with assignment.
    bool solve(vector<bool>& result) {
        int N = 2 * n;
        visited.assign(N, false);
        order.clear();
        comp.assign(N, -1);

        // Kosaraju pass 1
        for (int i = 0; i < N; i++)
            if (!visited[i])
                dfs1(i);

        // Kosaraju pass 2
        int numSCCs = 0;
        for (int i = N - 1; i >= 0; i--) {
            int v = order[i];
            if (comp[v] == -1)
                dfs2(v, numSCCs++);
        }

        // Check satisfiability
        result.assign(n, false);
        for (int i = 0; i < n; i++) {
            if (comp[2 * i] == comp[2 * i + 1])
                return false;  // x_i and !x_i in same SCC => contradiction
            // In Kosaraju, earlier SCC id = earlier in topological order.
            // If comp[2i] > comp[2i+1], then x_i's SCC comes later
            // (topologically after !x_i), so x_i = true.
            result[i] = comp[2 * i] > comp[2 * i + 1];
        }
        return true;
    }
};

int main() {
    // Example: (x0 OR x1) AND (x0 OR !x2) AND (!x1 OR !x2)
    // Variables: x0, x1, x2
    TwoSAT sat(3);
    sat.addClause(0, true, 1, true);    // x0 OR x1
    sat.addClause(0, true, 2, false);   // x0 OR !x2
    sat.addClause(1, false, 2, false);  // !x1 OR !x2

    vector<bool> result;
    if (sat.solve(result)) {
        cout << "Satisfiable!\n";
        for (int i = 0; i < 3; i++)
            cout << "x" << i << " = " << (result[i] ? "true" : "false") << "\n";
    } else {
        cout << "Unsatisfiable\n";
    }
    // Output: x0 = true, x1 = true, x2 = false (one valid assignment)
    return 0;
}

Walkthrough Example

Formula: (x0 OR x1) AND (x0 OR !x2) AND (!x1 OR !x2)

Implications:
  Clause 1: !x0 => x1,   !x1 => x0
  Clause 2: !x0 => !x2,   x2 => x0
  Clause 3:  x1 => !x2,   x2 => !x1

Implication graph (6 nodes: x0, !x0, x1, !x1, x2, !x2):

  !x0 -> x1, !x2
  !x1 -> x0
   x1 -> !x2
   x2 -> x0, !x1
  !x2 -> (no outgoing)
   x0 -> (no outgoing)

SCCs: Each node is its own SCC (no cycles).
Check: x0 and !x0 in different SCCs? Yes.
       x1 and !x1 in different SCCs? Yes.
       x2 and !x2 in different SCCs? Yes.
=> Satisfiable!

Assignment (using topological order):
  x0: comp[x0] > comp[!x0]?  x0 is a sink (later topo order) => true
  x1: comp[x1] > comp[!x1]?  x1 reachable from !x0 => true
  x2: comp[x2] > comp[!x2]?  !x2 is the sink => x2 = false

Result: x0=true, x1=true, x2=false
Verify: (T OR T)=T, (T OR T)=T, (F OR T)=T => All satisfied!

Complexity

  • Time: O(n + m) where n = number of variables, m = number of clauses. The implication graph has 2n nodes and 2m edges.
  • Space: O(n + m)

Classic Problems with Full Solutions

Problem 1: Number of Strongly Connected Components

Problem: Given a directed graph with V vertices and E edges, find the number of SCCs.

This is the direct application of Kosaraju's or Tarjan's algorithm.

#include <bits/stdc++.h>
using namespace std;

int main() {
    int V, E;
    cin >> V >> E;

    vector<vector<int>> adj(V), radj(V);
    for (int i = 0; i < E; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        radj[v].push_back(u);
    }

    // Kosaraju's algorithm
    vector<bool> visited(V, false);
    vector<int> order;

    // Pass 1: DFS on original graph
    function<void(int)> dfs1 = [&](int v) {
        visited[v] = true;
        for (int u : adj[v])
            if (!visited[u])
                dfs1(u);
        order.push_back(v);
    };

    for (int i = 0; i < V; i++)
        if (!visited[i])
            dfs1(i);

    // Pass 2: DFS on transpose in reverse finish order
    vector<int> comp(V, -1);
    int numSCCs = 0;

    function<void(int, int)> dfs2 = [&](int v, int c) {
        comp[v] = c;
        for (int u : radj[v])
            if (comp[u] == -1)
                dfs2(u, c);
    };

    for (int i = V - 1; i >= 0; i--) {
        if (comp[order[i]] == -1) {
            dfs2(order[i], numSCCs);
            numSCCs++;
        }
    }

    cout << numSCCs << "\n";
    return 0;
}

Complexity: Time O(V + E), Space O(V + E).


Problem 2: Critical Connections (Bridges) in a Graph (LeetCode 1192)

Problem: Given an undirected graph, find all bridges -- edges whose removal disconnects the graph.

While this problem is about undirected graphs and bridges (not SCCs directly), it uses the same low-link technique from Tarjan's algorithm. A bridge is an edge (u, v) where low[v] > disc[u] -- meaning v cannot reach u or any of u's ancestors without using the edge (u, v).

#include <bits/stdc++.h>
using namespace std;

class Solution {
public:
    vector<vector<int>> criticalConnections(int n, vector<vector<int>>& connections) {
        vector<vector<int>> adj(n);
        for (auto& e : connections) {
            adj[e[0]].push_back(e[1]);
            adj[e[1]].push_back(e[0]);
        }

        vector<int> disc(n, -1), low(n, -1);
        vector<vector<int>> bridges;
        int timer = 0;

        function<void(int, int)> dfs = [&](int u, int parent) {
            disc[u] = low[u] = timer++;

            for (int v : adj[u]) {
                if (v == parent) continue;  // skip the edge we came from

                if (disc[v] == -1) {
                    // Tree edge
                    dfs(v, u);
                    low[u] = min(low[u], low[v]);

                    // Bridge condition: v cannot reach u or above
                    if (low[v] > disc[u]) {
                        bridges.push_back({u, v});
                    }
                } else {
                    // Back edge
                    low[u] = min(low[u], disc[v]);
                }
            }
        };

        for (int i = 0; i < n; i++)
            if (disc[i] == -1)
                dfs(i, -1);

        return bridges;
    }
};

int main() {
    int n = 4;
    vector<vector<int>> connections = {{0,1},{1,2},{2,0},{1,3}};
    Solution sol;
    auto bridges = sol.criticalConnections(n, connections);
    for (auto& b : bridges)
        cout << b[0] << " - " << b[1] << "\n";
    // Output: 1 - 3  (removing edge 1-3 disconnects node 3)
    return 0;
}

Why this is related to SCCs: In a directed graph, bridges (in the undirected sense) are edges that, if removed, increase the number of connected components. In the directed version, Tarjan's low-link values detect exactly when a subtree has no back edge reaching above -- the same principle underlying both bridge detection and SCC detection.

Complexity: Time O(V + E), Space O(V + E).


Problem 3: Course Schedule II / Detect Cycle in Directed Graph (LeetCode 210)

Problem: Given numCourses and prerequisite pairs, return an ordering of courses you should take to finish all courses. If impossible (cycle exists), return empty.

A cycle in a directed graph means at least one SCC has size > 1. If all SCCs are singletons, the graph is a DAG and a topological order exists.

#include <bits/stdc++.h>
using namespace std;

class Solution {
public:
    vector<int> findOrder(int numCourses, vector<vector<int>>& prerequisites) {
        int n = numCourses;
        vector<vector<int>> adj(n), radj(n);

        for (auto& p : prerequisites) {
            // p[0] depends on p[1]: edge from p[1] to p[0]
            adj[p[1]].push_back(p[0]);
            radj[p[0]].push_back(p[1]);
        }

        // Kosaraju to find SCCs
        vector<bool> visited(n, false);
        vector<int> order;

        function<void(int)> dfs1 = [&](int v) {
            visited[v] = true;
            for (int u : adj[v])
                if (!visited[u])
                    dfs1(u);
            order.push_back(v);
        };

        for (int i = 0; i < n; i++)
            if (!visited[i])
                dfs1(i);

        vector<int> comp(n, -1);
        vector<int> sccSize;
        int numSCCs = 0;

        function<void(int, int)> dfs2 = [&](int v, int c) {
            comp[v] = c;
            for (int u : radj[v])
                if (comp[u] == -1)
                    dfs2(u, c);
        };

        for (int i = n - 1; i >= 0; i--) {
            if (comp[order[i]] == -1) {
                dfs2(order[i], numSCCs);
                numSCCs++;
            }
        }

        // Check for cycles: any SCC with size > 1 means a cycle
        vector<int> sizes(numSCCs, 0);
        for (int i = 0; i < n; i++)
            sizes[comp[i]]++;

        for (int s : sizes)
            if (s > 1)
                return {};  // Cycle detected

        // Also check for self-loops
        for (int v = 0; v < n; v++)
            for (int u : adj[v])
                if (u == v)
                    return {};

        // No cycle => topological order is the reverse finish order from pass 1
        vector<int> result(order.rbegin(), order.rend());
        return result;
    }
};

int main() {
    int numCourses = 4;
    vector<vector<int>> prereqs = {{1,0},{2,0},{3,1},{3,2}};
    Solution sol;
    auto order = sol.findOrder(numCourses, prereqs);
    if (order.empty()) {
        cout << "Impossible (cycle detected)\n";
    } else {
        for (int c : order) cout << c << " ";
        cout << "\n";
    }
    // Output: 0 1 2 3 (or 0 2 1 3)
    return 0;
}

Note: For this specific problem, Kahn's algorithm (BFS topological sort) is simpler. But the SCC approach generalizes -- it tells you exactly which courses form cycles, which is useful for debugging and follow-up questions.

Complexity: Time O(V + E), Space O(V + E).


Problem 4: 2-SAT Application -- Party Invitation

Problem: You are organizing a party with n pairs of people. From each pair, you must invite exactly one person. Some people have conflicts: certain pairs of people cannot both attend. Determine if a valid invitation list exists, and if so, find one.

This is a classic 2-SAT problem. For each pair i, let x_i = true mean "invite person A from pair i" and x_i = false mean "invite person B from pair i". A conflict between person A of pair i and person B of pair j translates to the clause: "NOT(x_i AND !x_j)" which is "(!x_i OR x_j)".

#include <bits/stdc++.h>
using namespace std;

class TwoSAT {
    int n;
    vector<vector<int>> adj, radj;
    vector<int> order, comp;
    vector<bool> visited;

    void dfs1(int v) {
        visited[v] = true;
        for (int u : adj[v])
            if (!visited[u]) dfs1(u);
        order.push_back(v);
    }

    void dfs2(int v, int c) {
        comp[v] = c;
        for (int u : radj[v])
            if (comp[u] == -1) dfs2(u, c);
    }

public:
    TwoSAT(int n) : n(n), adj(2 * n), radj(2 * n) {}

    void addClause(int u, bool nu, int v, bool nv) {
        int a = 2 * u + (nu ? 0 : 1);
        int na = 2 * u + (nu ? 1 : 0);
        int b = 2 * v + (nv ? 0 : 1);
        int nb = 2 * v + (nv ? 1 : 0);
        adj[na].push_back(b);
        radj[b].push_back(na);
        adj[nb].push_back(a);
        radj[a].push_back(nb);
    }

    bool solve(vector<bool>& result) {
        int N = 2 * n;
        visited.assign(N, false);
        order.clear();
        comp.assign(N, -1);
        for (int i = 0; i < N; i++)
            if (!visited[i]) dfs1(i);
        int numSCCs = 0;
        for (int i = N - 1; i >= 0; i--)
            if (comp[order[i]] == -1) dfs2(order[i], numSCCs++);
        result.assign(n, false);
        for (int i = 0; i < n; i++) {
            if (comp[2 * i] == comp[2 * i + 1]) return false;
            result[i] = comp[2 * i] > comp[2 * i + 1];
        }
        return true;
    }
};

int main() {
    // 3 pairs: (A0,B0), (A1,B1), (A2,B2)
    // Conflicts: A0 and B1 cannot both attend, A1 and A2 cannot both attend
    int numPairs = 3;
    TwoSAT sat(numPairs);

    // Conflict: A0 and B1 can't both come
    // A0 = x0=true, B1 = x1=false
    // Clause: NOT(x0 AND !x1) = (!x0 OR x1)
    sat.addClause(0, false, 1, true);  // !x0 OR x1

    // Conflict: A1 and A2 can't both come
    // A1 = x1=true, A2 = x2=true
    // Clause: NOT(x1 AND x2) = (!x1 OR !x2)
    sat.addClause(1, false, 2, false);  // !x1 OR !x2

    vector<bool> result;
    if (sat.solve(result)) {
        cout << "Valid invitation:\n";
        for (int i = 0; i < numPairs; i++)
            cout << "Pair " << i << ": invite person "
                 << (result[i] ? "A" : "B") << "\n";
    } else {
        cout << "No valid invitation exists\n";
    }
    return 0;
}

Complexity: Time O(n + m), Space O(n + m) where m = number of conflict constraints.


Problem 5: Maximum Groups That Can Communicate (SCC + Condensation)

Problem: In a directed communication network, a group of people can all communicate if they form an SCC (every person can reach every other). Some groups can send messages to other groups (cross-SCC edges). Find: (a) the number of groups (SCCs), and (b) the minimum number of edges to add so that everyone can communicate with everyone (i.e., make the entire graph one single SCC).

Key insight for part (b): After condensation, we have a DAG. To make a DAG strongly connected, we need to add max(sources, sinks) edges, where sources are nodes with in-degree 0 and sinks are nodes with out-degree 0 in the condensation DAG. (Special case: if there is only 1 SCC already, the answer is 0.)

Condensation DAG C0 source C1 source C2 C3 sink C4 sink Sources = 2 (C0, C1), Sinks = 2 (C3, C4) Edges to add = max(2, 2) = 2
#include <bits/stdc++.h>
using namespace std;

int main() {
    int V, E;
    cin >> V >> E;

    vector<vector<int>> adj(V), radj(V);
    for (int i = 0; i < E; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        radj[v].push_back(u);
    }

    // Kosaraju's SCC
    vector<bool> visited(V, false);
    vector<int> order;

    function<void(int)> dfs1 = [&](int v) {
        visited[v] = true;
        for (int u : adj[v])
            if (!visited[u]) dfs1(u);
        order.push_back(v);
    };

    for (int i = 0; i < V; i++)
        if (!visited[i]) dfs1(i);

    vector<int> comp(V, -1);
    int numSCCs = 0;

    function<void(int, int)> dfs2 = [&](int v, int c) {
        comp[v] = c;
        for (int u : radj[v])
            if (comp[u] == -1) dfs2(u, c);
    };

    for (int i = V - 1; i >= 0; i--)
        if (comp[order[i]] == -1)
            dfs2(order[i], numSCCs++);

    cout << "Number of SCCs (groups): " << numSCCs << "\n";

    if (numSCCs == 1) {
        cout << "Already fully connected. Edges to add: 0\n";
        return 0;
    }

    // Build condensation and count sources/sinks
    vector<int> indeg(numSCCs, 0), outdeg(numSCCs, 0);
    set<pair<int,int>> dagEdges;

    for (int v = 0; v < V; v++) {
        for (int u : adj[v]) {
            if (comp[v] != comp[u] && dagEdges.insert({comp[v], comp[u]}).second) {
                outdeg[comp[v]]++;
                indeg[comp[u]]++;
            }
        }
    }

    int sources = 0, sinks = 0;
    for (int i = 0; i < numSCCs; i++) {
        if (indeg[i] == 0) sources++;
        if (outdeg[i] == 0) sinks++;
    }

    int edgesToAdd = max(sources, sinks);
    cout << "Edges to add for full connectivity: " << edgesToAdd << "\n";

    return 0;
}

Why max(sources, sinks)? Each added edge can eliminate at most one source and one sink. We need to eliminate all sources and all sinks, so we need at least max(sources, sinks) edges. A constructive proof shows this is also sufficient: connect each sink to a source in a round-robin fashion.

Complexity: Time O(V + E), Space O(V + E).


Problem 6: Satisfiability of Equality Equations (LeetCode 990) via SCC Thinking

Problem: Given an array of equations like ["a==b","b!=c","c==a"], determine if all equations can be simultaneously satisfied.

While this is typically solved with Union-Find, the SCC perspective provides insight: equality constraints form undirected edges (symmetric implications), and inequality constraints add forbidden pairs. If a forbidden pair ends up in the same connected component (SCC in the undirected sense), the answer is false.

#include <bits/stdc++.h>
using namespace std;

class Solution {
public:
    bool equationsPossible(vector<string>& equations) {
        // Union-Find (DSU) approach, motivated by SCC thinking
        vector<int> parent(26);
        iota(parent.begin(), parent.end(), 0);

        function<int(int)> find = [&](int x) {
            return parent[x] == x ? x : parent[x] = find(parent[x]);
        };

        // First pass: process all equality constraints (== edges)
        for (auto& eq : equations) {
            if (eq[1] == '=') {
                int a = eq[0] - 'a', b = eq[3] - 'a';
                parent[find(a)] = find(b);
            }
        }

        // Second pass: check inequality constraints
        for (auto& eq : equations) {
            if (eq[1] == '!') {
                int a = eq[0] - 'a', b = eq[3] - 'a';
                if (find(a) == find(b))
                    return false;  // Contradiction: a and b forced equal but also unequal
            }
        }

        return true;
    }
};

int main() {
    vector<string> eq1 = {"a==b", "b!=c", "c==a"};
    vector<string> eq2 = {"a==b", "b==c", "a!=c"};

    Solution sol;
    cout << (sol.equationsPossible(eq1) ? "true" : "false") << "\n";  // false
    cout << (sol.equationsPossible(eq2) ? "true" : "false") << "\n";  // false
    return 0;
}

Connection to SCCs: In a directed implication graph, a == b would be modeled as a => b AND b => a, placing them in the same SCC. An inequality a != b says they must be in different SCCs. This is conceptually identical to how 2-SAT works.

Complexity: Time O(n * alpha(26)) which is effectively O(n), Space O(1).


Problem 7: Minimum Edges to Make All Nodes Reachable from a Single Source

Problem: Given a directed graph, find the minimum number of edges to add so that all nodes are reachable from node 0.

Approach: Find SCCs, build condensation DAG. Count the number of source nodes (in-degree 0) in the condensation DAG, excluding the SCC that contains node 0. The answer is the number of such sources (we need to add an edge from some reachable node to each unreachable source SCC).

#include <bits/stdc++.h>
using namespace std;

int main() {
    int V, E;
    cin >> V >> E;

    vector<vector<int>> adj(V), radj(V);
    for (int i = 0; i < E; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        radj[v].push_back(u);
    }

    // Kosaraju's SCC
    vector<bool> visited(V, false);
    vector<int> order;
    function<void(int)> dfs1 = [&](int v) {
        visited[v] = true;
        for (int u : adj[v])
            if (!visited[u]) dfs1(u);
        order.push_back(v);
    };
    for (int i = 0; i < V; i++)
        if (!visited[i]) dfs1(i);

    vector<int> comp(V, -1);
    int numSCCs = 0;
    function<void(int, int)> dfs2 = [&](int v, int c) {
        comp[v] = c;
        for (int u : radj[v])
            if (comp[u] == -1) dfs2(u, c);
    };
    for (int i = V - 1; i >= 0; i--)
        if (comp[order[i]] == -1)
            dfs2(order[i], numSCCs++);

    // Build condensation: find in-degree of each SCC
    vector<int> indeg(numSCCs, 0);
    set<pair<int,int>> dagEdges;
    for (int v = 0; v < V; v++)
        for (int u : adj[v])
            if (comp[v] != comp[u] && dagEdges.insert({comp[v], comp[u]}).second)
                indeg[comp[u]]++;

    // Count source SCCs excluding the one containing node 0
    int sourceSCC = comp[0];
    int answer = 0;
    for (int i = 0; i < numSCCs; i++)
        if (indeg[i] == 0 && i != sourceSCC)
            answer++;

    cout << "Minimum edges to add: " << answer << "\n";
    return 0;
}

Complexity: Time O(V + E), Space O(V + E).


Common Mistakes & Interview Tips

1. Forgetting That SCCs Only Apply to Directed Graphs

For undirected graphs, use connected components (BFS/DFS) or biconnected components. SCCs are a directed graph concept. If the interviewer gives you an undirected graph and asks about "strongly connected components," clarify -- they likely mean connected components or biconnected components.

2. Mixing Up disc[u] and low[u] in Tarjan's

Wrong:

// Using low[u] for back edges to nodes on stack
low[v] = min(low[v], low[u]);  // Works for SCC detection but gives wrong low values

Right (Tarjan's original):

if (onStack[u]) {
    low[v] = min(low[v], disc[u]);  // Use discovery time, not low-link
}

Both identify SCCs correctly, but using disc[u] is the standard formulation and is important if you later want to extend the algorithm for bridge/articulation point detection.

3. Not Tracking the On-Stack Flag in Tarjan's

Wrong:

if (disc[u] != -1) {
    low[v] = min(low[v], disc[u]);  // BUG: u might already be in a finished SCC
}

Right:

if (disc[u] != -1 && onStack[u]) {
    low[v] = min(low[v], disc[u]);  // Only consider nodes still on the stack
}

Without onStack, you might incorrectly link to a node that belongs to an already-finalized SCC, producing wrong results.

4. Stack Overflow on Large Graphs

Both Kosaraju's and Tarjan's use recursive DFS, which can overflow the stack for graphs with 100K+ nodes. Solutions:

// Option 1: Increase stack size (platform-specific)
// On Linux: ulimit -s unlimited

// Option 2: Convert to iterative DFS
void iterativeDFS(int start, vector<vector<int>>& adj, vector<int>& order) {
    stack<pair<int, int>> stk;  // (node, edge_index)
    visited[start] = true;
    stk.push({start, 0});

    while (!stk.empty()) {
        auto& [v, i] = stk.top();
        if (i < (int)adj[v].size()) {
            int u = adj[v][i++];
            if (!visited[u]) {
                visited[u] = true;
                stk.push({u, 0});
            }
        } else {
            order.push_back(v);
            stk.pop();
        }
    }
}

5. Confusing SCC Numbering Order

  • Kosaraju's assigns SCC ids in topological order of the condensation DAG (source SCCs get lower ids).
  • Tarjan's assigns SCC ids in reverse topological order (sink SCCs get lower ids -- because SCCs are identified bottom-up as DFS unwinds).

This matters when you build the condensation or use SCC ids for DP. Always verify which order your implementation uses.

6. Not Deduplicating Condensation Edges

When building the condensation DAG, the same cross-SCC edge can appear multiple times (from different vertex pairs in the original graph). Always deduplicate.

Wrong:

for (int v = 0; v < V; v++)
    for (int u : adj[v])
        if (comp[v] != comp[u])
            dagAdj[comp[v]].push_back(comp[u]);  // Duplicates!

Right:

set<pair<int,int>> dagEdges;
for (int v = 0; v < V; v++)
    for (int u : adj[v])
        if (comp[v] != comp[u])
            dagEdges.insert({comp[v], comp[u]});

for (auto [a, b] : dagEdges)
    dagAdj[a].push_back(b);

7. 2-SAT: Forgetting the Contrapositive

Every clause (a OR b) produces two implications: (!a => b) AND (!b => a). A common mistake is adding only one implication edge. Both are needed for correctness.

Interview Tips

  1. Start with the definition: Before coding, explain what an SCC is and why it matters for the problem. Interviewers want to see you understand the concept, not just memorize the algorithm.

  2. Choose Kosaraju's for interviews: It is easier to explain ("two DFS passes, second on transpose in reverse finish order") and less error-prone to implement under pressure.

  3. Know the condensation trick: Many hard directed graph problems become easy after condensation. If the problem involves "groups" or "reachability" in a directed graph, think SCCs.

  4. 2-SAT pattern recognition: If a problem says "choose one from each pair" with pairwise constraints, it is likely 2-SAT. Model each choice as a boolean variable and each constraint as a clause.

  5. Complexity is always O(V + E): Both algorithms are linear. If the interviewer asks about optimizations, the answer is usually "it is already optimal."


Summary Cheat Sheet

SCC Quick Reference Concept Key Point SCC Definition Maximal set where every pair is mutually reachable Kosaraju 2 DFS passes + transpose. O(V+E). Easy to code. Tarjan 1 DFS + low-link + stack. O(V+E). No transpose needed. Condensation Contract SCCs -> DAG. Enables topo sort + DP. 2-SAT Build implication graph, find SCCs. SAT iff x and !x in diff SCCs. Bridges Tarjan's low-link: bridge iff low[v] > disc[u] (undirected). Make DAG strongly connected Add max(sources, sinks) edges to condensation. Cycle detection Directed cycle exists iff any SCC has size > 1. Interview tip Use Kosaraju for clarity. Think condensation for hard problems. Time complexity Always O(V + E) for both algorithms and condensation.