Strongly Connected Components (Tarjan, Kosaraju, 2-SAT)
Table of Contents
SCCs appear in dependency resolution, circuit analysis, and 2-SAT solvers. This guide covers Kosaraju's and Tarjan's algorithms with deep intuition, the condensation graph, and 2-SAT.
What is a Strongly Connected Component?
A Strongly Connected Component of a directed graph is a maximal set of vertices such that there is a path from every vertex in the set to every other vertex in the set.
Key properties:
- Directed graphs only -- the concept is meaningless for undirected graphs (where connected components suffice).
- Maximal -- you cannot add any more vertices and still have every pair mutually reachable.
- Every vertex belongs to exactly one SCC (even isolated vertices form their own singleton SCC).
- A single vertex with no self-loop is trivially an SCC of size 1.
Example Graph
Consider this directed graph with 8 vertices:
SCCs identified:
- SCC 1: {0, 1, 2} -- cycle: 0 -> 1 -> 2 -> 0
- SCC 2: {3, 4} -- cycle: 3 -> 4 -> 3
- SCC 3: {5, 6, 7} -- cycle: 5 -> 6 -> 7 -> 5
Cross-SCC edges go from SCC 1 to SCC 2 (1->3), from SCC 1 to SCC 3 (2->5), and from SCC 2 to SCC 3 (4->6). These cross edges are one-directional -- if there were a path back, the SCCs would merge.
Why SCCs Matter
- Condensation: Collapsing each SCC into a single node produces a DAG. Any problem on a general directed graph can be decomposed into a DAG problem + within-SCC reasoning.
- Reachability: Within an SCC, every node can reach every other node. Across SCCs, reachability follows the condensation DAG.
- 2-SAT: The satisfiability of 2-SAT formulas is determined entirely by the SCC structure of the implication graph.
Kosaraju's Algorithm
Kosaraju's algorithm finds all SCCs using two DFS passes and the transpose graph. It is conceptually elegant and easy to implement.
The Core Idea
- First DFS: Run DFS on the original graph. Record the finish order (the order in which DFS completes a node -- i.e., after all descendants are processed).
- Reverse the graph: Build the transpose graph (reverse all edge directions).
- Second DFS: Process nodes in reverse finish order on the transpose graph. Each DFS tree in this pass is one SCC.
Why It Works -- Deep Intuition
This is the key insight most people miss. Let us build the reasoning step by step.
Observation 1: Finish times and reachability. In DFS, if node u can reach node v, and they are in different DFS trees or u is an ancestor of v, then u finishes after v. The finish time captures a "topological-ish" ordering.
Observation 2: The transpose preserves SCCs. If u and v are in the same SCC in graph G, they are also in the same SCC in the transpose G^T (just reverse all paths).
Observation 3: Cross-SCC edges flip direction. If SCC_A has an edge to SCC_B in G, then in G^T, SCC_B has an edge to SCC_A. The condensation DAG is reversed.
Observation 4: The second DFS peels off SCCs in topological order. The node with the latest finish time in the first DFS belongs to a "source SCC" in the condensation DAG (an SCC with no incoming cross-SCC edges). In the transpose graph, this SCC becomes a "sink" -- it has no outgoing cross-SCC edges. So when we DFS from this node in G^T, we can only reach nodes within this same SCC. We mark them, remove the SCC, and repeat with the next unvisited node in reverse finish order.
Summary of why Kosaraju works:
1. First DFS gives finish order. The last-finished node is in a "source SCC"
of the condensation DAG.
2. In the transpose graph, that source SCC becomes a sink SCC.
DFS from any node in a sink SCC cannot escape that SCC.
3. So second DFS (in reverse finish order, on transpose) correctly
identifies one SCC per DFS tree.
4. After removing that SCC, the next node in reverse finish order belongs
to the next source SCC in the remaining condensation DAG. Repeat.
Step-by-Step Walkthrough
Using the same 8-node example graph from above:
Edges: 0->1, 1->2, 2->0, 1->3, 3->4, 4->3, 2->5, 4->6, 5->6, 6->7, 7->5
Adjacency list:
0: [1]
1: [2, 3]
2: [0, 5]
3: [4]
4: [3, 6]
5: [6]
6: [7]
7: [5]
Pass 1: DFS on original graph (recording finish order)
Start DFS from node 0:
Visit 0 -> Visit 1 -> Visit 2 -> Visit 0 (already visited)
-> Visit 5 -> Visit 6 -> Visit 7 -> Visit 5 (visited)
Finish 7 (order pos 0)
Finish 6 (order pos 1)
Finish 5 (order pos 2)
Finish 2 (order pos 3)
-> Visit 3 -> Visit 4 -> Visit 3 (visited)
-> Visit 6 (visited)
Finish 4 (order pos 4)
Finish 3 (order pos 5)
Finish 1 (order pos 6)
Finish 0 (order pos 7)
Finish order (first to last): [7, 6, 5, 2, 4, 3, 1, 0]
Reverse finish order: [0, 1, 3, 4, 2, 5, 6, 7]
Build transpose graph:
Transpose adjacency list (reverse all edges):
0: [2]
1: [0]
2: [1]
3: [1, 4]
4: [3]
5: [2, 7]
6: [5, 4]
7: [6]
Pass 2: DFS on transpose in reverse finish order
Process node 0 (unvisited):
DFS: 0 -> 2 -> 1 -> 0 (visited)
SCC 1 = {0, 2, 1} = {0, 1, 2}
Process node 3 (unvisited):
DFS: 3 -> 4 -> 3 (visited)
(1 already visited)
SCC 2 = {3, 4}
Process node 5 (unvisited):
DFS: 5 -> 7 -> 6 -> 5 (visited)
(2 already visited, 4 already visited)
SCC 3 = {5, 7, 6} = {5, 6, 7}
Result: {0,1,2}, {3,4}, {5,6,7} -- Correct!
Implementation
#include <bits/stdc++.h>
using namespace std;
class KosarajuSCC {
int n;
vector<vector<int>> adj, radj;
vector<bool> visited;
vector<int> order; // finish order
vector<int> comp; // comp[v] = SCC id of vertex v
void dfs1(int v) {
visited[v] = true;
for (int u : adj[v]) {
if (!visited[u])
dfs1(u);
}
order.push_back(v); // record finish time
}
void dfs2(int v, int c) {
comp[v] = c;
for (int u : radj[v]) {
if (comp[u] == -1)
dfs2(u, c);
}
}
public:
// Returns number of SCCs. comp[v] gives the SCC id for vertex v.
// SCC ids are in reverse topological order of the condensation DAG:
// if SCC a has an edge to SCC b, then a's id < b's id.
int findSCCs(int numNodes, vector<vector<int>>& graph) {
n = numNodes;
adj = graph;
radj.assign(n, {});
visited.assign(n, false);
comp.assign(n, -1);
order.clear();
// Build transpose graph
for (int v = 0; v < n; v++)
for (int u : adj[v])
radj[u].push_back(v);
// Pass 1: DFS on original graph
for (int v = 0; v < n; v++)
if (!visited[v])
dfs1(v);
// Pass 2: DFS on transpose in reverse finish order
int numSCCs = 0;
for (int i = n - 1; i >= 0; i--) {
int v = order[i];
if (comp[v] == -1) {
dfs2(v, numSCCs);
numSCCs++;
}
}
return numSCCs;
}
vector<int>& getComponents() { return comp; }
// Get list of vertices in each SCC
vector<vector<int>> getSCCLists() {
int numSCCs = *max_element(comp.begin(), comp.end()) + 1;
vector<vector<int>> sccs(numSCCs);
for (int v = 0; v < n; v++)
sccs[comp[v]].push_back(v);
return sccs;
}
};
int main() {
int n = 8;
vector<vector<int>> adj(n);
adj[0] = {1};
adj[1] = {2, 3};
adj[2] = {0, 5};
adj[3] = {4};
adj[4] = {3, 6};
adj[5] = {6};
adj[6] = {7};
adj[7] = {5};
KosarajuSCC scc;
int numSCCs = scc.findSCCs(n, adj);
cout << "Number of SCCs: " << numSCCs << "\n";
auto sccs = scc.getSCCLists();
for (int i = 0; i < numSCCs; i++) {
cout << "SCC " << i << ": ";
for (int v : sccs[i]) cout << v << " ";
cout << "\n";
}
return 0;
}
Complexity
- Time: O(V + E) -- two DFS passes, each O(V + E), plus O(E) to build the transpose graph.
- Space: O(V + E) -- for the transpose graph, visited array, order stack, and component labels.
Tarjan's Algorithm
Tarjan's algorithm finds all SCCs in a single DFS pass using the concept of low-link values and an explicit stack. It is more subtle than Kosaraju's but more efficient in practice (one pass, no transpose graph).
Key Concepts
Discovery time (disc[v]): The order in which DFS first visits node v.
Low-link value (low[v]): The smallest discovery time reachable from v through DFS tree edges and at most one back edge. Formally, low[v] = min discovery time of any node reachable from the subtree rooted at v that is currently on the stack.
On-stack tracking (onStack[v]): Whether node v is currently on the DFS stack. This is critical -- we only consider back edges to nodes that are still on the stack (i.e., nodes whose SCC has not yet been finalized).
Why It Works -- Deep Intuition
Core insight: A node v is the root of an SCC if and only if low[v] == disc[v].
Why? If low[v] == disc[v], it means no node in v's subtree can reach any node discovered before v (that is still on the stack). This means v and all nodes above it on the stack (up to v) form a self-contained strongly connected component -- they can all reach each other through paths within this group, but none of them can reach "upward" past v in the DFS tree.
The algorithm invariant:
1. Maintain a stack of nodes in the current DFS path and unassigned SCCs.
2. When we finish processing node v:
- If low[v] == disc[v], then v is the root of an SCC.
Pop everything from the stack down to and including v.
These popped nodes form one SCC.
- If low[v] < disc[v], then v can reach an ancestor.
v is NOT the root. Leave it on the stack.
3. low[v] is updated by:
- Tree edges: low[v] = min(low[v], low[child])
- Back/cross edges to on-stack nodes: low[v] = min(low[v], disc[neighbor])
- Edges to already-assigned nodes (not on stack): ignored
Why ignore nodes not on the stack? If a neighbor w has already been popped (assigned to an SCC), then w's SCC is fully resolved. Taking w's discovery time would incorrectly suggest we can reach ancestors through w, but we cannot -- w's SCC is a separate, finalized component.
Step-by-Step Walkthrough
Using the same 8-node graph:
Edges: 0->1, 1->2, 2->0, 1->3, 3->4, 4->3, 2->5, 4->6, 5->6, 6->7, 7->5
Adjacency list:
0: [1]
1: [2, 3]
2: [0, 5]
3: [4]
4: [3, 6]
5: [6]
6: [7]
7: [5]
DFS starts at node 0. Timer = 0.
Visit 0: disc[0]=0, low[0]=0, stack=[0]
Visit 1: disc[1]=1, low[1]=1, stack=[0,1]
Visit 2: disc[2]=2, low[2]=2, stack=[0,1,2]
Edge 2->0: 0 is on stack, low[2] = min(2, disc[0]) = min(2,0) = 0
Visit 5: disc[5]=3, low[5]=3, stack=[0,1,2,5]
Visit 6: disc[6]=4, low[6]=4, stack=[0,1,2,5,6]
Visit 7: disc[7]=5, low[7]=5, stack=[0,1,2,5,6,7]
Edge 7->5: 5 is on stack, low[7] = min(5, disc[5]) = min(5,3) = 3
Back to 7: low[7]=3 != disc[7]=5, not root. Return.
Back to 6: low[6] = min(4, low[7]) = min(4,3) = 3
Back to 6: low[6]=3 != disc[6]=4, not root. Return.
Back to 5: low[5] = min(3, low[6]) = min(3,3) = 3
Back to 5: low[5]=3 == disc[5]=3 => ROOT! Pop until 5:
Pop 7 -> SCC
Pop 6 -> SCC
Pop 5 -> SCC
SCC = {5, 6, 7}
Stack now: [0,1,2]
Back to 2: low[2] = min(0, --) = 0 (low was already 0 from edge 2->0)
Back to 2: low[2]=0 != disc[2]=2, not root. Return.
Back to 1: low[1] = min(1, low[2]) = min(1,0) = 0
Visit 3: disc[3]=6, low[3]=6, stack=[0,1,3]
Visit 4: disc[4]=7, low[4]=7, stack=[0,1,3,4]
Edge 4->3: 3 is on stack, low[4] = min(7, disc[3]) = min(7,6) = 6
Edge 4->6: 6 is NOT on stack (already assigned), skip.
Back to 4: low[4]=6 != disc[4]=7, not root. Return.
Back to 3: low[3] = min(6, low[4]) = min(6,6) = 6
Back to 3: low[3]=6 == disc[3]=6 => ROOT! Pop until 3:
Pop 4 -> SCC
Pop 3 -> SCC
SCC = {3, 4}
Stack now: [0,1,2]
Back to 1: low[1] = min(0, --) = 0 (unchanged)
Back to 1: low[1]=0 != disc[1]=1, not root. Return.
Back to 0: low[0] = min(0, low[1]) = min(0,0) = 0
Back to 0: low[0]=0 == disc[0]=0 => ROOT! Pop until 0:
Pop 2 -> SCC
Pop 1 -> SCC
Pop 0 -> SCC
SCC = {0, 1, 2}
Final SCCs: {5,6,7}, {3,4}, {0,1,2} -- Correct!
Detailed stack trace:
Visit 0: disc=0 low=0 stack=[0]
Visit 1: disc=1 low=1 stack=[0,1]
Visit 2: disc=2 low=2 stack=[0,1,2]
2->0: on stack, low[2]=min(2,0)=0
Visit 5: disc=3 low=3 stack=[0,1,2,5]
Visit 6: disc=4 low=4 stack=[0,1,2,5,6]
Visit 7: disc=5 low=5 stack=[0,1,2,5,6,7]
7->5: on stack, low[7]=min(5,3)=3
Done 7: low=3 != disc=5, not root
Back at 6: low[6]=min(4,3)=3
Done 6: low=3 != disc=4, not root
Back at 5: low[5]=min(3,3)=3
Done 5: low=3 == disc=3 => ROOT!
Pop: 7, 6, 5 => SCC = {5, 6, 7}
stack=[0,1,2]
Back at 2: (5 already handled, low[2] stays 0)
Done 2: low=0 != disc=2, not root
Back at 1: low[1]=min(1,0)=0
Visit 3: disc=6 low=6 stack=[0,1,2,3]
Visit 4: disc=7 low=7 stack=[0,1,2,3,4]
4->3: on stack, low[4]=min(7,6)=6
4->6: NOT on stack (already in SCC), skip
Done 4: low=6 != disc=7, not root
Back at 3: low[3]=min(6,6)=6
Done 3: low=6 == disc=6 => ROOT!
Pop: 4, 3 => SCC = {3, 4}
stack=[0,1,2]
Back at 1: low[1] stays 0
Done 1: low=0 != disc=1, not root
Back at 0: low[0]=min(0,0)=0
Done 0: low=0 == disc=0 => ROOT!
Pop: 2, 1, 0 => SCC = {0, 1, 2}
stack=[]
Final SCCs: {5,6,7}, {3,4}, {0,1,2} -- Correct!
Implementation
#include <bits/stdc++.h>
using namespace std;
class TarjanSCC {
int n, timer, numSCCs;
vector<vector<int>> adj;
vector<int> disc, low, comp;
vector<bool> onStack;
stack<int> st;
void dfs(int v) {
disc[v] = low[v] = timer++;
st.push(v);
onStack[v] = true;
for (int u : adj[v]) {
if (disc[u] == -1) {
// Tree edge
dfs(u);
low[v] = min(low[v], low[u]);
} else if (onStack[u]) {
// Back edge or cross edge to node in current path
low[v] = min(low[v], disc[u]);
}
// If u is already assigned to an SCC (not on stack), ignore
}
// If v is root of an SCC
if (low[v] == disc[v]) {
while (true) {
int u = st.top();
st.pop();
onStack[u] = false;
comp[u] = numSCCs;
if (u == v) break;
}
numSCCs++;
}
}
public:
// Returns number of SCCs. comp[v] gives SCC id of vertex v.
// SCC ids are in reverse topological order of the condensation DAG.
int findSCCs(int numNodes, vector<vector<int>>& graph) {
n = numNodes;
adj = graph;
timer = 0;
numSCCs = 0;
disc.assign(n, -1);
low.assign(n, -1);
comp.assign(n, -1);
onStack.assign(n, false);
for (int v = 0; v < n; v++)
if (disc[v] == -1)
dfs(v);
return numSCCs;
}
vector<int>& getComponents() { return comp; }
vector<vector<int>> getSCCLists() {
vector<vector<int>> sccs(numSCCs);
for (int v = 0; v < n; v++)
sccs[comp[v]].push_back(v);
return sccs;
}
};
int main() {
int n = 8;
vector<vector<int>> adj(n);
adj[0] = {1};
adj[1] = {2, 3};
adj[2] = {0, 5};
adj[3] = {4};
adj[4] = {3, 6};
adj[5] = {6};
adj[6] = {7};
adj[7] = {5};
TarjanSCC scc;
int numSCCs = scc.findSCCs(n, adj);
cout << "Number of SCCs: " << numSCCs << "\n";
auto sccs = scc.getSCCLists();
for (int i = 0; i < numSCCs; i++) {
cout << "SCC " << i << ": ";
for (int v : sccs[i]) cout << v << " ";
cout << "\n";
}
return 0;
}
Complexity
- Time: O(V + E) -- single DFS pass, each edge and vertex processed once.
- Space: O(V) -- for disc, low, comp arrays, on-stack flags, and the explicit stack.
Subtle Point: disc[u] vs low[u] in Back Edges
When we encounter a back edge to an on-stack node u, we write:
low[v] = min(low[v], disc[u]); // Tarjan's original formulation
Some implementations use low[u] instead:
low[v] = min(low[v], low[u]); // Also correct for finding SCCs
Both correctly identify SCC roots, but disc[u] is the standard Tarjan's formulation. Using low[u] can give different low-link values but still yields the same SCCs because the root condition low[v] == disc[v] works either way.
Comparison: Kosaraju vs Tarjan
When to Use Which
- Kosaraju's: When you want simplicity and clarity, or when you also need the transpose graph for other purposes. Great for interviews when you need to explain your approach clearly.
- Tarjan's: When memory matters (no transpose graph), when you need a single-pass solution, or when you also want to find bridges/articulation points. Preferred in competitive programming.
- Both give SCCs in (reverse) topological order of the condensation DAG, which is useful for building the condensation.
Condensation Graph
The condensation of a directed graph is the DAG formed by contracting each SCC into a single node. Cross-SCC edges become edges in the condensation.
Why It Matters
The condensation transforms any directed graph into a DAG. This unlocks all DAG techniques:
- Topological sort
- DP on DAGs (longest path, counting paths, etc.)
- Reachability queries
Visual: From Graph to Condensation
Building the Condensation
#include <bits/stdc++.h>
using namespace std;
// After running Kosaraju or Tarjan to get comp[v] for each vertex
// and numSCCs total components:
struct Condensation {
int numNodes; // number of super-nodes (SCCs)
vector<vector<int>> adj; // adjacency list of the DAG
vector<int> sccSize; // number of vertices in each SCC
vector<vector<int>> sccMembers; // vertices in each SCC
void build(int n, vector<vector<int>>& originalAdj,
vector<int>& comp, int numSCCs) {
numNodes = numSCCs;
adj.assign(numSCCs, {});
sccSize.assign(numSCCs, 0);
sccMembers.assign(numSCCs, {});
// Record SCC membership
for (int v = 0; v < n; v++) {
sccSize[comp[v]]++;
sccMembers[comp[v]].push_back(v);
}
// Build DAG edges (deduplicate with a set)
set<pair<int,int>> dagEdges;
for (int v = 0; v < n; v++) {
for (int u : originalAdj[v]) {
if (comp[v] != comp[u]) {
dagEdges.insert({comp[v], comp[u]});
}
}
}
for (auto [a, b] : dagEdges) {
adj[a].push_back(b);
}
}
};
Application: DP on Condensation DAG
A common pattern: assign values to nodes, collapse SCCs (all nodes in an SCC share their combined value), then run DP on the DAG.
Example: Find the maximum sum of node values you can collect starting from any node and following directed edges.
// After building condensation with sccSize[] and DAG adj[]:
// Suppose each original node has value val[v].
// sccVal[c] = sum of val[v] for all v in SCC c.
vector<long long> sccVal(numSCCs, 0);
for (int v = 0; v < n; v++)
sccVal[comp[v]] += val[v];
// Topological sort the condensation DAG
vector<int> indeg(numSCCs, 0);
for (int c = 0; c < numSCCs; c++)
for (int d : dagAdj[c])
indeg[d]++;
queue<int> q;
for (int c = 0; c < numSCCs; c++)
if (indeg[c] == 0)
q.push(c);
vector<long long> dp(numSCCs);
for (int c = 0; c < numSCCs; c++)
dp[c] = sccVal[c];
// Process in topological order
vector<int> topoOrder;
while (!q.empty()) {
int c = q.front(); q.pop();
topoOrder.push_back(c);
for (int d : dagAdj[c]) {
dp[d] = max(dp[d], dp[c] + sccVal[d]);
if (--indeg[d] == 0)
q.push(d);
}
}
long long answer = *max_element(dp.begin(), dp.end());
Complexity
- Building condensation: O(V + E)
- DP on condensation: O(V_scc + E_scc) where V_scc and E_scc are vertices and edges of the condensation DAG.
2-SAT (Boolean Satisfiability)
2-SAT is a special case of the Boolean satisfiability problem where each clause has exactly 2 literals. It can be solved in polynomial time using SCCs -- a beautiful connection between logic and graph theory.
The Problem
Given n boolean variables x_1, x_2, ..., x_n and m clauses, each of the form (a OR b) where a, b are literals (x_i or NOT x_i), determine if there is an assignment of true/false to all variables that satisfies all clauses simultaneously.
The Implication Graph
Every clause (a OR b) is equivalent to two implications:
(a OR b) is equivalent to (NOT a => b) AND (NOT b => a)
Examples:
(x1 OR x2) => (!x1 => x2) AND (!x2 => x1)
(x1 OR !x3) => (!x1 => !x3) AND (x3 => x1)
(!x2 OR !x3) => (x2 => !x3) AND (x3 => !x2)
We build a directed graph (the implication graph) with 2n nodes: for each variable x_i, we have nodes x_i (true) and !x_i (false). For each clause, we add the two implication edges.
The SCC Connection
Theorem: A 2-SAT formula is satisfiable if and only if no variable x_i and its negation !x_i are in the same SCC.
Why? If x_i and !x_i are in the same SCC, there is a path from x_i to !x_i AND from !x_i to x_i. This means x_i => !x_i and !x_i => x_i, which is a contradiction (x_i must be both true and false).
Extracting the Assignment
Once we know the formula is satisfiable, we extract the assignment using the topological order of the condensation:
Rule: For each variable x_i, if comp[x_i] > comp[!x_i] (in Kosaraju's topological numbering, where higher id = later in topological order), set x_i = true. Otherwise, set x_i = false. (If using Tarjan's, the comparison is reversed since Tarjan assigns ids in reverse topological order.)
Intuition: In the condensation DAG, implications flow forward. If !x_i's SCC comes before x_i's SCC in topological order, then the chain of implications leads from "false" to "true", meaning we should assign x_i = true (the implications push us toward truth).
Complete 2-SAT Solver
#include <bits/stdc++.h>
using namespace std;
class TwoSAT {
int n; // number of variables (0-indexed)
vector<vector<int>> adj, radj;
vector<int> order, comp;
vector<bool> visited;
void dfs1(int v) {
visited[v] = true;
for (int u : adj[v])
if (!visited[u])
dfs1(u);
order.push_back(v);
}
void dfs2(int v, int c) {
comp[v] = c;
for (int u : radj[v])
if (comp[u] == -1)
dfs2(u, c);
}
public:
// n = number of boolean variables (0-indexed: x_0, x_1, ..., x_{n-1})
// Node encoding: variable x_i -> node 2*i (true), node 2*i+1 (false/negation)
TwoSAT(int n) : n(n), adj(2 * n), radj(2 * n) {}
// Helper: given a literal, return its graph node index.
// Literal i (positive) -> 2*i, literal ~i (negative) -> 2*i+1
// In practice, use addClause with the raw variable index and a sign.
// Add clause (a OR b).
// a is encoded as: variable index + sign. Pass (var, true) for x_var,
// (var, false) for !x_var.
void addClause(int u, bool nu, int v, bool nv) {
// (u=nu OR v=nv)
// !nu => nv AND !nv => nu
// node for (u, nu): 2*u + (nu ? 0 : 1)
// node for (u, !nu): 2*u + (nu ? 1 : 0)
int a = 2 * u + (nu ? 0 : 1); // node for literal u=nu
int na = 2 * u + (nu ? 1 : 0); // node for literal u=!nu
int b = 2 * v + (nv ? 0 : 1); // node for literal v=nv
int nb = 2 * v + (nv ? 1 : 0); // node for literal v=!nv
// Implication: !a => b (if a is false, b must be true)
adj[na].push_back(b);
radj[b].push_back(na);
// Implication: !b => a (if b is false, a must be true)
adj[nb].push_back(a);
radj[a].push_back(nb);
}
// Add constraint: variable u must be val (true/false).
// This is equivalent to the clause (u=val OR u=val).
void setTrue(int u, bool val) {
addClause(u, val, u, val);
}
// Solve: returns true if satisfiable, fills 'result' with assignment.
bool solve(vector<bool>& result) {
int N = 2 * n;
visited.assign(N, false);
order.clear();
comp.assign(N, -1);
// Kosaraju pass 1
for (int i = 0; i < N; i++)
if (!visited[i])
dfs1(i);
// Kosaraju pass 2
int numSCCs = 0;
for (int i = N - 1; i >= 0; i--) {
int v = order[i];
if (comp[v] == -1)
dfs2(v, numSCCs++);
}
// Check satisfiability
result.assign(n, false);
for (int i = 0; i < n; i++) {
if (comp[2 * i] == comp[2 * i + 1])
return false; // x_i and !x_i in same SCC => contradiction
// In Kosaraju, earlier SCC id = earlier in topological order.
// If comp[2i] > comp[2i+1], then x_i's SCC comes later
// (topologically after !x_i), so x_i = true.
result[i] = comp[2 * i] > comp[2 * i + 1];
}
return true;
}
};
int main() {
// Example: (x0 OR x1) AND (x0 OR !x2) AND (!x1 OR !x2)
// Variables: x0, x1, x2
TwoSAT sat(3);
sat.addClause(0, true, 1, true); // x0 OR x1
sat.addClause(0, true, 2, false); // x0 OR !x2
sat.addClause(1, false, 2, false); // !x1 OR !x2
vector<bool> result;
if (sat.solve(result)) {
cout << "Satisfiable!\n";
for (int i = 0; i < 3; i++)
cout << "x" << i << " = " << (result[i] ? "true" : "false") << "\n";
} else {
cout << "Unsatisfiable\n";
}
// Output: x0 = true, x1 = true, x2 = false (one valid assignment)
return 0;
}
Walkthrough Example
Formula: (x0 OR x1) AND (x0 OR !x2) AND (!x1 OR !x2)
Implications:
Clause 1: !x0 => x1, !x1 => x0
Clause 2: !x0 => !x2, x2 => x0
Clause 3: x1 => !x2, x2 => !x1
Implication graph (6 nodes: x0, !x0, x1, !x1, x2, !x2):
!x0 -> x1, !x2
!x1 -> x0
x1 -> !x2
x2 -> x0, !x1
!x2 -> (no outgoing)
x0 -> (no outgoing)
SCCs: Each node is its own SCC (no cycles).
Check: x0 and !x0 in different SCCs? Yes.
x1 and !x1 in different SCCs? Yes.
x2 and !x2 in different SCCs? Yes.
=> Satisfiable!
Assignment (using topological order):
x0: comp[x0] > comp[!x0]? x0 is a sink (later topo order) => true
x1: comp[x1] > comp[!x1]? x1 reachable from !x0 => true
x2: comp[x2] > comp[!x2]? !x2 is the sink => x2 = false
Result: x0=true, x1=true, x2=false
Verify: (T OR T)=T, (T OR T)=T, (F OR T)=T => All satisfied!
Complexity
- Time: O(n + m) where n = number of variables, m = number of clauses. The implication graph has 2n nodes and 2m edges.
- Space: O(n + m)
Classic Problems with Full Solutions
Problem 1: Number of Strongly Connected Components
Problem: Given a directed graph with V vertices and E edges, find the number of SCCs.
This is the direct application of Kosaraju's or Tarjan's algorithm.
#include <bits/stdc++.h>
using namespace std;
int main() {
int V, E;
cin >> V >> E;
vector<vector<int>> adj(V), radj(V);
for (int i = 0; i < E; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
radj[v].push_back(u);
}
// Kosaraju's algorithm
vector<bool> visited(V, false);
vector<int> order;
// Pass 1: DFS on original graph
function<void(int)> dfs1 = [&](int v) {
visited[v] = true;
for (int u : adj[v])
if (!visited[u])
dfs1(u);
order.push_back(v);
};
for (int i = 0; i < V; i++)
if (!visited[i])
dfs1(i);
// Pass 2: DFS on transpose in reverse finish order
vector<int> comp(V, -1);
int numSCCs = 0;
function<void(int, int)> dfs2 = [&](int v, int c) {
comp[v] = c;
for (int u : radj[v])
if (comp[u] == -1)
dfs2(u, c);
};
for (int i = V - 1; i >= 0; i--) {
if (comp[order[i]] == -1) {
dfs2(order[i], numSCCs);
numSCCs++;
}
}
cout << numSCCs << "\n";
return 0;
}
Complexity: Time O(V + E), Space O(V + E).
Problem 2: Critical Connections (Bridges) in a Graph (LeetCode 1192)
Problem: Given an undirected graph, find all bridges -- edges whose removal disconnects the graph.
While this problem is about undirected graphs and bridges (not SCCs directly), it uses the same low-link technique from Tarjan's algorithm. A bridge is an edge (u, v) where low[v] > disc[u] -- meaning v cannot reach u or any of u's ancestors without using the edge (u, v).
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
vector<vector<int>> criticalConnections(int n, vector<vector<int>>& connections) {
vector<vector<int>> adj(n);
for (auto& e : connections) {
adj[e[0]].push_back(e[1]);
adj[e[1]].push_back(e[0]);
}
vector<int> disc(n, -1), low(n, -1);
vector<vector<int>> bridges;
int timer = 0;
function<void(int, int)> dfs = [&](int u, int parent) {
disc[u] = low[u] = timer++;
for (int v : adj[u]) {
if (v == parent) continue; // skip the edge we came from
if (disc[v] == -1) {
// Tree edge
dfs(v, u);
low[u] = min(low[u], low[v]);
// Bridge condition: v cannot reach u or above
if (low[v] > disc[u]) {
bridges.push_back({u, v});
}
} else {
// Back edge
low[u] = min(low[u], disc[v]);
}
}
};
for (int i = 0; i < n; i++)
if (disc[i] == -1)
dfs(i, -1);
return bridges;
}
};
int main() {
int n = 4;
vector<vector<int>> connections = {{0,1},{1,2},{2,0},{1,3}};
Solution sol;
auto bridges = sol.criticalConnections(n, connections);
for (auto& b : bridges)
cout << b[0] << " - " << b[1] << "\n";
// Output: 1 - 3 (removing edge 1-3 disconnects node 3)
return 0;
}
Why this is related to SCCs: In a directed graph, bridges (in the undirected sense) are edges that, if removed, increase the number of connected components. In the directed version, Tarjan's low-link values detect exactly when a subtree has no back edge reaching above -- the same principle underlying both bridge detection and SCC detection.
Complexity: Time O(V + E), Space O(V + E).
Problem 3: Course Schedule II / Detect Cycle in Directed Graph (LeetCode 210)
Problem: Given numCourses and prerequisite pairs, return an ordering of courses you should take to finish all courses. If impossible (cycle exists), return empty.
A cycle in a directed graph means at least one SCC has size > 1. If all SCCs are singletons, the graph is a DAG and a topological order exists.
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
vector<int> findOrder(int numCourses, vector<vector<int>>& prerequisites) {
int n = numCourses;
vector<vector<int>> adj(n), radj(n);
for (auto& p : prerequisites) {
// p[0] depends on p[1]: edge from p[1] to p[0]
adj[p[1]].push_back(p[0]);
radj[p[0]].push_back(p[1]);
}
// Kosaraju to find SCCs
vector<bool> visited(n, false);
vector<int> order;
function<void(int)> dfs1 = [&](int v) {
visited[v] = true;
for (int u : adj[v])
if (!visited[u])
dfs1(u);
order.push_back(v);
};
for (int i = 0; i < n; i++)
if (!visited[i])
dfs1(i);
vector<int> comp(n, -1);
vector<int> sccSize;
int numSCCs = 0;
function<void(int, int)> dfs2 = [&](int v, int c) {
comp[v] = c;
for (int u : radj[v])
if (comp[u] == -1)
dfs2(u, c);
};
for (int i = n - 1; i >= 0; i--) {
if (comp[order[i]] == -1) {
dfs2(order[i], numSCCs);
numSCCs++;
}
}
// Check for cycles: any SCC with size > 1 means a cycle
vector<int> sizes(numSCCs, 0);
for (int i = 0; i < n; i++)
sizes[comp[i]]++;
for (int s : sizes)
if (s > 1)
return {}; // Cycle detected
// Also check for self-loops
for (int v = 0; v < n; v++)
for (int u : adj[v])
if (u == v)
return {};
// No cycle => topological order is the reverse finish order from pass 1
vector<int> result(order.rbegin(), order.rend());
return result;
}
};
int main() {
int numCourses = 4;
vector<vector<int>> prereqs = {{1,0},{2,0},{3,1},{3,2}};
Solution sol;
auto order = sol.findOrder(numCourses, prereqs);
if (order.empty()) {
cout << "Impossible (cycle detected)\n";
} else {
for (int c : order) cout << c << " ";
cout << "\n";
}
// Output: 0 1 2 3 (or 0 2 1 3)
return 0;
}
Note: For this specific problem, Kahn's algorithm (BFS topological sort) is simpler. But the SCC approach generalizes -- it tells you exactly which courses form cycles, which is useful for debugging and follow-up questions.
Complexity: Time O(V + E), Space O(V + E).
Problem 4: 2-SAT Application -- Party Invitation
Problem: You are organizing a party with n pairs of people. From each pair, you must invite exactly one person. Some people have conflicts: certain pairs of people cannot both attend. Determine if a valid invitation list exists, and if so, find one.
This is a classic 2-SAT problem. For each pair i, let x_i = true mean "invite person A from pair i" and x_i = false mean "invite person B from pair i". A conflict between person A of pair i and person B of pair j translates to the clause: "NOT(x_i AND !x_j)" which is "(!x_i OR x_j)".
#include <bits/stdc++.h>
using namespace std;
class TwoSAT {
int n;
vector<vector<int>> adj, radj;
vector<int> order, comp;
vector<bool> visited;
void dfs1(int v) {
visited[v] = true;
for (int u : adj[v])
if (!visited[u]) dfs1(u);
order.push_back(v);
}
void dfs2(int v, int c) {
comp[v] = c;
for (int u : radj[v])
if (comp[u] == -1) dfs2(u, c);
}
public:
TwoSAT(int n) : n(n), adj(2 * n), radj(2 * n) {}
void addClause(int u, bool nu, int v, bool nv) {
int a = 2 * u + (nu ? 0 : 1);
int na = 2 * u + (nu ? 1 : 0);
int b = 2 * v + (nv ? 0 : 1);
int nb = 2 * v + (nv ? 1 : 0);
adj[na].push_back(b);
radj[b].push_back(na);
adj[nb].push_back(a);
radj[a].push_back(nb);
}
bool solve(vector<bool>& result) {
int N = 2 * n;
visited.assign(N, false);
order.clear();
comp.assign(N, -1);
for (int i = 0; i < N; i++)
if (!visited[i]) dfs1(i);
int numSCCs = 0;
for (int i = N - 1; i >= 0; i--)
if (comp[order[i]] == -1) dfs2(order[i], numSCCs++);
result.assign(n, false);
for (int i = 0; i < n; i++) {
if (comp[2 * i] == comp[2 * i + 1]) return false;
result[i] = comp[2 * i] > comp[2 * i + 1];
}
return true;
}
};
int main() {
// 3 pairs: (A0,B0), (A1,B1), (A2,B2)
// Conflicts: A0 and B1 cannot both attend, A1 and A2 cannot both attend
int numPairs = 3;
TwoSAT sat(numPairs);
// Conflict: A0 and B1 can't both come
// A0 = x0=true, B1 = x1=false
// Clause: NOT(x0 AND !x1) = (!x0 OR x1)
sat.addClause(0, false, 1, true); // !x0 OR x1
// Conflict: A1 and A2 can't both come
// A1 = x1=true, A2 = x2=true
// Clause: NOT(x1 AND x2) = (!x1 OR !x2)
sat.addClause(1, false, 2, false); // !x1 OR !x2
vector<bool> result;
if (sat.solve(result)) {
cout << "Valid invitation:\n";
for (int i = 0; i < numPairs; i++)
cout << "Pair " << i << ": invite person "
<< (result[i] ? "A" : "B") << "\n";
} else {
cout << "No valid invitation exists\n";
}
return 0;
}
Complexity: Time O(n + m), Space O(n + m) where m = number of conflict constraints.
Problem 5: Maximum Groups That Can Communicate (SCC + Condensation)
Problem: In a directed communication network, a group of people can all communicate if they form an SCC (every person can reach every other). Some groups can send messages to other groups (cross-SCC edges). Find: (a) the number of groups (SCCs), and (b) the minimum number of edges to add so that everyone can communicate with everyone (i.e., make the entire graph one single SCC).
Key insight for part (b): After condensation, we have a DAG. To make a DAG strongly connected, we need to add max(sources, sinks) edges, where sources are nodes with in-degree 0 and sinks are nodes with out-degree 0 in the condensation DAG. (Special case: if there is only 1 SCC already, the answer is 0.)
#include <bits/stdc++.h>
using namespace std;
int main() {
int V, E;
cin >> V >> E;
vector<vector<int>> adj(V), radj(V);
for (int i = 0; i < E; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
radj[v].push_back(u);
}
// Kosaraju's SCC
vector<bool> visited(V, false);
vector<int> order;
function<void(int)> dfs1 = [&](int v) {
visited[v] = true;
for (int u : adj[v])
if (!visited[u]) dfs1(u);
order.push_back(v);
};
for (int i = 0; i < V; i++)
if (!visited[i]) dfs1(i);
vector<int> comp(V, -1);
int numSCCs = 0;
function<void(int, int)> dfs2 = [&](int v, int c) {
comp[v] = c;
for (int u : radj[v])
if (comp[u] == -1) dfs2(u, c);
};
for (int i = V - 1; i >= 0; i--)
if (comp[order[i]] == -1)
dfs2(order[i], numSCCs++);
cout << "Number of SCCs (groups): " << numSCCs << "\n";
if (numSCCs == 1) {
cout << "Already fully connected. Edges to add: 0\n";
return 0;
}
// Build condensation and count sources/sinks
vector<int> indeg(numSCCs, 0), outdeg(numSCCs, 0);
set<pair<int,int>> dagEdges;
for (int v = 0; v < V; v++) {
for (int u : adj[v]) {
if (comp[v] != comp[u] && dagEdges.insert({comp[v], comp[u]}).second) {
outdeg[comp[v]]++;
indeg[comp[u]]++;
}
}
}
int sources = 0, sinks = 0;
for (int i = 0; i < numSCCs; i++) {
if (indeg[i] == 0) sources++;
if (outdeg[i] == 0) sinks++;
}
int edgesToAdd = max(sources, sinks);
cout << "Edges to add for full connectivity: " << edgesToAdd << "\n";
return 0;
}
Why max(sources, sinks)? Each added edge can eliminate at most one source and one sink. We need to eliminate all sources and all sinks, so we need at least max(sources, sinks) edges. A constructive proof shows this is also sufficient: connect each sink to a source in a round-robin fashion.
Complexity: Time O(V + E), Space O(V + E).
Problem 6: Satisfiability of Equality Equations (LeetCode 990) via SCC Thinking
Problem: Given an array of equations like ["a==b","b!=c","c==a"], determine if all equations can be simultaneously satisfied.
While this is typically solved with Union-Find, the SCC perspective provides insight: equality constraints form undirected edges (symmetric implications), and inequality constraints add forbidden pairs. If a forbidden pair ends up in the same connected component (SCC in the undirected sense), the answer is false.
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
bool equationsPossible(vector<string>& equations) {
// Union-Find (DSU) approach, motivated by SCC thinking
vector<int> parent(26);
iota(parent.begin(), parent.end(), 0);
function<int(int)> find = [&](int x) {
return parent[x] == x ? x : parent[x] = find(parent[x]);
};
// First pass: process all equality constraints (== edges)
for (auto& eq : equations) {
if (eq[1] == '=') {
int a = eq[0] - 'a', b = eq[3] - 'a';
parent[find(a)] = find(b);
}
}
// Second pass: check inequality constraints
for (auto& eq : equations) {
if (eq[1] == '!') {
int a = eq[0] - 'a', b = eq[3] - 'a';
if (find(a) == find(b))
return false; // Contradiction: a and b forced equal but also unequal
}
}
return true;
}
};
int main() {
vector<string> eq1 = {"a==b", "b!=c", "c==a"};
vector<string> eq2 = {"a==b", "b==c", "a!=c"};
Solution sol;
cout << (sol.equationsPossible(eq1) ? "true" : "false") << "\n"; // false
cout << (sol.equationsPossible(eq2) ? "true" : "false") << "\n"; // false
return 0;
}
Connection to SCCs: In a directed implication graph, a == b would be modeled as a => b AND b => a, placing them in the same SCC. An inequality a != b says they must be in different SCCs. This is conceptually identical to how 2-SAT works.
Complexity: Time O(n * alpha(26)) which is effectively O(n), Space O(1).
Problem 7: Minimum Edges to Make All Nodes Reachable from a Single Source
Problem: Given a directed graph, find the minimum number of edges to add so that all nodes are reachable from node 0.
Approach: Find SCCs, build condensation DAG. Count the number of source nodes (in-degree 0) in the condensation DAG, excluding the SCC that contains node 0. The answer is the number of such sources (we need to add an edge from some reachable node to each unreachable source SCC).
#include <bits/stdc++.h>
using namespace std;
int main() {
int V, E;
cin >> V >> E;
vector<vector<int>> adj(V), radj(V);
for (int i = 0; i < E; i++) {
int u, v;
cin >> u >> v;
adj[u].push_back(v);
radj[v].push_back(u);
}
// Kosaraju's SCC
vector<bool> visited(V, false);
vector<int> order;
function<void(int)> dfs1 = [&](int v) {
visited[v] = true;
for (int u : adj[v])
if (!visited[u]) dfs1(u);
order.push_back(v);
};
for (int i = 0; i < V; i++)
if (!visited[i]) dfs1(i);
vector<int> comp(V, -1);
int numSCCs = 0;
function<void(int, int)> dfs2 = [&](int v, int c) {
comp[v] = c;
for (int u : radj[v])
if (comp[u] == -1) dfs2(u, c);
};
for (int i = V - 1; i >= 0; i--)
if (comp[order[i]] == -1)
dfs2(order[i], numSCCs++);
// Build condensation: find in-degree of each SCC
vector<int> indeg(numSCCs, 0);
set<pair<int,int>> dagEdges;
for (int v = 0; v < V; v++)
for (int u : adj[v])
if (comp[v] != comp[u] && dagEdges.insert({comp[v], comp[u]}).second)
indeg[comp[u]]++;
// Count source SCCs excluding the one containing node 0
int sourceSCC = comp[0];
int answer = 0;
for (int i = 0; i < numSCCs; i++)
if (indeg[i] == 0 && i != sourceSCC)
answer++;
cout << "Minimum edges to add: " << answer << "\n";
return 0;
}
Complexity: Time O(V + E), Space O(V + E).
Common Mistakes & Interview Tips
1. Forgetting That SCCs Only Apply to Directed Graphs
For undirected graphs, use connected components (BFS/DFS) or biconnected components. SCCs are a directed graph concept. If the interviewer gives you an undirected graph and asks about "strongly connected components," clarify -- they likely mean connected components or biconnected components.
2. Mixing Up disc[u] and low[u] in Tarjan's
Wrong:
// Using low[u] for back edges to nodes on stack
low[v] = min(low[v], low[u]); // Works for SCC detection but gives wrong low values
Right (Tarjan's original):
if (onStack[u]) {
low[v] = min(low[v], disc[u]); // Use discovery time, not low-link
}
Both identify SCCs correctly, but using disc[u] is the standard formulation and is important if you later want to extend the algorithm for bridge/articulation point detection.
3. Not Tracking the On-Stack Flag in Tarjan's
Wrong:
if (disc[u] != -1) {
low[v] = min(low[v], disc[u]); // BUG: u might already be in a finished SCC
}
Right:
if (disc[u] != -1 && onStack[u]) {
low[v] = min(low[v], disc[u]); // Only consider nodes still on the stack
}
Without onStack, you might incorrectly link to a node that belongs to an already-finalized SCC, producing wrong results.
4. Stack Overflow on Large Graphs
Both Kosaraju's and Tarjan's use recursive DFS, which can overflow the stack for graphs with 100K+ nodes. Solutions:
// Option 1: Increase stack size (platform-specific)
// On Linux: ulimit -s unlimited
// Option 2: Convert to iterative DFS
void iterativeDFS(int start, vector<vector<int>>& adj, vector<int>& order) {
stack<pair<int, int>> stk; // (node, edge_index)
visited[start] = true;
stk.push({start, 0});
while (!stk.empty()) {
auto& [v, i] = stk.top();
if (i < (int)adj[v].size()) {
int u = adj[v][i++];
if (!visited[u]) {
visited[u] = true;
stk.push({u, 0});
}
} else {
order.push_back(v);
stk.pop();
}
}
}
5. Confusing SCC Numbering Order
- Kosaraju's assigns SCC ids in topological order of the condensation DAG (source SCCs get lower ids).
- Tarjan's assigns SCC ids in reverse topological order (sink SCCs get lower ids -- because SCCs are identified bottom-up as DFS unwinds).
This matters when you build the condensation or use SCC ids for DP. Always verify which order your implementation uses.
6. Not Deduplicating Condensation Edges
When building the condensation DAG, the same cross-SCC edge can appear multiple times (from different vertex pairs in the original graph). Always deduplicate.
Wrong:
for (int v = 0; v < V; v++)
for (int u : adj[v])
if (comp[v] != comp[u])
dagAdj[comp[v]].push_back(comp[u]); // Duplicates!
Right:
set<pair<int,int>> dagEdges;
for (int v = 0; v < V; v++)
for (int u : adj[v])
if (comp[v] != comp[u])
dagEdges.insert({comp[v], comp[u]});
for (auto [a, b] : dagEdges)
dagAdj[a].push_back(b);
7. 2-SAT: Forgetting the Contrapositive
Every clause (a OR b) produces two implications: (!a => b) AND (!b => a). A common mistake is adding only one implication edge. Both are needed for correctness.
Interview Tips
-
Start with the definition: Before coding, explain what an SCC is and why it matters for the problem. Interviewers want to see you understand the concept, not just memorize the algorithm.
-
Choose Kosaraju's for interviews: It is easier to explain ("two DFS passes, second on transpose in reverse finish order") and less error-prone to implement under pressure.
-
Know the condensation trick: Many hard directed graph problems become easy after condensation. If the problem involves "groups" or "reachability" in a directed graph, think SCCs.
-
2-SAT pattern recognition: If a problem says "choose one from each pair" with pairwise constraints, it is likely 2-SAT. Model each choice as a boolean variable and each constraint as a clause.
-
Complexity is always O(V + E): Both algorithms are linear. If the interviewer asks about optimizations, the answer is usually "it is already optimal."