Theory & Mathematics

NP-Complete Problems (TSP, Knapsack, etc.)

NP-Complete Problems

Interview Significance: Recognizing NP-complete problems is critical. If you spend 30 minutes trying to find a polynomial algorithm for a problem that is NP-complete, you have wasted your interview. Instead, you should immediately pivot to approximation algorithms, exponential solutions with pruning, or pseudo-polynomial DP.


Complexity Classes — The Big Picture

P (Polynomial Time)

Problems solvable by a deterministic Turing machine in polynomial time: O(n), O(n^2), O(n^3), O(n log n), etc.

Examples: sorting, shortest path (Dijkstra), minimum spanning tree (Kruskal/Prim), bipartite matching, linear programming, maximum flow.

These are the "easy" problems — we have efficient algorithms for them.

NP (Nondeterministic Polynomial Time)

Problems where a given solution (called a "certificate") can be verified in polynomial time.

Key insight: Every problem in P is also in NP. If you can solve a problem in polynomial time, you can certainly verify a solution in polynomial time (just solve it and compare).

The billion-dollar question: Is P = NP? Most computer scientists believe P != NP, but nobody has proven it. This is one of the seven Millennium Prize Problems ($1,000,000 bounty).

Intuition: Think of a Sudoku puzzle. It's hard to solve (you have to search), but if someone gives you a completed grid, you can quickly verify it's correct by checking rows, columns, and boxes. That's NP in a nutshell.
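To make "verification is easy" concrete, here is a hypothetical verifier (a sketch, not from the text) for a Hamiltonian-cycle certificate: checking a proposed tour takes O(n²), even though *finding* one is NP-complete.

```cpp
#include <vector>
using namespace std;

// Verify a certificate for the Hamiltonian cycle problem:
// `perm` claims to be an ordering of all n vertices forming a cycle in `adj`.
// Runs in polynomial time, even though finding such a cycle is NP-hard.
bool verifyHamiltonianCycle(const vector<vector<bool>>& adj,
                            const vector<int>& perm) {
    int n = adj.size();
    if ((int)perm.size() != n) return false;

    vector<bool> seen(n, false);
    for (int v : perm) {
        if (v < 0 || v >= n || seen[v]) return false; // not a permutation
        seen[v] = true;
    }
    for (int i = 0; i < n; i++) {
        // consecutive cities (wrapping around) must be connected
        if (!adj[perm[i]][perm[(i + 1) % n]]) return false;
    }
    return true;
}
```

Any NP problem has such a polynomial-time verifier; the hard part is producing the certificate in the first place.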

NP-Hard

At least as hard as the hardest problems in NP. Formally: a problem H is NP-hard if every problem in NP can be reduced to H in polynomial time.

  • NP-hard problems are not necessarily in NP (they might not even be decision problems)
  • Example: the halting problem is NP-hard but not in NP (it's undecidable)

NP-Complete

A problem that is both in NP and NP-hard. These are the "hardest" problems within NP.

(Assuming P ≠ NP)

        ┌────────────── NP-Hard ──────────────┐
        │            Halting Problem          │
┌───────┼─────────────────────────┐           │
│  NP   │  NP-Complete:           │           │
│       │  SAT, TSP, Clique,      │           │
│       │  Vertex Cover           │           │
│       └─────────────────────────┼───────────┘
│  ┌────────── P ──────────┐      │
│  │  Sorting, Dijkstra    │      │
│  └───────────────────────┘      │
└─────────────────────────────────┘

Why the diagram matters: If P = NP, the three inner regions collapse into one. If P != NP (the consensus), then P is strictly inside NP, and NP-Complete problems sit at the boundary of NP and NP-Hard.

Why Does This Matter for Interviews?

You need to recognize NP-complete problems because:

  1. You won't find a polynomial-time exact solution — don't waste time trying
  2. You should propose practical approaches: approximation algorithms, exponential solutions with good pruning, or pseudo-polynomial DP
  3. Interviewers may test if you recognize the problem class — saying "this reduces to Set Cover, which is NP-complete" demonstrates strong CS fundamentals
  4. Constraint analysis becomes critical: n <= 20 suggests bitmask DP, n <= 40 suggests meet-in-the-middle, large n suggests greedy/approximation

How to Prove a Problem is NP-Complete

The standard two-step process:

Step 1: Show the problem is in NP

Demonstrate that given a candidate solution (certificate), you can verify it in polynomial time.

Step 2: Show the problem is NP-Hard via reduction

Reduce a known NP-complete problem to your problem in polynomial time.

Reductions — The Key Concept

Reduction: Problem A ≤p Problem B

  Instance of A ──(polynomial transformation)──▶ Instance of B
        │                                              │
      solve                                          solve
        ▼                                              ▼
      YES/NO  ◀──────────── same answer ──────────▶  YES/NO

To prove B is NP-complete:
  1. Show B ∈ NP
  2. Pick a known NP-complete problem A
  3. Show A ≤p B
  4. The transformation preserves YES/NO answers
  5. The transformation runs in polynomial time

The Chain of Reductions

Cook-Levin theorem proved SAT is NP-complete. Then reductions built the chain:

SAT → 3-SAT ─┬→ CLIQUE → VERTEX COVER → INDEPENDENT SET
             ├→ SUBSET SUM
             ├→ 3-COLORING → k-COLORING
             └→ HAMILTONIAN CYCLE → TSP

VERTEX COVER → SET COVER

Each arrow means "reduces to." Every new NP-complete problem was proven by reducing a known one to it.


The Famous NP-Complete Problems

1. Boolean Satisfiability (SAT) — The First NP-Complete Problem

Problem: Given a boolean formula in conjunctive normal form (CNF), is there an assignment of variables that makes it TRUE?

CNF: A conjunction (AND) of clauses, where each clause is a disjunction (OR) of literals.

Formula: (x₁ ∨ ¬x₂) ∧ (¬x₁ ∨ x₃) ∧ (x₂ ∨ ¬x₃)

Variables: x₁, x₂, x₃ ∈ {TRUE, FALSE}

Try x₁=T, x₂=T, x₃=T:
  Clause 1: (T ∨ F) = T  ✓
  Clause 2: (F ∨ T) = T  ✓
  Clause 3: (T ∨ F) = T  ✓
  Result: T ∧ T ∧ T = TRUE    → SATISFIABLE!

Variants:

  • 2-SAT: Each clause has exactly 2 literals → solvable in polynomial time (implication graph + SCC)
  • 3-SAT: Each clause has exactly 3 literals → NP-complete
  • k-SAT (k >= 3): NP-complete
  • Horn-SAT: At most one positive literal per clause → polynomial time

Why SAT matters: The Cook-Levin theorem (1971) proved SAT was the first NP-complete problem. Every other NP-complete proof ultimately chains back to SAT.
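The O(2^n) baseline is easy to make concrete. Below is a brute-force CNF checker (a sketch; the literal encoding — +k for xₖ, -k for ¬xₖ — is an assumption of this example):

```cpp
#include <vector>
#include <cstdlib>
using namespace std;

// Brute-force CNF-SAT: try all 2^n assignments, O(2^n * |formula|).
// Literal +k means variable x_k, literal -k means ¬x_k (1-indexed).
bool bruteForceSAT(int n, const vector<vector<int>>& clauses) {
    for (int mask = 0; mask < (1 << n); mask++) {
        bool allClausesTrue = true;
        for (const auto& clause : clauses) {
            bool clauseTrue = false;
            for (int lit : clause) {
                int var = abs(lit) - 1;
                bool val = (mask >> var) & 1;       // bit var of mask = x_var
                if ((lit > 0) == val) { clauseTrue = true; break; }
            }
            if (!clauseTrue) { allClausesTrue = false; break; }
        }
        if (allClausesTrue) return true;
    }
    return false;
}
```

For the formula above, `bruteForceSAT(3, {{1,-2},{-1,3},{2,-3}})` finds a satisfying assignment.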

2-SAT: The Polynomial Solvable Case

Since 2-SAT comes up in competitive programming, here's the solution:

Idea: Each clause (a ∨ b) is equivalent to two implications:
  (¬a → b) and (¬b → a)

Build an implication graph. If x and ¬x are in the same SCC,
the formula is unsatisfiable.
CPP
// 2-SAT solver
// Variables 0..n-1. Literal x is 2*x, literal ¬x is 2*x+1.
struct TwoSAT {
    int n;
    vector<vector<int>> adj, radj;
    vector<int> order, comp;
    vector<bool> vis;

    TwoSAT(int n) : n(n), adj(2*n), radj(2*n), comp(2*n), vis(2*n) {}

    // Add clause (a ∨ b). Pass literals, not variables.
    // For variable x: true literal = 2*x, false literal = 2*x+1
    void addClause(int a, int b) {
        adj[a ^ 1].push_back(b);    // ¬a → b
        adj[b ^ 1].push_back(a);    // ¬b → a
        radj[b].push_back(a ^ 1);
        radj[a].push_back(b ^ 1);
    }

    void dfs1(int u) {
        vis[u] = true;
        for (int v : adj[u]) if (!vis[v]) dfs1(v);
        order.push_back(u);
    }

    void dfs2(int u, int c) {
        comp[u] = c;
        for (int v : radj[u]) if (comp[v] == -1) dfs2(v, c);
    }

    // Returns true if satisfiable, fills `values` with assignment
    bool solve(vector<bool>& values) {
        fill(vis.begin(), vis.end(), false);
        for (int i = 0; i < 2*n; i++) if (!vis[i]) dfs1(i);

        fill(comp.begin(), comp.end(), -1);
        int c = 0;
        for (int i = 2*n - 1; i >= 0; i--)
            if (comp[order[i]] == -1) dfs2(order[i], c++);

        values.resize(n);
        for (int i = 0; i < n; i++) {
            if (comp[2*i] == comp[2*i+1]) return false; // x and ¬x in same SCC
            values[i] = comp[2*i] > comp[2*i+1];
        }
        return true;
    }
};

Time complexity: O(V + E) — linear in the size of the formula.


2. Traveling Salesman Problem (TSP)

Problem: Given n cities and distances between every pair of cities, find the shortest tour that visits every city exactly once and returns to the starting city.

Cities and distances (complete graph — every pair of cities is connected):

  A─B: 10    A─C: 20    A─D: 15
  B─C: 35    B─D: 25    C─D: 30

Distance matrix:
      A    B    C    D
  A [ 0   10   20   15 ]
  B [ 10   0   35   25 ]
  C [ 20  35    0   30 ]
  D [ 15  25   30    0 ]

All possible tours (starting from A):
  A → B → C → D → A : 10 + 35 + 30 + 15 = 90
  A → B → D → C → A : 10 + 25 + 30 + 20 = 85  ← optimal
  A → C → B → D → A : 20 + 35 + 25 + 15 = 95
  A → C → D → B → A : 20 + 30 + 25 + 10 = 85  ← optimal (reverse)
  A → D → B → C → A : 15 + 25 + 35 + 20 = 95
  A → D → C → B → A : 15 + 30 + 35 + 10 = 90

Best tour: cost 85

Brute Force: O(n!)

Try all (n-1)!/2 distinct tours. Completely impractical for n > 12.
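The brute force can be sketched with `next_permutation`: fix city 0 as the start and enumerate the (n-1)! orderings of the remaining cities.

```cpp
#include <vector>
#include <algorithm>
#include <numeric>
#include <climits>
using namespace std;

// Brute-force TSP: fix city 0, try all (n-1)! orderings of the rest.
// Only feasible for very small n (assumes n >= 2).
int tspBruteForce(const vector<vector<int>>& dist) {
    int n = dist.size();
    vector<int> perm(n - 1);
    iota(perm.begin(), perm.end(), 1); // cities 1..n-1, sorted ascending

    int best = INT_MAX;
    do {
        int cost = dist[0][perm[0]];
        for (int i = 0; i + 1 < (int)perm.size(); i++)
            cost += dist[perm[i]][perm[i + 1]];
        cost += dist[perm.back()][0]; // return to the start
        best = min(best, cost);
    } while (next_permutation(perm.begin(), perm.end()));
    return best;
}
```

On the 4-city example above this returns 85, matching the hand-enumerated optimum.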

DP with Bitmask: O(n^2 * 2^n) — The Standard Interview Approach

Key idea: Use a bitmask to represent the set of visited cities. dp[mask][i] = minimum cost to visit exactly the cities in mask, ending at city i.

State: dp[mask][i]
  - mask: bitmask of visited cities (bit j set means city j visited)
  - i: current city (last city visited)
  - value: minimum cost to reach this state

Transition: dp[mask | (1 << v)][v] = min(dp[mask][u] + dist[u][v])
  for all unvisited cities v

Base case: dp[1][0] = 0 (start at city 0, only city 0 visited)
Answer: min over all i of (dp[ALL][i] + dist[i][0])

Visual trace for 4 cities (A=0, B=1, C=2, D=3):

Step 0: dp[0001][0] = 0          (at A, visited {A})

Step 1: From dp[0001][0]:
  → dp[0011][1] = 0 + 10 = 10   (at B, visited {A,B})
  → dp[0101][2] = 0 + 20 = 20   (at C, visited {A,C})
  → dp[1001][3] = 0 + 15 = 15   (at D, visited {A,D})

Step 2: From dp[0011][1]:
  → dp[0111][2] = 10 + 35 = 45  (at C, visited {A,B,C})
  → dp[1011][3] = 10 + 25 = 35  (at D, visited {A,B,D})
  From dp[0101][2]:
  → dp[0111][1] = 20 + 35 = 55  (at B, visited {A,B,C})
  → dp[1101][3] = 20 + 30 = 50  (at D, visited {A,C,D})
  From dp[1001][3]:
  → dp[1011][1] = 15 + 25 = 40  (at B, visited {A,B,D})
  → dp[1101][2] = 15 + 30 = 45  (at C, visited {A,C,D})

Step 3: Complete the tours (all 4 cities visited):
  dp[1111][1] = min(dp[1101][2] + dist[2][1], dp[1101][3] + dist[3][1])
              = min(45 + 35, 50 + 25) = 75
  dp[1111][2] = min(dp[1011][1] + dist[1][2], dp[1011][3] + dist[3][2])
              = min(40 + 35, 35 + 30) = 65
  dp[1111][3] = min(dp[0111][1] + dist[1][3], dp[0111][2] + dist[2][3])
              = min(55 + 25, 45 + 30) = 75

Return to A: min(75 + 10, 65 + 20, 75 + 15) = 85  ✓ matches the optimal tour.
CPP
int tsp(vector<vector<int>>& dist) {
    int n = dist.size();
    int ALL = (1 << n) - 1;

    // dp[mask][i] = min cost to visit cities in mask, ending at city i
    vector<vector<int>> dp(1 << n, vector<int>(n, INT_MAX));

    dp[1][0] = 0; // start at city 0

    for (int mask = 1; mask < (1 << n); mask++) {
        for (int u = 0; u < n; u++) {
            if (dp[mask][u] == INT_MAX) continue;
            if (!(mask & (1 << u))) continue; // u must be in mask

            for (int v = 0; v < n; v++) {
                if (mask & (1 << v)) continue; // v already visited
                int newMask = mask | (1 << v);
                dp[newMask][v] = min(dp[newMask][v],
                                     dp[mask][u] + dist[u][v]);
            }
        }
    }

    // Find min cost tour: visit all cities and return to start
    int ans = INT_MAX;
    for (int u = 0; u < n; u++) {
        if (dp[ALL][u] != INT_MAX) {
            ans = min(ans, dp[ALL][u] + dist[u][0]);
        }
    }
    return ans;
}

Complexity:

  • Time: O(n^2 * 2^n) — for each of 2^n masks and n cities, we try n transitions
  • Space: O(n * 2^n) — the DP table
  • Practical limit: n ~ 20 (2^20 * 20^2 = ~400 million operations)

TSP Approximation Algorithms

For metric TSP (triangle inequality holds: dist(a,c) <= dist(a,b) + dist(b,c)):

1. MST-based 2-approximation:

1. Build MST of the complete graph
2. Double every edge (creates an Eulerian graph)
3. Find Eulerian tour
4. Shortcut repeated vertices

       MST:          Double edges:    Euler tour:  A→B→A→C→D→C→A
     A─B             A═B              Shortcut:    A→B→C→D→A
     |               ║
     C─D             C═D

  Cost ≤ 2 × optimal TSP
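The four steps can be sketched as follows (a sketch assuming the input is a complete distance matrix; Prim's O(n²) MST plus a DFS preorder performs the doubling-and-shortcutting implicitly, since a preorder walk visits vertices in exactly the shortcut order of the doubled Euler tour):

```cpp
#include <vector>
#include <climits>
#include <functional>
using namespace std;

// MST-based 2-approximation for metric TSP (sketch).
// Cost of the returned tour <= 2 * MST weight <= 2 * optimal tour.
int tspTwoApprox(const vector<vector<int>>& dist) {
    int n = dist.size();

    // Prim's algorithm, O(n^2), on the complete graph
    vector<int> parent(n, -1), minEdge(n, INT_MAX);
    vector<bool> inMST(n, false);
    minEdge[0] = 0;
    for (int it = 0; it < n; it++) {
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!inMST[v] && (u == -1 || minEdge[v] < minEdge[u])) u = v;
        inMST[u] = true;
        for (int v = 0; v < n; v++)
            if (!inMST[v] && dist[u][v] < minEdge[v]) {
                minEdge[v] = dist[u][v];
                parent[v] = u;
            }
    }

    // DFS preorder of the MST = the shortcut tour
    vector<vector<int>> tree(n);
    for (int v = 1; v < n; v++) tree[parent[v]].push_back(v);
    vector<int> order;
    function<void(int)> dfs = [&](int u) {
        order.push_back(u);
        for (int v : tree[u]) dfs(v);
    };
    dfs(0);

    int cost = 0;
    for (int i = 0; i + 1 < n; i++) cost += dist[order[i]][order[i + 1]];
    cost += dist[order.back()][0];
    return cost;
}
```

The guarantee only holds under the triangle inequality; on non-metric inputs the tour is still valid but may be worse than 2× optimal.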

2. Christofides algorithm: 1.5-approximation

1. Build MST
2. Find minimum weight perfect matching on odd-degree vertices
3. Combine MST + matching (Eulerian graph)
4. Find Eulerian tour and shortcut

3. Nearest neighbor heuristic (greedy, no guarantee):

CPP
int nearestNeighborTSP(vector<vector<int>>& dist) {
    int n = dist.size();
    vector<bool> visited(n, false);
    int curr = 0, cost = 0;
    visited[0] = true;

    for (int step = 1; step < n; step++) {
        int best = -1, bestDist = INT_MAX;
        for (int j = 0; j < n; j++) {
            if (!visited[j] && dist[curr][j] < bestDist) {
                bestDist = dist[curr][j];
                best = j;
            }
        }
        visited[best] = true;
        cost += bestDist;
        curr = best;
    }
    cost += dist[curr][0]; // return to start
    return cost;
}

3. Knapsack Problem

Problem: Given n items, each with weight w_i and value v_i, and a knapsack with capacity W, maximize the total value of items that fit.

Items:                          Knapsack capacity W = 8
  Item 1: w=2, v=6
  Item 2: w=3, v=10
  Item 3: w=4, v=12
  Item 4: w=5, v=15

Example pick {1, 3}: value = 6 + 12 = 18, weight = 2 + 4 = 6 ≤ 8. But is there better?
  1+2: w=5, v=16    1+3: w=6, v=18    1+4: w=7, v=21
  2+3: w=7, v=22    2+4: w=8, v=25  ← BEST! Exactly fits.

Why Knapsack is NP-Complete (Despite Having a "Polynomial" DP)

The DP runs in O(nW), which looks polynomial. But W is a number, not a count. The input size is log₂(W) bits to represent W. So O(nW) = O(n * 2^(bits of W)), which is exponential in the input size. This is called pseudo-polynomial time.

W = 1,000,000,000 (10 digits, ~30 bits)
n = 100 items

DP table: 100 × 1,000,000,000 = 100 billion entries → IMPOSSIBLE

But if W = 1000:
DP table: 100 × 1000 = 100,000 entries → EASY

0/1 Knapsack — Bottom Up DP

dp[w] = maximum value achievable with weight capacity w

For each item i (from 0 to n-1):
  For w from W down to wt[i]:  ← reverse order prevents using item twice!
    dp[w] = max(dp[w], dp[w - wt[i]] + val[i])
CPP
// 0/1 Knapsack — O(nW) time, O(W) space
int knapsack01(vector<int>& wt, vector<int>& val, int W) {
    int n = wt.size();
    vector<int> dp(W + 1, 0);

    for (int i = 0; i < n; i++) {
        // Iterate backwards to ensure each item used at most once
        for (int w = W; w >= wt[i]; w--) {
            dp[w] = max(dp[w], dp[w - wt[i]] + val[i]);
        }
    }
    return dp[W];
}

Why iterate backwards? If we go forward, dp[w - wt[i]] might already include item i from this iteration, effectively using item i multiple times.

Forward (WRONG for 0/1):
  Item: weight=3, value=4
  dp[3] = max(dp[3], dp[0]+4) = 4    ← uses item once
  dp[6] = max(dp[6], dp[3]+4) = 8    ← uses same item AGAIN!

Backward (CORRECT):
  dp[6] = max(dp[6], dp[3]+4)  ← dp[3] is still from previous iteration
  dp[3] = max(dp[3], dp[0]+4)  ← only now we update dp[3]

Unbounded Knapsack

Each item can be used unlimited times. Just iterate forward instead of backward:

CPP
int knapsackUnbounded(vector<int>& wt, vector<int>& val, int W) {
    int n = wt.size();
    vector<int> dp(W + 1, 0);

    for (int w = 0; w <= W; w++) {
        for (int i = 0; i < n; i++) {
            if (wt[i] <= w) {
                dp[w] = max(dp[w], dp[w - wt[i]] + val[i]);
            }
        }
    }
    return dp[W];
}

Knapsack with Item Recovery (Which Items to Take)

CPP
pair<int, vector<int>> knapsackWithItems(vector<int>& wt, vector<int>& val, int W) {
    int n = wt.size();
    vector<vector<int>> dp(n + 1, vector<int>(W + 1, 0));

    for (int i = 1; i <= n; i++) {
        for (int w = 0; w <= W; w++) {
            dp[i][w] = dp[i-1][w]; // don't take item i
            if (wt[i-1] <= w) {
                dp[i][w] = max(dp[i][w], dp[i-1][w - wt[i-1]] + val[i-1]);
            }
        }
    }

    // Backtrack to find which items were taken
    vector<int> items;
    int w = W;
    for (int i = n; i >= 1; i--) {
        if (dp[i][w] != dp[i-1][w]) {
            items.push_back(i - 1); // took item i-1 (0-indexed)
            w -= wt[i-1];
        }
    }

    return {dp[n][W], items};
}

4. Graph Coloring

Problem: Color the vertices of a graph with at most k colors such that no two adjacent vertices share the same color.

Graph:              2-colorable (bipartite)?
  A ─── B              A(R) ─── B(B)
  │     │              │         │
  │     │              │         │
  C ─── D              C(B) ─── D(R)
                       ✓ Yes! This is bipartite.

  Add diagonal A─D:    A(R) ─── B(B)
  A ─── B               │ \      │
  │ \   │               │   \    │
  │   \ │               │     \  │
  C ─── D              C(B) ─── D(?)
                       D is adjacent to A(R), B(B), C(B)
                       D can be R? No, A is R.
                       Need 3 colors: D(G)

Complexity:

  • k=1: Trivial — possible only if the graph has no edges
  • k=2: Bipartite check — BFS/DFS, O(V+E)
  • k>=3: NP-complete (already for k=3)

Bipartite Check (2-coloring)

CPP
bool isBipartite(vector<vector<int>>& adj) {
    int n = adj.size();
    vector<int> color(n, -1);

    for (int start = 0; start < n; start++) {
        if (color[start] != -1) continue;

        queue<int> q;
        q.push(start);
        color[start] = 0;

        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int v : adj[u]) {
                if (color[v] == -1) {
                    color[v] = color[u] ^ 1;
                    q.push(v);
                } else if (color[v] == color[u]) {
                    return false; // odd cycle found
                }
            }
        }
    }
    return true;
}

Graph Coloring via Backtracking (for small graphs)

CPP
bool canColor(vector<vector<int>>& adj, vector<int>& colors, int u, int k) {
    int n = adj.size();
    if (u == n) return true; // all vertices colored

    for (int c = 0; c < k; c++) {
        // Check if color c is valid for vertex u
        bool valid = true;
        for (int v : adj[u]) {
            if (colors[v] == c) { valid = false; break; }
        }
        if (valid) {
            colors[u] = c;
            if (canColor(adj, colors, u + 1, k)) return true;
            colors[u] = -1;
        }
    }
    return false;
}

int chromaticNumber(vector<vector<int>>& adj) {
    int n = adj.size();
    vector<int> colors(n, -1);
    for (int k = 1; k <= n; k++) {
        fill(colors.begin(), colors.end(), -1);
        if (canColor(adj, colors, 0, k)) return k;
    }
    return n; // worst case: every vertex gets unique color
}

5. Subset Sum

Problem: Given a set of integers S and a target T, is there a subset of S that sums to exactly T?

S = {3, 7, 1, 8, 4}, Target = 11

Subsets that sum to 11:
  {3, 8} = 11  ✓
  {7, 4} = 11  ✓
  {3, 1, 4} = 8  ✗
  {3, 7, 1} = 11  ✓

DP Solution: O(n * T) Pseudo-Polynomial

dp[j] = true if we can form sum j using some subset of items seen so far

For each item s in S:
  For j from T down to s:
    dp[j] = dp[j] || dp[j - s]

Base case: dp[0] = true (empty subset sums to 0)
CPP
bool subsetSum(vector<int>& arr, int target) {
    vector<bool> dp(target + 1, false);
    dp[0] = true;

    for (int x : arr) {
        for (int j = target; j >= x; j--) {
            dp[j] = dp[j] || dp[j - x];
        }
    }
    return dp[target];
}

Bitset Optimization: O(n * T / 64)

CPP
bool subsetSumBitset(vector<int>& arr, int target) {
    bitset<100001> dp; // fixed compile-time size: supports targets up to 100,000
    dp[0] = 1;

    for (int x : arr) {
        dp |= (dp << x); // shift left by x and OR
    }
    return dp[target];
}

Intuition: The bitset represents all achievable sums. Shifting left by x adds x to every achievable sum. OR-ing with the old set keeps previously achievable sums.

Partition Problem (Special Case of Subset Sum)

Problem: Can we partition an array into two subsets with equal sum?

CPP
bool canPartition(vector<int>& nums) {
    int sum = 0;
    for (int x : nums) sum += x;
    if (sum % 2 != 0) return false;

    int target = sum / 2;
    return subsetSum(nums, target);
}

6. Vertex Cover

Problem: Find the minimum set of vertices such that every edge has at least one endpoint in the set.

Graph (edges: AB, AC, BD, BE, CD):     Vertex Cover = {B, C}

  A ─── B
  │     │\            B covers edges AB, BD, BE
  │     │ \           C covers edges AC, CD
  C ─── D  E

  Every edge is "covered" by at least one vertex in {B, C}.
  Is this minimum?
  Try size 1: no single vertex touches all five edges.
  So the minimum vertex cover has size 2.

Facts:

  • NP-complete for general graphs
  • Polynomial for bipartite graphs (Konig's theorem: min vertex cover = max matching)
  • 2-approximation exists: greedily pick both endpoints of uncovered edges
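For the bipartite case, König's theorem can be made concrete: compute a maximum matching, and its size equals the minimum vertex cover. Below is a sketch using Kuhn's augmenting-path algorithm (the `BipartiteMatching` struct and the explicit left/right split are illustrative choices, not from the text):

```cpp
#include <vector>
using namespace std;

// Kuhn's augmenting-path algorithm, O(V * E): maximum matching in a
// bipartite graph. By Konig's theorem the matching size equals the
// minimum vertex cover size. adj[u] lists right-side neighbors of
// left-side vertex u.
struct BipartiteMatching {
    vector<vector<int>> adj;
    vector<int> matchRight;   // matchRight[v] = left vertex matched to v, or -1
    vector<bool> visited;

    BipartiteMatching(int nLeft, int nRight)
        : adj(nLeft), matchRight(nRight, -1) {}

    bool tryKuhn(int u) {
        for (int v : adj[u]) {
            if (visited[v]) continue;
            visited[v] = true;
            // v is free, or its partner can be re-matched elsewhere
            if (matchRight[v] == -1 || tryKuhn(matchRight[v])) {
                matchRight[v] = u;
                return true;
            }
        }
        return false;
    }

    int maxMatching() { // == min vertex cover size (Konig's theorem)
        int result = 0;
        for (int u = 0; u < (int)adj.size(); u++) {
            visited.assign(matchRight.size(), false);
            if (tryKuhn(u)) result++;
        }
        return result;
    }
};
```

Hopcroft-Karp improves this to O(E√V), but Kuhn's version is the one worth memorizing for interviews.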

2-Approximation for Vertex Cover

CPP
vector<int> approxVertexCover(int n, vector<pair<int,int>>& edges) {
    vector<bool> inCover(n, false);
    vector<bool> edgeCovered(edges.size(), false);

    for (int i = 0; i < edges.size(); i++) {
        if (!edgeCovered[i]) {
            auto [u, v] = edges[i];
            if (!inCover[u] && !inCover[v]) {
                inCover[u] = inCover[v] = true;
                // Mark all edges incident to u or v as covered
                for (int j = i; j < edges.size(); j++) {
                    if (edges[j].first == u || edges[j].first == v ||
                        edges[j].second == u || edges[j].second == v) {
                        edgeCovered[j] = true;
                    }
                }
            }
        }
    }

    vector<int> cover;
    for (int i = 0; i < n; i++)
        if (inCover[i]) cover.push_back(i);
    return cover;
}

Why 2-approximation? Every edge we pick has both endpoints added, but at least one must be in the optimal solution. So we use at most 2x the optimal.


7. Hamiltonian Path/Cycle

Problem: Find a path (or cycle) that visits every vertex exactly once.

Hamiltonian vs Eulerian — DON'T CONFUSE THEM!

  Hamiltonian: visit every VERTEX exactly once
    • NP-complete — no known efficient algorithm
    • Example cycle: A → B → C → D → A

  Eulerian: visit every EDGE exactly once
    • Polynomial time! Hierholzer's algorithm, O(V+E)
    • Exists iff (in a connected graph):
        Path:  exactly 0 or 2 odd-degree vertices
        Cycle: all vertices have even degree
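Since the Eulerian side is polynomial, here is a minimal Hierholzer sketch (it assumes an Eulerian circuit exists — connected graph, all degrees even — and the edge-list representation is an illustrative choice):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Hierholzer's algorithm, O(V + E): Eulerian circuit in an undirected
// multigraph, assuming one exists (connected, all degrees even).
// Adjacency stores edge ids so each edge is consumed exactly once.
vector<int> eulerianCircuit(int n, const vector<pair<int,int>>& edges) {
    vector<vector<pair<int,int>>> adj(n); // (neighbor, edge id)
    for (int i = 0; i < (int)edges.size(); i++) {
        adj[edges[i].first].push_back({edges[i].second, i});
        adj[edges[i].second].push_back({edges[i].first, i});
    }

    vector<bool> usedEdge(edges.size(), false);
    vector<int> st = {0}, circuit;
    while (!st.empty()) {
        int u = st.back();
        // discard edges already traversed from the other endpoint
        while (!adj[u].empty() && usedEdge[adj[u].back().second])
            adj[u].pop_back();
        if (adj[u].empty()) {
            circuit.push_back(u); // dead end: emit vertex
            st.pop_back();
        } else {
            auto [v, id] = adj[u].back();
            adj[u].pop_back();
            usedEdge[id] = true;  // consume the edge, walk forward
            st.push_back(v);
        }
    }
    reverse(circuit.begin(), circuit.end());
    return circuit; // E+1 vertices, starts and ends at vertex 0
}
```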

Hamiltonian Path via Bitmask DP: O(n^2 * 2^n)

CPP
// Does a Hamiltonian path exist? (visit all vertices exactly once)
bool hamiltonianPath(vector<vector<int>>& adj) {
    int n = adj.size();
    // dp[mask][i] = can we visit exactly the vertices in mask, ending at i?
    vector<vector<bool>> dp(1 << n, vector<bool>(n, false));

    // Base: single vertex
    for (int i = 0; i < n; i++)
        dp[1 << i][i] = true;

    for (int mask = 1; mask < (1 << n); mask++) {
        for (int u = 0; u < n; u++) {
            if (!dp[mask][u]) continue;
            if (!(mask & (1 << u))) continue;

            for (int v : adj[u]) {
                if (mask & (1 << v)) continue;
                dp[mask | (1 << v)][v] = true;
            }
        }
    }

    int ALL = (1 << n) - 1;
    for (int i = 0; i < n; i++)
        if (dp[ALL][i]) return true;
    return false;
}

8. Clique Problem

Problem: Find the largest complete subgraph (every pair of vertices in the subgraph is connected by an edge).

Graph (edges: A─B, A─D, B─C, B─D, C─D):

  A ─── B
  │    /│
  │   / │
  │  /  │
  │ /   │
  D ─── C

Largest cliques (size 3):
  {A, B, D}: A─B ✓, A─D ✓, B─D ✓
  {B, C, D}: B─C ✓, B─D ✓, C─D ✓

{A, B, C, D} is NOT a clique — edge A─C is missing.

Relationship to other problems:

  • Clique in graph G = Independent Set in complement of G
  • Independent Set = Vertex Cover complement (V \ vertex_cover = independent_set)

Bron-Kerbosch Algorithm (Finding All Maximal Cliques)

CPP
// Bron-Kerbosch with pivot — standard algorithm for maximal cliques
void bronKerbosch(vector<vector<bool>>& adj, vector<int>& R,
                  vector<int>& P, vector<int>& X,
                  vector<vector<int>>& cliques) {
    if (P.empty() && X.empty()) {
        cliques.push_back(R); // R is a maximal clique
        return;
    }

    // Choose pivot vertex from P ∪ X that maximizes |P ∩ N(pivot)|
    int pivot = -1, maxDeg = -1;
    for (int u : P) {
        int deg = 0;
        for (int v : P) if (adj[u][v]) deg++;
        if (deg > maxDeg) { maxDeg = deg; pivot = u; }
    }
    for (int u : X) {
        int deg = 0;
        for (int v : P) if (adj[u][v]) deg++;
        if (deg > maxDeg) { maxDeg = deg; pivot = u; }
    }

    // Iterate over P \ N(pivot)
    vector<int> candidates;
    for (int v : P)
        if (pivot == -1 || !adj[pivot][v]) candidates.push_back(v);

    for (int v : candidates) {
        // New P: P ∩ N(v)
        vector<int> newP, newX;
        for (int u : P) if (adj[v][u]) newP.push_back(u);
        for (int u : X) if (adj[v][u]) newX.push_back(u);

        R.push_back(v);
        bronKerbosch(adj, R, newP, newX, cliques);
        R.pop_back();

        P.erase(find(P.begin(), P.end(), v));
        X.push_back(v);
    }
}

9. Set Cover

Problem: Given a universe U and a collection of sets S₁, S₂, ..., Sₘ, find the minimum number of sets whose union equals U.

Universe U = {1, 2, 3, 4, 5, 6, 7}
Sets:
  S₁ = {1, 2, 3}
  S₂ = {2, 4}
  S₃ = {3, 4, 5}
  S₄ = {4, 5, 6, 7}
  S₅ = {6, 7}

Greedy: pick the set covering the most uncovered elements
  Pick S₄ = {4,5,6,7}  → uncovered = {1,2,3}
  Pick S₁ = {1,2,3}    → uncovered = {} DONE!

Answer: 2 sets (S₁, S₄) — this happens to be optimal here.

Greedy Set Cover: O(ln(n))-approximation

CPP
vector<int> greedySetCover(int n, vector<vector<int>>& sets) {
    vector<bool> covered(n + 1, false);
    int uncovered = n;
    vector<int> chosen;
    vector<bool> used(sets.size(), false);

    while (uncovered > 0) {
        // Find the set that covers the most uncovered elements
        int best = -1, bestCount = 0;
        for (int i = 0; i < sets.size(); i++) {
            if (used[i]) continue;
            int count = 0;
            for (int elem : sets[i])
                if (!covered[elem]) count++;
            if (count > bestCount) {
                bestCount = count;
                best = i;
            }
        }

        if (best == -1) break; // can't cover more

        used[best] = true;
        chosen.push_back(best);
        for (int elem : sets[best]) {
            if (!covered[elem]) {
                covered[elem] = true;
                uncovered--;
            }
        }
    }
    return chosen;
}

Approximation ratio: ln(n) + 1 where n = |U|. This is the best possible unless P = NP.


Recognizing NP-Complete Problems in Disguise

The interviewer will never say "solve this NP-complete problem." They disguise it:

Disguise                                    Real Problem
"Visit all cities cheaply"                  TSP
"Pack items in a suitcase"                  Knapsack
"Split array into two equal parts"          Partition / Subset Sum
"Color a schedule with min slots"           Graph Coloring
"Cover all skills with min people"          Set Cover
"Find the longest simple path"              Hamiltonian Path
"Find the largest friend group"             Clique
"Assign tasks with constraints"             SAT / CSP
"Schedule jobs on machines"                 Job Scheduling (NP-complete)
"Minimum routers to cover network"          Vertex Cover
"Longest common subsequence (3+ strings)"   NP-hard
"Exact cover"                               Exact Cover (NP-complete)

Red Flags That Suggest NP-Completeness

  1. The problem asks for an optimal solution over all subsets or all permutations
  2. The problem has a "select or don't select" structure for each element
  3. Small constraints (n ≤ 20) — the problem setter knows it's exponential
  4. The problem seems like it should be easy but you can't find a polynomial solution
  5. There's no obvious greedy strategy that works, and DP states blow up

What to Do in an Interview When You Recognize NP-Completeness

Step-by-step response strategy:

1. State it clearly:

"This problem is a variant of [known NP-complete problem], which means there's no known polynomial-time algorithm."

2. Ask about constraints:

  • n ≤ 15-20 → bitmask DP, O(2^n * n) or O(2^n * n^2)
  • n ≤ 40 → meet-in-the-middle, O(2^(n/2) * n)
  • n is large → greedy approximation, or pseudo-polynomial DP if applicable

3. Propose the right approach based on constraints:

n ≤ 10:     Brute force / backtracking with pruning
n ≤ 20:     Bitmask DP — O(2^n * poly(n))
n ≤ 40:     Meet in the middle — O(2^(n/2) * poly(n))
n ≤ 10^3:   Pseudo-polynomial DP (if applicable, e.g., knapsack with small W)
n ≤ 10^6:   Greedy approximation, heuristics

Practical Solutions for Interview Problems

Meet in the Middle — O(2^(n/2))

Idea: Split the input into two halves. Enumerate all 2^(n/2) subsets of each half. Combine results intelligently.

Input: n = 36 elements
  Brute force:      2^36 ≈ 69 billion subsets   (TOO SLOW)
  Meet in middle:   2 × 2^18 ≈ 524K subsets     (FAST!)

  First 18 elements                Last 18 elements
  generate all 2^18 sums           generate all 2^18 sums
            └─────────────┬─────────────┘
                       Combine:
  for each sum s in the first half, check whether
  (target - s) exists in the second half (binary search)
CPP
// Subset Sum with n up to 40
bool subsetSumMeetInMiddle(vector<int>& arr, long long target) {
    int n = arr.size();
    int half = n / 2;
    int rest = n - half;

    // Generate all subset sums of first half
    vector<long long> firstSums;
    for (int mask = 0; mask < (1 << half); mask++) {
        long long sum = 0;
        for (int i = 0; i < half; i++)
            if (mask & (1 << i)) sum += arr[i];
        firstSums.push_back(sum);
    }
    sort(firstSums.begin(), firstSums.end());

    // For each subset sum of second half, binary search in first half
    for (int mask = 0; mask < (1 << rest); mask++) {
        long long sum = 0;
        for (int i = 0; i < rest; i++)
            if (mask & (1 << i)) sum += arr[half + i];

        long long need = target - sum;
        if (binary_search(firstSums.begin(), firstSums.end(), need))
            return true;
    }
    return false;
}

Time: O(2^(n/2) * n) for generation + O(2^(n/2) * log(2^(n/2))) for search = O(n * 2^(n/2))

Backtracking with Pruning

For many NP-complete problems, smart pruning makes brute force feasible for moderate inputs.

CPP
// Maximum Clique via backtracking with pruning
int maxClique;

void findClique(vector<vector<bool>>& adj, vector<int>& clique,
                vector<int>& candidates, int n) {
    if (candidates.empty()) {
        maxClique = max(maxClique, (int)clique.size());
        return;
    }

    for (int i = 0; i < candidates.size(); i++) {
        // Pruning: even if we add all remaining candidates, can we beat best?
        if ((int)clique.size() + (int)candidates.size() - i <= maxClique)
            return; // bound: can't improve

        int v = candidates[i];

        // New candidates: only vertices adjacent to v
        vector<int> newCandidates;
        for (int j = i + 1; j < candidates.size(); j++) {
            if (adj[v][candidates[j]])
                newCandidates.push_back(candidates[j]);
        }

        clique.push_back(v);
        findClique(adj, clique, newCandidates, n);
        clique.pop_back();
    }
}

Branch and Bound (Knapsack)

CPP
struct Item { int weight, value; double ratio; };

int bestValue;

// NOTE: items must be sorted by value/weight ratio in descending order
// before calling, otherwise the fractional upper bound below is invalid.
void knapsackBnB(vector<Item>& items, int idx, int currWeight,
                 int currValue, int W) {
    if (currWeight > W) return;
    bestValue = max(bestValue, currValue);
    if (idx == items.size()) return;

    // Upper bound: fractional relaxation of remaining items
    double bound = currValue;
    int remWeight = W - currWeight;
    for (int i = idx; i < items.size() && remWeight > 0; i++) {
        if (items[i].weight <= remWeight) {
            bound += items[i].value;
            remWeight -= items[i].weight;
        } else {
            bound += items[i].ratio * remWeight;
            break;
        }
    }

    if (bound <= bestValue) return; // pruning!

    // Take item idx
    knapsackBnB(items, idx + 1, currWeight + items[idx].weight,
                currValue + items[idx].value, W);
    // Skip item idx
    knapsackBnB(items, idx + 1, currWeight, currValue, W);
}

Summary: Decision Tree for NP-Complete Problems in Interviews

Recognize the problem type
  │
  ├─ NOT NP-complete/NP-hard → use the standard algorithm
  │
  └─ NP-complete/NP-hard → check constraints:
       n ≤ 20         → Bitmask DP
       n ≤ 40         → Meet in the middle
       n ≤ 100        → Backtracking + pruning
       values small   → Pseudo-polynomial DP
       n large        → Greedy / approximation

Key takeaway: The ability to recognize NP-complete problems and immediately pivot to the right approach (based on constraints) is what separates strong candidates from average ones. Never spend an interview trying to find a polynomial algorithm for an NP-complete problem.


Quick Reference: NP-Complete Problems Cheat Sheet

Problem                  Best Exact           Best Approx               Reduction From
SAT                      O(2^n)               MAX-SAT: 3/4-approx       (Cook-Levin)
3-SAT                    O(1.33^n)            7/8-approx                SAT
TSP                      O(n^2 * 2^n)         1.5x (Christofides)       Hamiltonian Cycle
Knapsack                 O(nW) pseudo-poly    FPTAS exists              Subset Sum
Graph Coloring (k>=3)    O(k^n) backtrack     No constant approx        3-SAT
Subset Sum               O(nT) pseudo-poly    FPTAS exists              SAT
Vertex Cover             O(2^k * n)           2-approx                  Independent Set
Hamiltonian Path         O(n^2 * 2^n)         —                         3-SAT
Clique                   O(2^n) backtrack     No constant approx        Independent Set
Set Cover                O(2^m * n)           ln(n)-approx              Vertex Cover

FPTAS = Fully Polynomial-Time Approximation Scheme: for any epsilon > 0, gets within (1+epsilon) factor in polynomial time.
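As a concrete illustration, here is a sketch of the classic value-scaling FPTAS for knapsack (the function name, the K < 1 fallback to an exact DP, and returning the best *scaled* value are assumptions of this sketch — when no scaling is needed the result equals the exact optimum):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Value-scaling FPTAS sketch for 0/1 knapsack.
// Scale values down by K = eps * vmax / n, then run the min-weight DP
// over scaled values: time is polynomial in n and 1/eps, and the chosen
// set's true value is at least (1 - eps) * optimal.
long long knapsackFPTAS(const vector<int>& wt, const vector<int>& val,
                        int W, double eps) {
    int n = wt.size();
    int vmax = *max_element(val.begin(), val.end());
    double K = eps * vmax / n;
    if (K < 1) K = 1; // values already small — this degenerates to exact DP

    vector<long long> scaled(n);
    long long totalScaled = 0;
    for (int i = 0; i < n; i++) {
        scaled[i] = (long long)(val[i] / K);
        totalScaled += scaled[i];
    }

    // dp[s] = minimum weight needed to reach scaled value exactly s
    const long long INF = (long long)1e18;
    vector<long long> dp(totalScaled + 1, INF);
    dp[0] = 0;
    for (int i = 0; i < n; i++)
        for (long long s = totalScaled; s >= scaled[i]; s--)
            if (dp[s - scaled[i]] != INF)
                dp[s] = min(dp[s], dp[s - scaled[i]] + wt[i]);

    // best scaled value achievable within capacity W;
    // the true (unscaled) value of that item set is >= K * best
    long long best = 0;
    for (long long s = totalScaled; s >= 0; s--)
        if (dp[s] <= W) { best = s; break; }
    return best;
}
```

The design choice: DP over *values* (min weight per value) rather than weights, because it is the value axis we are allowed to round.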