Theory & Mathematics

NP-Complete Problems (TSP, Knapsack, etc.)

NP-Complete Problems

Interview Significance: Recognizing NP-complete problems is critical. If you spend 30 minutes trying to find a polynomial algorithm for a problem that is NP-complete, you have wasted your interview. Instead, you should immediately pivot to approximation algorithms, exponential solutions with pruning, or pseudo-polynomial DP.


Complexity Classes — The Big Picture

P (Polynomial Time)

Problems solvable by a deterministic Turing machine in polynomial time: O(n), O(n^2), O(n^3), O(n log n), etc.

Examples: sorting, shortest path (Dijkstra), minimum spanning tree (Kruskal/Prim), bipartite matching, linear programming, maximum flow.

These are the "easy" problems — we have efficient algorithms for them.

NP (Nondeterministic Polynomial Time)

Problems where a given solution (called a "certificate") can be verified in polynomial time.

Key insight: Every problem in P is also in NP. If you can solve a problem in polynomial time, you can certainly verify a solution in polynomial time (just solve it and compare).

The billion-dollar question: Is P = NP? Most computer scientists believe P != NP, but nobody has proven it. This is one of the seven Millennium Prize Problems ($1,000,000 bounty).

Intuition: Think of a Sudoku puzzle. It's hard to solve (you have to search), but if someone gives you a completed grid, you can quickly verify it's correct by checking rows, columns, and boxes. That's NP in a nutshell.
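To make "verification is easy" concrete, here is a hypothetical verifier (a sketch, not from the text) for a Hamiltonian-cycle certificate: checking a proposed tour takes O(n²), even though *finding* one is NP-complete.

```cpp
#include <vector>
using namespace std;

// Verify a certificate for the Hamiltonian cycle problem:
// `perm` claims to be an ordering of all n vertices forming a cycle in `adj`.
// Runs in polynomial time, even though finding such a cycle is NP-hard.
bool verifyHamiltonianCycle(const vector<vector<bool>>& adj,
                            const vector<int>& perm) {
    int n = adj.size();
    if ((int)perm.size() != n) return false;

    vector<bool> seen(n, false);
    for (int v : perm) {
        if (v < 0 || v >= n || seen[v]) return false; // not a permutation
        seen[v] = true;
    }
    for (int i = 0; i < n; i++) {
        // consecutive cities (wrapping around) must be connected
        if (!adj[perm[i]][perm[(i + 1) % n]]) return false;
    }
    return true;
}
```

Any NP problem has such a polynomial-time verifier; the hard part is producing the certificate in the first place.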

NP-Hard

At least as hard as the hardest problems in NP. Formally: a problem H is NP-hard if every problem in NP can be reduced to H in polynomial time.

  • NP-hard problems are not necessarily in NP (they might not even be decision problems)
  • Example: the halting problem is NP-hard but not in NP (it's undecidable)

NP-Complete

A problem that is both in NP and NP-hard. These are the "hardest" problems within NP.

(Assuming P ≠ NP)

        ┌────────────── NP-Hard ──────────────┐
        │            Halting Problem          │
┌───────┼─────────────────────────┐           │
│  NP   │  NP-Complete:           │           │
│       │  SAT, TSP, Clique,      │           │
│       │  Vertex Cover           │           │
│       └─────────────────────────┼───────────┘
│  ┌────────── P ──────────┐      │
│  │  Sorting, Dijkstra    │      │
│  └───────────────────────┘      │
└─────────────────────────────────┘

Why the diagram matters: If P = NP, the three inner regions collapse into one. If P != NP (the consensus), then P is strictly inside NP, and NP-Complete problems sit at the boundary of NP and NP-Hard.

Why Does This Matter for Interviews?

You need to recognize NP-complete problems because:

  1. You won't find a polynomial-time exact solution — don't waste time trying
  2. You should propose practical approaches: approximation algorithms, exponential solutions with good pruning, or pseudo-polynomial DP
  3. Interviewers may test if you recognize the problem class — saying "this reduces to Set Cover, which is NP-complete" demonstrates strong CS fundamentals
  4. Constraint analysis becomes critical: n <= 20 suggests bitmask DP, n <= 40 suggests meet-in-the-middle, large n suggests greedy/approximation

How to Prove a Problem is NP-Complete

The standard two-step process:

Step 1: Show the problem is in NP

Demonstrate that given a candidate solution (certificate), you can verify it in polynomial time.

Step 2: Show the problem is NP-Hard via reduction

Reduce a known NP-complete problem to your problem in polynomial time.

Reductions — The Key Concept

Reduction: Problem A ≤p Problem B

  Instance of A ──(polynomial transformation)──▶ Instance of B
        │                                              │
      solve                                          solve
        ▼                                              ▼
      YES/NO  ◀──────────── same answer ──────────▶  YES/NO

To prove B is NP-complete:
  1. Show B ∈ NP
  2. Pick a known NP-complete problem A
  3. Show A ≤p B
  4. The transformation preserves YES/NO answers
  5. The transformation runs in polynomial time

The Chain of Reductions

Cook-Levin theorem proved SAT is NP-complete. Then reductions built the chain:

SAT → 3-SAT ─┬→ CLIQUE → VERTEX COVER → INDEPENDENT SET
             ├→ SUBSET SUM
             ├→ 3-COLORING → k-COLORING
             └→ HAMILTONIAN CYCLE → TSP

VERTEX COVER → SET COVER

Each arrow means "reduces to." Every new NP-complete problem was proven by reducing a known one to it.


The Famous NP-Complete Problems

1. Boolean Satisfiability (SAT) — The First NP-Complete Problem

Problem: Given a boolean formula in conjunctive normal form (CNF), is there an assignment of variables that makes it TRUE?

CNF: A conjunction (AND) of clauses, where each clause is a disjunction (OR) of literals.

Formula: (x₁ ∨ ¬x₂) ∧ (¬x₁ ∨ x₃) ∧ (x₂ ∨ ¬x₃)

Variables: x₁, x₂, x₃ ∈ {TRUE, FALSE}

Try x₁=T, x₂=T, x₃=T:
  Clause 1: (T ∨ F) = T  ✓
  Clause 2: (F ∨ T) = T  ✓
  Clause 3: (T ∨ F) = T  ✓
  Result: T ∧ T ∧ T = TRUE    → SATISFIABLE!

Variants:

  • 2-SAT: Each clause has exactly 2 literals → solvable in polynomial time (implication graph + SCC)
  • 3-SAT: Each clause has exactly 3 literals → NP-complete
  • k-SAT (k >= 3): NP-complete
  • Horn-SAT: At most one positive literal per clause → polynomial time

Why SAT matters: The Cook-Levin theorem (1971) proved SAT was the first NP-complete problem. Every other NP-complete proof ultimately chains back to SAT.
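The O(2^n) baseline is easy to make concrete. Below is a brute-force CNF checker (a sketch; the literal encoding — +k for xₖ, -k for ¬xₖ — is an assumption of this example):

```cpp
#include <vector>
#include <cstdlib>
using namespace std;

// Brute-force CNF-SAT: try all 2^n assignments, O(2^n * |formula|).
// Literal +k means variable x_k, literal -k means ¬x_k (1-indexed).
bool bruteForceSAT(int n, const vector<vector<int>>& clauses) {
    for (int mask = 0; mask < (1 << n); mask++) {
        bool allClausesTrue = true;
        for (const auto& clause : clauses) {
            bool clauseTrue = false;
            for (int lit : clause) {
                int var = abs(lit) - 1;
                bool val = (mask >> var) & 1;       // bit var of mask = x_var
                if ((lit > 0) == val) { clauseTrue = true; break; }
            }
            if (!clauseTrue) { allClausesTrue = false; break; }
        }
        if (allClausesTrue) return true;
    }
    return false;
}
```

For the formula above, `bruteForceSAT(3, {{1,-2},{-1,3},{2,-3}})` finds a satisfying assignment.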

2-SAT: The Polynomial Solvable Case

Since 2-SAT comes up in competitive programming, here's the solution:

Idea: Each clause (a ∨ b) is equivalent to two implications:
  (¬a → b) and (¬b → a)

Build an implication graph. If x and ¬x are in the same SCC,
the formula is unsatisfiable.
CPP
// 2-SAT solver
// Variables 0..n-1. Literal x is 2*x, literal ¬x is 2*x+1.
struct TwoSAT {
    int n;
    vector<vector<int>> adj, radj;
    vector<int> order, comp;
    vector<bool> vis;

    TwoSAT(int n) : n(n), adj(2*n), radj(2*n), comp(2*n), vis(2*n) {}

    // Add clause (a ∨ b). Pass literals, not variables.
    // For variable x: true literal = 2*x, false literal = 2*x+1
    void addClause(int a, int b) {
        adj[a ^ 1].push_back(b);    // ¬a → b
        adj[b ^ 1].push_back(a);    // ¬b → a
        radj[b].push_back(a ^ 1);
        radj[a].push_back(b ^ 1);
    }

    void dfs1(int u) {
        vis[u] = true;
        for (int v : adj[u]) if (!vis[v]) dfs1(v);
        order.push_back(u);
    }

    void dfs2(int u, int c) {
        comp[u] = c;
        for (int v : radj[u]) if (comp[v] == -1) dfs2(v, c);
    }

    // Returns true if satisfiable, fills `values` with assignment
    bool solve(vector<bool>& values) {
        fill(vis.begin(), vis.end(), false);
        for (int i = 0; i < 2*n; i++) if (!vis[i]) dfs1(i);

        fill(comp.begin(), comp.end(), -1);
        int c = 0;
        for (int i = 2*n - 1; i >= 0; i--)
            if (comp[order[i]] == -1) dfs2(order[i], c++);

        values.resize(n);
        for (int i = 0; i < n; i++) {
            if (comp[2*i] == comp[2*i+1]) return false; // x and ¬x in same SCC
            values[i] = comp[2*i] > comp[2*i+1];
        }
        return true;
    }
};

Time complexity: O(V + E) — linear in the size of the formula.


2. Traveling Salesman Problem (TSP)

Problem: Given n cities and distances between every pair of cities, find the shortest tour that visits every city exactly once and returns to the starting city.

Cities and distances (complete graph — every pair of cities is connected):

  A─B: 10    A─C: 20    A─D: 15
  B─C: 35    B─D: 25    C─D: 30

Distance matrix:
      A    B    C    D
  A [ 0   10   20   15 ]
  B [ 10   0   35   25 ]
  C [ 20  35    0   30 ]
  D [ 15  25   30    0 ]

All possible tours (starting from A):
  A → B → C → D → A : 10 + 35 + 30 + 15 = 90
  A → B → D → C → A : 10 + 25 + 30 + 20 = 85  ← optimal
  A → C → B → D → A : 20 + 35 + 25 + 15 = 95
  A → C → D → B → A : 20 + 30 + 25 + 10 = 85  ← optimal (reverse)
  A → D → B → C → A : 15 + 25 + 35 + 20 = 95
  A → D → C → B → A : 15 + 30 + 35 + 10 = 90

Best tour: cost 85

Brute Force: O(n!)

Try all (n-1)!/2 distinct tours. Completely impractical for n > 12.
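The brute force can be sketched with `next_permutation`: fix city 0 as the start and enumerate the (n-1)! orderings of the remaining cities.

```cpp
#include <vector>
#include <algorithm>
#include <numeric>
#include <climits>
using namespace std;

// Brute-force TSP: fix city 0, try all (n-1)! orderings of the rest.
// Only feasible for very small n (assumes n >= 2).
int tspBruteForce(const vector<vector<int>>& dist) {
    int n = dist.size();
    vector<int> perm(n - 1);
    iota(perm.begin(), perm.end(), 1); // cities 1..n-1, sorted ascending

    int best = INT_MAX;
    do {
        int cost = dist[0][perm[0]];
        for (int i = 0; i + 1 < (int)perm.size(); i++)
            cost += dist[perm[i]][perm[i + 1]];
        cost += dist[perm.back()][0]; // return to the start
        best = min(best, cost);
    } while (next_permutation(perm.begin(), perm.end()));
    return best;
}
```

On the 4-city example above this returns 85, matching the hand-enumerated optimum.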

DP with Bitmask: O(n^2 * 2^n) — The Standard Interview Approach

Key idea: Use a bitmask to represent the set of visited cities. dp[mask][i] = minimum cost to visit exactly the cities in mask, ending at city i.

State: dp[mask][i]
  - mask: bitmask of visited cities (bit j set means city j visited)
  - i: current city (last city visited)
  - value: minimum cost to reach this state

Transition: dp[mask | (1 << v)][v] = min(dp[mask][u] + dist[u][v])
  for all unvisited cities v

Base case: dp[1][0] = 0 (start at city 0, only city 0 visited)
Answer: min over all i of (dp[ALL][i] + dist[i][0])

Visual trace for 4 cities (A=0, B=1, C=2, D=3):

Step 0: dp[0001][0] = 0          (at A, visited {A})

Step 1: From dp[0001][0]:
  → dp[0011][1] = 0 + 10 = 10   (at B, visited {A,B})
  → dp[0101][2] = 0 + 20 = 20   (at C, visited {A,C})
  → dp[1001][3] = 0 + 15 = 15   (at D, visited {A,D})

Step 2: From dp[0011][1]:
  → dp[0111][2] = 10 + 35 = 45  (at C, visited {A,B,C})
  → dp[1011][3] = 10 + 25 = 35  (at D, visited {A,B,D})
  From dp[0101][2]:
  → dp[0111][1] = 20 + 35 = 55  (at B, visited {A,B,C})
  → dp[1101][3] = 20 + 30 = 50  (at D, visited {A,C,D})
  From dp[1001][3]:
  → dp[1011][1] = 15 + 25 = 40  (at B, visited {A,B,D})
  → dp[1101][2] = 15 + 30 = 45  (at C, visited {A,C,D})

Step 3: Complete the tours (all 4 cities visited):
  dp[1111][1] = min(dp[1101][2] + dist[2][1], dp[1101][3] + dist[3][1])
              = min(45 + 35, 50 + 25) = 75
  dp[1111][2] = min(dp[1011][1] + dist[1][2], dp[1011][3] + dist[3][2])
              = min(40 + 35, 35 + 30) = 65
  dp[1111][3] = min(dp[0111][1] + dist[1][3], dp[0111][2] + dist[2][3])
              = min(55 + 25, 45 + 30) = 75

Return to A: min(75 + 10, 65 + 20, 75 + 15) = 85  ✓ matches the optimal tour.
CPP
int tsp(vector<vector<int>>& dist) {
    int n = dist.size();
    int ALL = (1 << n) - 1;

    // dp[mask][i] = min cost to visit cities in mask, ending at city i
    vector<vector<int>> dp(1 << n, vector<int>(n, INT_MAX));

    dp[1][0] = 0; // start at city 0

    for (int mask = 1; mask < (1 << n); mask++) {
        for (int u = 0; u < n; u++) {
            if (dp[mask][u] == INT_MAX) continue;
            if (!(mask & (1 << u))) continue; // u must be in mask

            for (int v = 0; v < n; v++) {
                if (mask & (1 << v)) continue; // v already visited
                int newMask = mask | (1 << v);
                dp[newMask][v] = min(dp[newMask][v],
                                     dp[mask][u] + dist[u][v]);
            }
        }
    }

    // Find min cost tour: visit all cities and return to start
    int ans = INT_MAX;
    for (int u = 0; u < n; u++) {
        if (dp[ALL][u] != INT_MAX) {
            ans = min(ans, dp[ALL][u] + dist[u][0]);
        }
    }
    return ans;
}

Complexity:

  • Time: O(n^2 * 2^n) — for each of 2^n masks and n cities, we try n transitions
  • Space: O(n * 2^n) — the DP table
  • Practical limit: n ~ 20 (2^20 * 20^2 = ~400 million operations)

TSP Approximation Algorithms

For metric TSP (triangle inequality holds: dist(a,c) <= dist(a,b) + dist(b,c)):

1. MST-based 2-approximation:

1. Build MST of the complete graph
2. Double every edge (creates an Eulerian graph)
3. Find Eulerian tour
4. Shortcut repeated vertices

       MST:          Double edges:    Euler tour:  A→B→A→C→D→C→A
     A─B             A═B              Shortcut:    A→B→C→D→A
     |               ║
     C─D             C═D

  Cost ≤ 2 × optimal TSP
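The four steps can be sketched as follows (a sketch assuming the input is a complete distance matrix; Prim's O(n²) MST plus a DFS preorder performs the doubling-and-shortcutting implicitly, since a preorder walk visits vertices in exactly the shortcut order of the doubled Euler tour):

```cpp
#include <vector>
#include <climits>
#include <functional>
using namespace std;

// MST-based 2-approximation for metric TSP (sketch).
// Cost of the returned tour <= 2 * MST weight <= 2 * optimal tour.
int tspTwoApprox(const vector<vector<int>>& dist) {
    int n = dist.size();

    // Prim's algorithm, O(n^2), on the complete graph
    vector<int> parent(n, -1), minEdge(n, INT_MAX);
    vector<bool> inMST(n, false);
    minEdge[0] = 0;
    for (int it = 0; it < n; it++) {
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!inMST[v] && (u == -1 || minEdge[v] < minEdge[u])) u = v;
        inMST[u] = true;
        for (int v = 0; v < n; v++)
            if (!inMST[v] && dist[u][v] < minEdge[v]) {
                minEdge[v] = dist[u][v];
                parent[v] = u;
            }
    }

    // DFS preorder of the MST = the shortcut tour
    vector<vector<int>> tree(n);
    for (int v = 1; v < n; v++) tree[parent[v]].push_back(v);
    vector<int> order;
    function<void(int)> dfs = [&](int u) {
        order.push_back(u);
        for (int v : tree[u]) dfs(v);
    };
    dfs(0);

    int cost = 0;
    for (int i = 0; i + 1 < n; i++) cost += dist[order[i]][order[i + 1]];
    cost += dist[order.back()][0];
    return cost;
}
```

The guarantee only holds under the triangle inequality; on non-metric inputs the tour is still valid but may be worse than 2× optimal.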

2. Christofides algorithm: 1.5-approximation

1. Build MST
2. Find minimum weight perfect matching on odd-degree vertices
3. Combine MST + matching (Eulerian graph)
4. Find Eulerian tour and shortcut

3. Nearest neighbor heuristic (greedy, no guarantee):

CPP
int nearestNeighborTSP(vector<vector<int>>& dist) {
    int n = dist.size();
    vector<bool> visited(n, false);
    int curr = 0, cost = 0;
    visited[0] = true;

    for (int step = 1; step < n; step++) {
        int best = -1, bestDist = INT_MAX;
        for (int j = 0; j < n; j++) {
            if (!visited[j] && dist[curr][j] < bestDist) {
                bestDist = dist[curr][j];
                best = j;
            }
        }
        visited[best] = true;
        cost += bestDist;
        curr = best;
    }
    cost += dist[curr][0]; // return to start
    return cost;
}

3. Knapsack Problem

Problem: Given n items, each with weight w_i and value v_i, and a knapsack with capacity W, maximize the total value of items that fit.

Items:                          Knapsack capacity W = 8
  Item 1: w=2, v=6
  Item 2: w=3, v=10
  Item 3: w=4, v=12
  Item 4: w=5, v=15

Example pick {1, 3}: value = 6 + 12 = 18, weight = 2 + 4 = 6 ≤ 8. But is there better?
  1+2: w=5, v=16    1+3: w=6, v=18    1+4: w=7, v=21
  2+3: w=7, v=22    2+4: w=8, v=25  ← BEST! Exactly fits.

Why Knapsack is NP-Complete (Despite Having a "Polynomial" DP)

The DP runs in O(nW), which looks polynomial. But W is a number, not a count. The input size is log₂(W) bits to represent W. So O(nW) = O(n * 2^(bits of W)), which is exponential in the input size. This is called pseudo-polynomial time.

W = 1,000,000,000 (10 digits, ~30 bits)
n = 100 items

DP table: 100 × 1,000,000,000 = 100 billion entries → IMPOSSIBLE

But if W = 1000:
DP table: 100 × 1000 = 100,000 entries → EASY

0/1 Knapsack — Bottom Up DP

dp[w] = maximum value achievable with weight capacity w

For each item i (from 0 to n-1):
  For w from W down to wt[i]:  ← reverse order prevents using item twice!
    dp[w] = max(dp[w], dp[w - wt[i]] + val[i])
CPP
// 0/1 Knapsack — O(nW) time, O(W) space
int knapsack01(vector<int>& wt, vector<int>& val, int W) {
    int n = wt.size();
    vector<int> dp(W + 1, 0);

    for (int i = 0; i < n; i++) {
        // Iterate backwards to ensure each item used at most once
        for (int w = W; w >= wt[i]; w--) {
            dp[w] = max(dp[w], dp[w - wt[i]] + val[i]);
        }
    }
    return dp[W];
}

Why iterate backwards? If we go forward, dp[w - wt[i]] might already include item i from this iteration, effectively using item i multiple times.

Forward (WRONG for 0/1):
  Item: weight=3, value=4
  dp[3] = max(dp[3], dp[0]+4) = 4    ← uses item once
  dp[6] = max(dp[6], dp[3]+4) = 8    ← uses same item AGAIN!

Backward (CORRECT):
  dp[6] = max(dp[6], dp[3]+4)  ← dp[3] is still from previous iteration
  dp[3] = max(dp[3], dp[0]+4)  ← only now we update dp[3]

Unbounded Knapsack

Each item can be used unlimited times. Just iterate forward instead of backward:

CPP
int knapsackUnbounded(vector<int>& wt, vector<int>& val, int W) {
    int n = wt.size();
    vector<int> dp(W + 1, 0);

    for (int w = 0; w <= W; w++) {
        for (int i = 0; i < n; i++) {
            if (wt[i] <= w) {
                dp[w] = max(dp[w], dp[w - wt[i]] + val[i]);
            }
        }
    }
    return dp[W];
}

Knapsack with Item Recovery (Which Items to Take)

CPP
pair<int, vector<int>> knapsackWithItems(vector<int>& wt, vector<int>& val, int W) {
    int n = wt.size();
    vector<vector<int>> dp(n + 1, vector<int>(W + 1, 0));

    for (int i = 1; i <= n; i++) {
        for (int w = 0; w <= W; w++) {
            dp[i][w] = dp[i-1][w]; // don't take item i
            if (wt[i-1] <= w) {
                dp[i][w] = max(dp[i][w], dp[i-1][w - wt[i-1]] + val[i-1]);
            }
        }
    }

    // Backtrack to find which items were taken
    vector<int> items;
    int w = W;
    for (int i = n; i >= 1; i--) {
        if (dp[i][w] != dp[i-1][w]) {
            items.push_back(i - 1); // took item i-1 (0-indexed)
            w -= wt[i-1];
        }
    }

    return {dp[n][W], items};
}

4. Graph Coloring

Problem: Color the vertices of a graph with at most k colors such that no two adjacent vertices share the same color.

Graph:              2-colorable (bipartite)?
  A ─── B              A(R) ─── B(B)
  │     │              │         │
  │     │              │         │
  C ─── D              C(B) ─── D(R)
                       ✓ Yes! This is bipartite.

  Add diagonal A─D:    A(R) ─── B(B)
  A ─── B               │ \      │
  │ \   │               │   \    │
  │   \ │               │     \  │
  C ─── D              C(B) ─── D(?)
                       D is adjacent to A(R), B(B), C(B)
                       D can be R? No, A is R.
                       Need 3 colors: D(G)

Complexity:

  • k=1: Trivial — possible only if the graph has no edges
  • k=2: Bipartite check — BFS/DFS, O(V+E)
  • k>=3: NP-complete (already for k=3)

Bipartite Check (2-coloring)

CPP
bool isBipartite(vector<vector<int>>& adj) {
    int n = adj.size();
    vector<int> color(n, -1);

    for (int start = 0; start < n; start++) {
        if (color[start] != -1) continue;

        queue<int> q;
        q.push(start);
        color[start] = 0;

        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int v : adj[u]) {
                if (color[v] == -1) {
                    color[v] = color[u] ^ 1;
                    q.push(v);
                } else if (color[v] == color[u]) {
                    return false; // odd cycle found
                }
            }
        }
    }
    return true;
}

Graph Coloring via Backtracking (for small graphs)

CPP
bool canColor(vector<vector<int>>& adj, vector<int>& colors, int u, int k) {
    int n = adj.size();
    if (u == n) return true; // all vertices colored

    for (int c = 0; c < k; c++) {
        // Check if color c is valid for vertex u
        bool valid = true;
        for (int v : adj[u]) {
            if (colors[v] == c) { valid = false; break; }
        }
        if (valid) {
            colors[u] = c;
            if (canColor(adj, colors, u + 1, k)) return true;
            colors[u] = -1;
        }
    }
    return false;
}

int chromaticNumber(vector<vector<int>>& adj) {
    int n = adj.size();
    vector<int> colors(n, -1);
    for (int k = 1; k <= n; k++) {
        fill(colors.begin(), colors.end(), -1);
        if (canColor(adj, colors, 0, k)) return k;
    }
    return n; // worst case: every vertex gets unique color
}

5. Subset Sum

Problem: Given a set of integers S and a target T, is there a subset of S that sums to exactly T?

S = {3, 7, 1, 8, 4}, Target = 11

Subsets that sum to 11:
  {3, 8} = 11  ✓
  {7, 4} = 11  ✓
  {3, 1, 4} = 8  ✗
  {3, 7, 1} = 11  ✓

DP Solution: O(n * T) Pseudo-Polynomial

dp[j] = true if we can form sum j using some subset of items seen so far

For each item s in S:
  For j from T down to s:
    dp[j] = dp[j] || dp[j - s]

Base case: dp[0] = true (empty subset sums to 0)
CPP
bool subsetSum(vector<int>& arr, int target) {
    vector<bool> dp(target + 1, false);
    dp[0] = true;

    for (int x : arr) {
        for (int j = target; j >= x; j--) {
            dp[j] = dp[j] || dp[j - x];
        }
    }
    return dp[target];
}

Bitset Optimization: O(n * T / 64)

CPP
bool subsetSumBitset(vector<int>& arr, int target) {
    bitset<100001> dp; // fixed compile-time size: supports targets up to 100,000
    dp[0] = 1;

    for (int x : arr) {
        dp |= (dp << x); // shift left by x and OR
    }
    return dp[target];
}

Intuition: The bitset represents all achievable sums. Shifting left by x adds x to every achievable sum. OR-ing with the old set keeps previously achievable sums.

Partition Problem (Special Case of Subset Sum)

Problem: Can we partition an array into two subsets with equal sum?

CPP
bool canPartition(vector<int>& nums) {
    int sum = 0;
    for (int x : nums) sum += x;
    if (sum % 2 != 0) return false;

    int target = sum / 2;
    return subsetSum(nums, target);
}

6. Vertex Cover

Problem: Find the minimum set of vertices such that every edge has at least one endpoint in the set.

Graph (edges: AB, AC, BD, BE, CD):     Vertex Cover = {B, C}

  A ─── B
  │     │\            B covers edges AB, BD, BE
  │     │ \           C covers edges AC, CD
  C ─── D  E

  Every edge is "covered" by at least one vertex in {B, C}.
  Is this minimum?
  Try size 1: no single vertex touches all five edges.
  So the minimum vertex cover has size 2.

Facts:

  • NP-complete for general graphs
  • Polynomial for bipartite graphs (Konig's theorem: min vertex cover = max matching)
  • 2-approximation exists: greedily pick both endpoints of uncovered edges
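For the bipartite case, König's theorem can be made concrete: compute a maximum matching, and its size equals the minimum vertex cover. Below is a sketch using Kuhn's augmenting-path algorithm (the `BipartiteMatching` struct and the explicit left/right split are illustrative choices, not from the text):

```cpp
#include <vector>
using namespace std;

// Kuhn's augmenting-path algorithm, O(V * E): maximum matching in a
// bipartite graph. By Konig's theorem the matching size equals the
// minimum vertex cover size. adj[u] lists right-side neighbors of
// left-side vertex u.
struct BipartiteMatching {
    vector<vector<int>> adj;
    vector<int> matchRight;   // matchRight[v] = left vertex matched to v, or -1
    vector<bool> visited;

    BipartiteMatching(int nLeft, int nRight)
        : adj(nLeft), matchRight(nRight, -1) {}

    bool tryKuhn(int u) {
        for (int v : adj[u]) {
            if (visited[v]) continue;
            visited[v] = true;
            // v is free, or its partner can be re-matched elsewhere
            if (matchRight[v] == -1 || tryKuhn(matchRight[v])) {
                matchRight[v] = u;
                return true;
            }
        }
        return false;
    }

    int maxMatching() { // == min vertex cover size (Konig's theorem)
        int result = 0;
        for (int u = 0; u < (int)adj.size(); u++) {
            visited.assign(matchRight.size(), false);
            if (tryKuhn(u)) result++;
        }
        return result;
    }
};
```

Hopcroft-Karp improves this to O(E√V), but Kuhn's version is the one worth memorizing for interviews.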

2-Approximation for Vertex Cover

CPP
vector<int> approxVertexCover(int n, vector<pair<int,int>>& edges) {
    vector<bool> inCover(n, false);
    vector<bool> edgeCovered(edges.size(), false);

    for (int i = 0; i < edges.size(); i++) {
        if (!edgeCovered[i]) {
            auto [u, v] = edges[i];
            if (!inCover[u] && !inCover[v]) {
                inCover[u] = inCover[v] = true;
                // Mark all edges incident to u or v as covered
                for (int j = i; j < edges.size(); j++) {
                    if (edges[j].first == u || edges[j].first == v ||
                        edges[j].second == u || edges[j].second == v) {
                        edgeCovered[j] = true;
                    }
                }
            }
        }
    }

    vector<int> cover;
    for (int i = 0; i < n; i++)
        if (inCover[i]) cover.push_back(i);
    return cover;
}

Why 2-approximation? Every edge we pick has both endpoints added, but at least one must be in the optimal solution. So we use at most 2x the optimal.


7. Hamiltonian Path/Cycle

Problem: Find a path (or cycle) that visits every vertex exactly once.

Hamiltonian vs Eulerian — DON'T CONFUSE THEM!

  Hamiltonian: visit every VERTEX exactly once
    • NP-complete — no known efficient algorithm
    • Example cycle: A → B → C → D → A

  Eulerian: visit every EDGE exactly once
    • Polynomial time! Hierholzer's algorithm, O(V+E)
    • Exists iff (in a connected graph):
        Path:  exactly 0 or 2 odd-degree vertices
        Cycle: all vertices have even degree
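Since the Eulerian side is polynomial, here is a minimal Hierholzer sketch (it assumes an Eulerian circuit exists — connected graph, all degrees even — and the edge-list representation is an illustrative choice):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Hierholzer's algorithm, O(V + E): Eulerian circuit in an undirected
// multigraph, assuming one exists (connected, all degrees even).
// Adjacency stores edge ids so each edge is consumed exactly once.
vector<int> eulerianCircuit(int n, const vector<pair<int,int>>& edges) {
    vector<vector<pair<int,int>>> adj(n); // (neighbor, edge id)
    for (int i = 0; i < (int)edges.size(); i++) {
        adj[edges[i].first].push_back({edges[i].second, i});
        adj[edges[i].second].push_back({edges[i].first, i});
    }

    vector<bool> usedEdge(edges.size(), false);
    vector<int> st = {0}, circuit;
    while (!st.empty()) {
        int u = st.back();
        // discard edges already traversed from the other endpoint
        while (!adj[u].empty() && usedEdge[adj[u].back().second])
            adj[u].pop_back();
        if (adj[u].empty()) {
            circuit.push_back(u); // dead end: emit vertex
            st.pop_back();
        } else {
            auto [v, id] = adj[u].back();
            adj[u].pop_back();
            usedEdge[id] = true;  // consume the edge, walk forward
            st.push_back(v);
        }
    }
    reverse(circuit.begin(), circuit.end());
    return circuit; // E+1 vertices, starts and ends at vertex 0
}
```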

Hamiltonian Path via Bitmask DP: O(n^2 * 2^n)

CPP
// Does a Hamiltonian path exist? (visit all vertices exactly once)
bool hamiltonianPath(vector<vector<int>>& adj) {
    int n = adj.size();
    // dp[mask][i] = can we visit exactly the vertices in mask, ending at i?
    vector<vector<bool>> dp(1 << n, vector<bool>(n, false));

    // Base: single vertex
    for (int i = 0; i < n; i++)
        dp[1 << i][i] = true;

    for (int mask = 1; mask < (1 << n); mask++) {
        for (int u = 0; u < n; u++) {
            if (!dp[mask][u]) continue;
            if (!(mask & (1 << u))) continue;

            for (int v : adj[u]) {
                if (mask & (1 << v)) continue;
                dp[mask | (1 << v)][v] = true;
            }
        }
    }

    int ALL = (1 << n) - 1;
    for (int i = 0; i < n; i++)
        if (dp[ALL][i]) return true;
    return false;
}

8. Clique Problem

Problem: Find the largest complete subgraph (every pair of vertices in the subgraph is connected by an edge).

Graph (edges: A─B, A─D, B─C, B─D, C─D):

  A ─── B
  │    /│
  │   / │
  │  /  │
  │ /   │
  D ─── C

Largest cliques (size 3):
  {A, B, D}: A─B ✓, A─D ✓, B─D ✓
  {B, C, D}: B─C ✓, B─D ✓, C─D ✓

{A, B, C, D} is NOT a clique — edge A─C is missing.

Relationship to other problems:

  • Clique in graph G = Independent Set in complement of G
  • Independent Set = Vertex Cover complement (V \ vertex_cover = independent_set)

Bron-Kerbosch Algorithm (Finding All Maximal Cliques)

CPP
// Bron-Kerbosch with pivot — standard algorithm for maximal cliques
void bronKerbosch(vector<vector<bool>>& adj, vector<int>& R,
                  vector<int>& P, vector<int>& X,
                  vector<vector<int>>& cliques) {
    if (P.empty() && X.empty()) {
        cliques.push_back(R); // R is a maximal clique
        return;
    }

    // Choose pivot vertex from P ∪ X that maximizes |P ∩ N(pivot)|
    int pivot = -1, maxDeg = -1;
    for (int u : P) {
        int deg = 0;
        for (int v : P) if (adj[u][v]) deg++;
        if (deg > maxDeg) { maxDeg = deg; pivot = u; }
    }
    for (int u : X) {
        int deg = 0;
        for (int v : P) if (adj[u][v]) deg++;
        if (deg > maxDeg) { maxDeg = deg; pivot = u; }
    }

    // Iterate over P \ N(pivot)
    vector<int> candidates;
    for (int v : P)
        if (pivot == -1 || !adj[pivot][v]) candidates.push_back(v);

    for (int v : candidates) {
        // New P: P ∩ N(v)
        vector<int> newP, newX;
        for (int u : P) if (adj[v][u]) newP.push_back(u);
        for (int u : X) if (adj[v][u]) newX.push_back(u);

        R.push_back(v);
        bronKerbosch(adj, R, newP, newX, cliques);
        R.pop_back();

        P.erase(find(P.begin(), P.end(), v));
        X.push_back(v);
    }
}

9. Set Cover

Problem: Given a universe U and a collection of sets S₁, S₂, ..., Sₘ, find the minimum number of sets whose union equals U.

Universe U = {1, 2, 3, 4, 5, 6, 7}
Sets:
  S₁ = {1, 2, 3}
  S₂ = {2, 4}
  S₃ = {3, 4, 5}
  S₄ = {4, 5, 6, 7}
  S₅ = {6, 7}

Greedy: pick the set covering the most uncovered elements
  Pick S₄ = {4,5,6,7}  → uncovered = {1,2,3}
  Pick S₁ = {1,2,3}    → uncovered = {} DONE!

Answer: 2 sets (S₁, S₄) — this happens to be optimal here.

Greedy Set Cover: O(ln(n))-approximation

CPP
vector<int> greedySetCover(int n, vector<vector<int>>& sets) {
    vector<bool> covered(n + 1, false);
    int uncovered = n;
    vector<int> chosen;
    vector<bool> used(sets.size(), false);

    while (uncovered > 0) {
        // Find the set that covers the most uncovered elements
        int best = -1, bestCount = 0;
        for (int i = 0; i < sets.size(); i++) {
            if (used[i]) continue;
            int count = 0;
            for (int elem : sets[i])
                if (!covered[elem]) count++;
            if (count > bestCount) {
                bestCount = count;
                best = i;
            }
        }

        if (best == -1) break; // can't cover more

        used[best] = true;
        chosen.push_back(best);
        for (int elem : sets[best]) {
            if (!covered[elem]) {
                covered[elem] = true;
                uncovered--;
            }
        }
    }
    return chosen;
}

Approximation ratio: ln(n) + 1 where n = |U|. This is the best possible unless P = NP.


Recognizing NP-Complete Problems in Disguise

The interviewer will never say "solve this NP-complete problem." They disguise it:

Disguise                                    Real Problem
"Visit all cities cheaply"                  TSP
"Pack items in a suitcase"                  Knapsack
"Split array into two equal parts"          Partition / Subset Sum
"Color a schedule with min slots"           Graph Coloring
"Cover all skills with min people"          Set Cover
"Find the longest simple path"              Hamiltonian Path
"Find the largest friend group"             Clique
"Assign tasks with constraints"             SAT / CSP
"Schedule jobs on machines"                 Job Scheduling (NP-complete)
"Minimum routers to cover network"          Vertex Cover
"Longest common subsequence (3+ strings)"   NP-hard
"Exact cover"                               Exact Cover (NP-complete)

Red Flags That Suggest NP-Completeness

  1. The problem asks for an optimal solution over all subsets or all permutations
  2. The problem has a "select or don't select" structure for each element
  3. Small constraints (n ≤ 20) — the problem setter knows it's exponential
  4. The problem seems like it should be easy but you can't find a polynomial solution
  5. There's no obvious greedy strategy that works, and DP states blow up

What to Do in an Interview When You Recognize NP-Completeness

Step-by-step response strategy:

1. State it clearly:

"This problem is a variant of [known NP-complete problem], which means there's no known polynomial-time algorithm."

2. Ask about constraints:

  • n ≤ 15-20 → bitmask DP, O(2^n * n) or O(2^n * n^2)
  • n ≤ 40 → meet-in-the-middle, O(2^(n/2) * n)
  • n is large → greedy approximation, or pseudo-polynomial DP if applicable

3. Propose the right approach based on constraints:

n ≤ 10:     Brute force / backtracking with pruning
n ≤ 20:     Bitmask DP — O(2^n * poly(n))
n ≤ 40:     Meet in the middle — O(2^(n/2) * poly(n))
n ≤ 10^3:   Pseudo-polynomial DP (if applicable, e.g., knapsack with small W)
n ≤ 10^6:   Greedy approximation, heuristics

Practical Solutions for Interview Problems

Meet in the Middle — O(2^(n/2))

Idea: Split the input into two halves. Enumerate all 2^(n/2) subsets of each half. Combine results intelligently.

Input: n = 36 elements
  Brute force:      2^36 ≈ 69 billion subsets   (TOO SLOW)
  Meet in middle:   2 × 2^18 ≈ 524K subsets     (FAST!)

  First 18 elements                Last 18 elements
  generate all 2^18 sums           generate all 2^18 sums
            └─────────────┬─────────────┘
                       Combine:
  for each sum s in the first half, check whether
  (target - s) exists in the second half (binary search)
CPP
// Subset Sum with n up to 40
bool subsetSumMeetInMiddle(vector<int>& arr, long long target) {
    int n = arr.size();
    int half = n / 2;
    int rest = n - half;

    // Generate all subset sums of first half
    vector<long long> firstSums;
    for (int mask = 0; mask < (1 << half); mask++) {
        long long sum = 0;
        for (int i = 0; i < half; i++)
            if (mask & (1 << i)) sum += arr[i];
        firstSums.push_back(sum);
    }
    sort(firstSums.begin(), firstSums.end());

    // For each subset sum of second half, binary search in first half
    for (int mask = 0; mask < (1 << rest); mask++) {
        long long sum = 0;
        for (int i = 0; i < rest; i++)
            if (mask & (1 << i)) sum += arr[half + i];

        long long need = target - sum;
        if (binary_search(firstSums.begin(), firstSums.end(), need))
            return true;
    }
    return false;
}

Time: O(2^(n/2) * n) for generation + O(2^(n/2) * log(2^(n/2))) for search = O(n * 2^(n/2))

Backtracking with Pruning

For many NP-complete problems, smart pruning makes brute force feasible for moderate inputs.

CPP
// Maximum Clique via backtracking with pruning
int maxClique;

void findClique(vector<vector<bool>>& adj, vector<int>& clique,
                vector<int>& candidates, int n) {
    if (candidates.empty()) {
        maxClique = max(maxClique, (int)clique.size());
        return;
    }

    for (int i = 0; i < candidates.size(); i++) {
        // Pruning: even if we add all remaining candidates, can we beat best?
        if ((int)clique.size() + (int)candidates.size() - i <= maxClique)
            return; // bound: can't improve

        int v = candidates[i];

        // New candidates: only vertices adjacent to v
        vector<int> newCandidates;
        for (int j = i + 1; j < candidates.size(); j++) {
            if (adj[v][candidates[j]])
                newCandidates.push_back(candidates[j]);
        }

        clique.push_back(v);
        findClique(adj, clique, newCandidates, n);
        clique.pop_back();
    }
}

Branch and Bound (Knapsack)

CPP
struct Item { int weight, value; double ratio; };

int bestValue;

// NOTE: items must be sorted by value/weight ratio in descending order
// before calling, otherwise the fractional upper bound below is invalid.
void knapsackBnB(vector<Item>& items, int idx, int currWeight,
                 int currValue, int W) {
    if (currWeight > W) return;
    bestValue = max(bestValue, currValue);
    if (idx == items.size()) return;

    // Upper bound: fractional relaxation of remaining items
    double bound = currValue;
    int remWeight = W - currWeight;
    for (int i = idx; i < items.size() && remWeight > 0; i++) {
        if (items[i].weight <= remWeight) {
            bound += items[i].value;
            remWeight -= items[i].weight;
        } else {
            bound += items[i].ratio * remWeight;
            break;
        }
    }

    if (bound <= bestValue) return; // pruning!

    // Take item idx
    knapsackBnB(items, idx + 1, currWeight + items[idx].weight,
                currValue + items[idx].value, W);
    // Skip item idx
    knapsackBnB(items, idx + 1, currWeight, currValue, W);
}

Summary: Decision Tree for NP-Complete Problems in Interviews

Recognize the problem type
  │
  ├─ NOT NP-complete/NP-hard → use the standard algorithm
  │
  └─ NP-complete/NP-hard → check constraints:
       n ≤ 20         → Bitmask DP
       n ≤ 40         → Meet in the middle
       n ≤ 100        → Backtracking + pruning
       values small   → Pseudo-polynomial DP
       n large        → Greedy / approximation

Key takeaway: The ability to recognize NP-complete problems and immediately pivot to the right approach (based on constraints) is what separates strong candidates from average ones. Never spend an interview trying to find a polynomial algorithm for an NP-complete problem.


Quick Reference: NP-Complete Problems Cheat Sheet

Problem                  Best Exact           Best Approx               Reduction From
SAT                      O(2^n)               MAX-SAT: 3/4-approx       (Cook-Levin)
3-SAT                    O(1.33^n)            7/8-approx                SAT
TSP                      O(n^2 * 2^n)         1.5x (Christofides)       Hamiltonian Cycle
Knapsack                 O(nW) pseudo-poly    FPTAS exists              Subset Sum
Graph Coloring (k>=3)    O(k^n) backtrack     No constant approx        3-SAT
Subset Sum               O(nT) pseudo-poly    FPTAS exists              SAT
Vertex Cover             O(2^k * n)           2-approx                  Independent Set
Hamiltonian Path         O(n^2 * 2^n)         —                         3-SAT
Clique                   O(2^n) backtrack     No constant approx        Independent Set
Set Cover                O(2^m * n)           ln(n)-approx              Vertex Cover

FPTAS = Fully Polynomial-Time Approximation Scheme: for any epsilon > 0, gets within (1+epsilon) factor in polynomial time.
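As a concrete illustration, here is a sketch of the classic value-scaling FPTAS for knapsack (the function name, the K < 1 fallback to an exact DP, and returning the best *scaled* value are assumptions of this sketch — when no scaling is needed the result equals the exact optimum):

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// Value-scaling FPTAS sketch for 0/1 knapsack.
// Scale values down by K = eps * vmax / n, then run the min-weight DP
// over scaled values: time is polynomial in n and 1/eps, and the chosen
// set's true value is at least (1 - eps) * optimal.
long long knapsackFPTAS(const vector<int>& wt, const vector<int>& val,
                        int W, double eps) {
    int n = wt.size();
    int vmax = *max_element(val.begin(), val.end());
    double K = eps * vmax / n;
    if (K < 1) K = 1; // values already small — this degenerates to exact DP

    vector<long long> scaled(n);
    long long totalScaled = 0;
    for (int i = 0; i < n; i++) {
        scaled[i] = (long long)(val[i] / K);
        totalScaled += scaled[i];
    }

    // dp[s] = minimum weight needed to reach scaled value exactly s
    const long long INF = (long long)1e18;
    vector<long long> dp(totalScaled + 1, INF);
    dp[0] = 0;
    for (int i = 0; i < n; i++)
        for (long long s = totalScaled; s >= scaled[i]; s--)
            if (dp[s - scaled[i]] != INF)
                dp[s] = min(dp[s], dp[s - scaled[i]] + wt[i]);

    // best scaled value achievable within capacity W;
    // the true (unscaled) value of that item set is >= K * best
    long long best = 0;
    for (long long s = totalScaled; s >= 0; s--)
        if (dp[s] <= W) { best = s; break; }
    return best;
}
```

The design choice: DP over *values* (min weight per value) rather than weights, because it is the value axis we are allowed to round.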