NP-Complete Problems (TSP, Knapsack, etc.)
Interview Significance: Recognizing NP-complete problems is critical. If you spend 30 minutes hunting for a polynomial algorithm for a problem that is NP-complete, you have wasted the interview. Instead, pivot immediately to approximation algorithms, exponential search with pruning, or pseudo-polynomial DP.
Complexity Classes — The Big Picture
P (Polynomial Time)
Problems solvable by a deterministic Turing machine in polynomial time: O(n), O(n^2), O(n^3), O(n log n), etc.
Examples: sorting, shortest path (Dijkstra), minimum spanning tree (Kruskal/Prim), bipartite matching, linear programming, maximum flow.
These are the "easy" problems — we have efficient algorithms for them.
NP (Nondeterministic Polynomial Time)
Problems where a given solution (called a "certificate") can be verified in polynomial time.
Key insight: Every problem in P is also in NP. If you can solve a problem in polynomial time, you can certainly verify a solution in polynomial time (just solve it and compare).
The billion-dollar question: Is P = NP? Most computer scientists believe P != NP, but nobody has proven it. This is one of the seven Millennium Prize Problems ($1,000,000 bounty).
Intuition: Think of a Sudoku puzzle. It's hard to solve (you have to search), but if someone gives you a completed grid, you can quickly verify it's correct by checking rows, columns, and boxes. That's NP in a nutshell.
NP-Hard
At least as hard as the hardest problems in NP. Formally: a problem H is NP-hard if every problem in NP can be reduced to H in polynomial time.
- NP-hard problems are not necessarily in NP (they might not even be decision problems)
- Example: the halting problem is NP-hard but not in NP (it's undecidable)
NP-Complete
A problem that is both NP and NP-hard. These are the "hardest" problems within NP.
How the classes relate: If P = NP, then P, NP, and NP-Complete collapse into a single class. If P != NP (the consensus), then P sits strictly inside NP, and the NP-Complete problems lie at the boundary between NP and NP-Hard.
Why Does This Matter for Interviews?
You need to recognize NP-complete problems because:
- You won't find a polynomial-time exact solution — don't waste time trying
- You should propose practical approaches: approximation algorithms, exponential solutions with good pruning, or pseudo-polynomial DP
- Interviewers may test if you recognize the problem class — saying "this reduces to Set Cover, which is NP-complete" demonstrates strong CS fundamentals
- Constraint analysis becomes critical: n <= 20 suggests bitmask DP, n <= 40 suggests meet-in-the-middle, large n suggests greedy/approximation
How to Prove a Problem is NP-Complete
The standard two-step process:
Step 1: Show the problem is in NP
Demonstrate that given a candidate solution (certificate), you can verify it in polynomial time.
Step 2: Show the problem is NP-Hard via reduction
Reduce a known NP-complete problem to your problem in polynomial time.
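To make Step 1 concrete, here is a minimal verification sketch (function name and graph representation are my own, not from the text): checking a claimed Hamiltonian cycle takes O(n) time, because we only need to confirm the certificate is a permutation of the vertices and that every consecutive pair (including the wrap-around) is an edge.

```cpp
#include <vector>
using namespace std;

// Verify a Hamiltonian-cycle certificate in O(n):
// `perm` claims to list all n vertices in cycle order.
bool verifyHamCycle(const vector<vector<bool>>& adj, const vector<int>& perm) {
    int n = adj.size();
    if ((int)perm.size() != n) return false;
    vector<bool> seen(n, false);
    for (int v : perm) {
        if (v < 0 || v >= n || seen[v]) return false; // not a permutation
        seen[v] = true;
    }
    for (int i = 0; i < n; i++)
        if (!adj[perm[i]][perm[(i + 1) % n]]) return false; // missing edge
    return true;
}
```

Verification is linear in the certificate size, which is exactly what membership in NP demands; finding the cycle is the hard part.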
Reductions — The Key Concept
The Chain of Reductions
Cook-Levin theorem proved SAT is NP-complete. Then reductions built the chain (each arrow means "reduces to"):
SAT → 3-SAT → Clique → Vertex Cover → Hamiltonian Cycle → TSP
3-SAT → Subset Sum → Knapsack
3-SAT → Graph 3-Coloring
Every new NP-complete problem was proven by reducing a known NP-complete problem to it.
The Famous NP-Complete Problems
1. Boolean Satisfiability (SAT) — The First NP-Complete Problem
Problem: Given a boolean formula in conjunctive normal form (CNF), is there an assignment of variables that makes it TRUE?
CNF: A conjunction (AND) of clauses, where each clause is a disjunction (OR) of literals.
Formula: (x₁ ∨ ¬x₂) ∧ (¬x₁ ∨ x₃) ∧ (x₂ ∨ ¬x₃)
Variables: x₁, x₂, x₃ ∈ {TRUE, FALSE}
Try x₁=T, x₂=T, x₃=T:
Clause 1: (T ∨ F) = T ✓
Clause 2: (F ∨ T) = T ✓
Clause 3: (T ∨ F) = T ✓
Result: T ∧ T ∧ T = TRUE → SATISFIABLE!
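Verification like the trace above is mechanical, and that is exactly what puts SAT in NP. A minimal checker sketch (the encoding is my own convention, not from the text): literal +k means x_k, literal -k means ¬x_k, with variables numbered from 1.

```cpp
#include <vector>
#include <cstdlib>
using namespace std;

// Check a CNF formula against a candidate assignment.
// assign[k] is the value of x_k (index 0 is unused).
bool satisfies(const vector<vector<int>>& clauses, const vector<bool>& assign) {
    for (const auto& clause : clauses) {
        bool clauseTrue = false;
        for (int lit : clause) {
            bool val = assign[abs(lit)];
            if (lit < 0) val = !val;          // negated literal
            if (val) { clauseTrue = true; break; }
        }
        if (!clauseTrue) return false;        // one false clause fails the AND
    }
    return true;
}
```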
Variants:
- 2-SAT: Each clause has exactly 2 literals → solvable in polynomial time (implication graph + SCC)
- 3-SAT: Each clause has exactly 3 literals → NP-complete
- k-SAT (k >= 3): NP-complete
- Horn-SAT: At most one positive literal per clause → polynomial time
Why SAT matters: The Cook-Levin theorem (1971) proved SAT was the first NP-complete problem. Every other NP-complete proof ultimately chains back to SAT.
2-SAT: The Polynomial Solvable Case
Since 2-SAT comes up in competitive programming, here's the solution:
Idea: Each clause (a ∨ b) is equivalent to two implications:
(¬a → b) and (¬b → a)
Build an implication graph. If x and ¬x are in the same SCC,
the formula is unsatisfiable.
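The clause-to-implications equivalence can be sanity-checked by brute force over all four truth assignments (helper names are mine):

```cpp
// (a ∨ b)  ≡  (¬a → b)  ≡  (¬b → a): check all four assignments.
bool implies(bool p, bool q) { return !p || q; }

bool clauseImplicationsAgree() {
    for (int a = 0; a < 2; a++)
        for (int b = 0; b < 2; b++) {
            bool A = (a != 0), B = (b != 0);
            bool clause = A || B;
            if (clause != implies(!A, B)) return false;
            if (clause != implies(!B, A)) return false;
        }
    return true;
}
```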
// 2-SAT solver
// Variables 0..n-1. Literal x is 2*x, literal ¬x is 2*x+1.
struct TwoSAT {
int n;
vector<vector<int>> adj, radj;
vector<int> order, comp;
vector<bool> vis;
TwoSAT(int n) : n(n), adj(2*n), radj(2*n), comp(2*n), vis(2*n) {}
// Add clause (a ∨ b). Pass literals, not variables.
// For variable x: true literal = 2*x, false literal = 2*x+1
void addClause(int a, int b) {
adj[a ^ 1].push_back(b); // ¬a → b
adj[b ^ 1].push_back(a); // ¬b → a
radj[b].push_back(a ^ 1);
radj[a].push_back(b ^ 1);
}
void dfs1(int u) {
vis[u] = true;
for (int v : adj[u]) if (!vis[v]) dfs1(v);
order.push_back(u);
}
void dfs2(int u, int c) {
comp[u] = c;
for (int v : radj[u]) if (comp[v] == -1) dfs2(v, c);
}
// Returns true if satisfiable, fills `values` with assignment
bool solve(vector<bool>& values) {
fill(vis.begin(), vis.end(), false);
for (int i = 0; i < 2*n; i++) if (!vis[i]) dfs1(i);
fill(comp.begin(), comp.end(), -1);
int c = 0;
for (int i = 2*n - 1; i >= 0; i--)
if (comp[order[i]] == -1) dfs2(order[i], c++);
values.resize(n);
for (int i = 0; i < n; i++) {
if (comp[2*i] == comp[2*i+1]) return false; // x and ¬x in same SCC
values[i] = comp[2*i] > comp[2*i+1];
}
return true;
}
};
Time complexity: O(V + E) — linear in the size of the formula.
2. Traveling Salesman Problem (TSP)
Problem: Given n cities and distances between every pair of cities, find the shortest tour that visits every city exactly once and returns to the starting city.
Cities and pairwise distances (complete graph on 4 cities):
A─B = 10   A─C = 20   A─D = 15
B─C = 35   B─D = 25   C─D = 30
Distance matrix:
A B C D
A [ 0 10 20 15 ]
B [ 10 0 35 25 ]
C [ 20 35 0 30 ]
D [ 15 25 30 0 ]
All possible tours (starting from A):
A → B → C → D → A : 10 + 35 + 30 + 15 = 90
A → B → D → C → A : 10 + 25 + 30 + 20 = 85 ← optimal
A → C → B → D → A : 20 + 35 + 25 + 15 = 95
A → C → D → B → A : 20 + 30 + 25 + 10 = 85 ← optimal (reverse)
A → D → B → C → A : 15 + 25 + 35 + 20 = 95
A → D → C → B → A : 15 + 30 + 35 + 10 = 90
Best tour: cost 85
Brute Force: O(n!)
Try all (n-1)!/2 distinct tours. Completely impractical for n > 12.
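The brute force is short enough to sketch with next_permutation (function name is mine): fix city 0 as the start and try every ordering of the remaining cities.

```cpp
#include <vector>
#include <algorithm>
#include <numeric>
#include <climits>
using namespace std;

// Brute-force TSP: fix city 0 as the start, try every ordering of the rest.
// O(n!) — only usable for very small n.
int tspBruteForce(const vector<vector<int>>& dist) {
    int n = dist.size();
    if (n <= 1) return 0;
    vector<int> perm(n - 1);
    iota(perm.begin(), perm.end(), 1);         // cities 1..n-1
    int best = INT_MAX;
    do {
        int cost = dist[0][perm[0]];
        for (int i = 0; i + 1 < (int)perm.size(); i++)
            cost += dist[perm[i]][perm[i + 1]];
        cost += dist[perm.back()][0];          // return to the start
        best = min(best, cost);
    } while (next_permutation(perm.begin(), perm.end()));
    return best;
}
```

On the 4-city example it enumerates the six tours and returns 85.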
DP with Bitmask: O(n^2 * 2^n) — The Standard Interview Approach
Key idea: Use a bitmask to represent the set of visited cities. dp[mask][i] = minimum cost to visit exactly the cities in mask, ending at city i.
State: dp[mask][i]
- mask: bitmask of visited cities (bit j set means city j visited)
- i: current city (last city visited)
- value: minimum cost to reach this state
Transition: dp[mask | (1 << v)][v] = min(dp[mask][u] + dist[u][v])
for all unvisited cities v
Base case: dp[1][0] = 0 (start at city 0, only city 0 visited)
Answer: min over all i of (dp[ALL][i] + dist[i][0])
Visual trace for 4 cities (A=0, B=1, C=2, D=3):
Step 0: dp[0001][0] = 0 (at A, visited {A})
Step 1: From dp[0001][0]:
→ dp[0011][1] = 0 + 10 = 10 (at B, visited {A,B})
→ dp[0101][2] = 0 + 20 = 20 (at C, visited {A,C})
→ dp[1001][3] = 0 + 15 = 15 (at D, visited {A,D})
Step 2: From dp[0011][1] (at B, visited {A,B}):
→ dp[0111][2] = 10 + 35 = 45 (at C, visited {A,B,C})
→ dp[1011][3] = 10 + 25 = 35 (at D, visited {A,B,D})
From dp[0101][2] (at C, visited {A,C}):
→ dp[0111][1] = 20 + 35 = 55 (at B, visited {A,B,C})
→ dp[1101][3] = 20 + 30 = 50 (at D, visited {A,C,D})
From dp[1001][3] (at D, visited {A,D}):
→ dp[1011][1] = 15 + 25 = 40 (at B, visited {A,B,D})
→ dp[1101][2] = 15 + 30 = 45 (at C, visited {A,C,D})
Note: dp[0111][1] = 55 and dp[0111][2] = 45 are different states (same visited set, different end city), so neither replaces the other.
Step 3: complete the mask 1111 (all four cities visited):
dp[1111][1] = min(dp[1101][2] + dist[2][1], dp[1101][3] + dist[3][1]) = min(45+35, 50+25) = 75
dp[1111][2] = min(dp[1011][1] + dist[1][2], dp[1011][3] + dist[3][2]) = min(40+35, 35+30) = 65
dp[1111][3] = min(dp[0111][1] + dist[1][3], dp[0111][2] + dist[2][3]) = min(55+25, 45+30) = 75
Answer: add the return edge to A and minimize:
min(dp[1111][1] + 10, dp[1111][2] + 20, dp[1111][3] + 15) = min(85, 85, 90) = 85 (matches the optimal tour)
int tsp(vector<vector<int>>& dist) {
int n = dist.size();
int ALL = (1 << n) - 1;
// dp[mask][i] = min cost to visit cities in mask, ending at city i
vector<vector<int>> dp(1 << n, vector<int>(n, INT_MAX));
dp[1][0] = 0; // start at city 0
for (int mask = 1; mask < (1 << n); mask++) {
for (int u = 0; u < n; u++) {
if (dp[mask][u] == INT_MAX) continue;
if (!(mask & (1 << u))) continue; // u must be in mask
for (int v = 0; v < n; v++) {
if (mask & (1 << v)) continue; // v already visited
int newMask = mask | (1 << v);
dp[newMask][v] = min(dp[newMask][v],
dp[mask][u] + dist[u][v]);
}
}
}
// Find min cost tour: visit all cities and return to start
int ans = INT_MAX;
for (int u = 0; u < n; u++) {
if (dp[ALL][u] != INT_MAX) {
ans = min(ans, dp[ALL][u] + dist[u][0]);
}
}
return ans;
}
Complexity:
- Time: O(n^2 * 2^n) — for each of 2^n masks and n cities, we try n transitions
- Space: O(n * 2^n) — the DP table
- Practical limit: n ~ 20 (2^20 * 20^2 = ~400 million operations)
TSP Approximation Algorithms
For metric TSP (triangle inequality holds: dist(a,c) <= dist(a,b) + dist(b,c)):
1. MST-based 2-approximation:
1. Build MST of the complete graph
2. Double every edge (creates an Eulerian graph)
3. Find Eulerian tour
4. Shortcut repeated vertices
MST:        Double edges:    Eulerian tour:     Shortcut:
A─B         A═B              A→B→A→C→D→C→A      A→B→C→D→A
│           ║
C─D         C═D
Cost ≤ 2 × optimal TSP
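The four steps above can be sketched as follows, assuming a complete metric distance matrix (a preorder walk of Prim's MST is exactly the doubled-edge Eulerian tour with repeated vertices shortcut away; names are mine):

```cpp
#include <vector>
#include <functional>
#include <climits>
using namespace std;

// MST-based 2-approximation for metric TSP:
// 1) build an MST with Prim, 2) preorder-walk the tree (the shortcut
// of the doubled-edge Eulerian tour), 3) close the walk back to start.
int tspApprox2(const vector<vector<int>>& dist) {
    int n = dist.size();
    vector<int> parent(n, -1), best(n, INT_MAX);
    vector<bool> inMST(n, false);
    vector<vector<int>> tree(n);
    best[0] = 0;
    for (int it = 0; it < n; it++) {
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!inMST[v] && (u == -1 || best[v] < best[u])) u = v;
        inMST[u] = true;
        if (parent[u] != -1) tree[parent[u]].push_back(u);
        for (int v = 0; v < n; v++)
            if (!inMST[v] && dist[u][v] < best[v]) {
                best[v] = dist[u][v];
                parent[v] = u;
            }
    }
    vector<int> order;                          // preorder walk of the MST
    function<void(int)> dfs = [&](int u) {
        order.push_back(u);
        for (int v : tree[u]) dfs(v);
    };
    dfs(0);
    int cost = 0;
    for (int i = 0; i < n; i++)
        cost += dist[order[i]][order[(i + 1) % n]];
    return cost;
}
```

By the triangle inequality each shortcut is no longer than the tree walk it replaces, so the tour costs at most twice the MST weight, hence at most twice the optimal tour.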
2. Christofides algorithm: 1.5-approximation
1. Build MST
2. Find minimum weight perfect matching on odd-degree vertices
3. Combine MST + matching (Eulerian graph)
4. Find Eulerian tour and shortcut
3. Nearest neighbor heuristic (greedy, no guarantee):
int nearestNeighborTSP(vector<vector<int>>& dist) {
int n = dist.size();
vector<bool> visited(n, false);
int curr = 0, cost = 0;
visited[0] = true;
for (int step = 1; step < n; step++) {
int best = -1, bestDist = INT_MAX;
for (int j = 0; j < n; j++) {
if (!visited[j] && dist[curr][j] < bestDist) {
bestDist = dist[curr][j];
best = j;
}
}
visited[best] = true;
cost += bestDist;
curr = best;
}
cost += dist[curr][0]; // return to start
return cost;
}
3. Knapsack Problem
Problem: Given n items, each with weight w_i and value v_i, and a knapsack with capacity W, maximize the total value of items that fit.
Why Knapsack is NP-Complete (Despite Having a "Polynomial" DP)
The DP runs in O(nW), which looks polynomial. But W is a number, not a count. The input size is log₂(W) bits to represent W. So O(nW) = O(n * 2^(bits of W)), which is exponential in the input size. This is called pseudo-polynomial time.
W = 1,000,000,000 (10 digits, ~30 bits)
n = 100 items
DP table: 100 × 1,000,000,000 = 100 billion entries → IMPOSSIBLE
But if W = 1000:
DP table: 100 × 1000 = 100,000 entries → EASY
0/1 Knapsack — Bottom Up DP
dp[w] = maximum value achievable with weight capacity w
For each item i (from 0 to n-1):
For w from W down to wt[i]: ← reverse order prevents using item twice!
dp[w] = max(dp[w], dp[w - wt[i]] + val[i])
// 0/1 Knapsack — O(nW) time, O(W) space
int knapsack01(vector<int>& wt, vector<int>& val, int W) {
int n = wt.size();
vector<int> dp(W + 1, 0);
for (int i = 0; i < n; i++) {
// Iterate backwards to ensure each item used at most once
for (int w = W; w >= wt[i]; w--) {
dp[w] = max(dp[w], dp[w - wt[i]] + val[i]);
}
}
return dp[W];
}
Why iterate backwards? If we go forward, dp[w - wt[i]] might already include item i from this iteration, effectively using item i multiple times.
Forward (WRONG for 0/1):
Item: weight=3, value=4
dp[3] = max(dp[3], dp[0]+4) = 4 ← uses item once
dp[6] = max(dp[6], dp[3]+4) = 8 ← uses same item AGAIN!
Backward (CORRECT):
dp[6] = max(dp[6], dp[3]+4) ← dp[3] is still from previous iteration
dp[3] = max(dp[3], dp[0]+4) ← only now we update dp[3]
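A few lines make the failure mode concrete (toy function, mine): with a single item of weight 3 and value 4 and capacity 6, the backward loop reports 4 while the forward loop reports 8.

```cpp
#include <vector>
#include <algorithm>
using namespace std;

// One item (weight 3, value 4), capacity 6.
// backward = true  → 0/1 semantics (item used at most once)
// backward = false → the item gets reused, as in unbounded knapsack
int singleItemKnapsack(bool backward) {
    vector<int> dp(7, 0);
    const int wt = 3, val = 4;
    if (backward) {
        for (int w = 6; w >= wt; w--) dp[w] = max(dp[w], dp[w - wt] + val);
    } else {
        for (int w = wt; w <= 6; w++) dp[w] = max(dp[w], dp[w - wt] + val);
    }
    return dp[6];
}
```

singleItemKnapsack(true) returns 4; singleItemKnapsack(false) returns 8, having packed the same item twice.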
Unbounded Knapsack
Each item can be used unlimited times. Just iterate forward instead of backward:
int knapsackUnbounded(vector<int>& wt, vector<int>& val, int W) {
int n = wt.size();
vector<int> dp(W + 1, 0);
for (int w = 0; w <= W; w++) {
for (int i = 0; i < n; i++) {
if (wt[i] <= w) {
dp[w] = max(dp[w], dp[w - wt[i]] + val[i]);
}
}
}
return dp[W];
}
Knapsack with Item Recovery (Which Items to Take)
pair<int, vector<int>> knapsackWithItems(vector<int>& wt, vector<int>& val, int W) {
int n = wt.size();
vector<vector<int>> dp(n + 1, vector<int>(W + 1, 0));
for (int i = 1; i <= n; i++) {
for (int w = 0; w <= W; w++) {
dp[i][w] = dp[i-1][w]; // don't take item i
if (wt[i-1] <= w) {
dp[i][w] = max(dp[i][w], dp[i-1][w - wt[i-1]] + val[i-1]);
}
}
}
// Backtrack to find which items were taken
vector<int> items;
int w = W;
for (int i = n; i >= 1; i--) {
if (dp[i][w] != dp[i-1][w]) {
items.push_back(i - 1); // took item i-1 (0-indexed)
w -= wt[i-1];
}
}
return {dp[n][W], items};
}
4. Graph Coloring
Problem: Color the vertices of a graph with at most k colors such that no two adjacent vertices share the same color.
Graph: 2-colorable (bipartite)?
A ─── B A(R) ─── B(B)
│ │ │ │
│ │ │ │
C ─── D C(B) ─── D(R)
✓ Yes! This is bipartite.
Add diagonal A─D:
A(R) ─── B(B)
│  \      │
│   \     │
C(B) ─── D(?)
D is adjacent to A(R), B(B), and C(B).
D can't be R (A is R) and can't be B (B and C are B).
A third color is needed: D(G)
Complexity:
- k=1: Trivial — only if graph has no edges
- k=2: Bipartite check — BFS/DFS, O(V+E)
- k>=3: NP-complete (even deciding 3-colorability is NP-complete)
Bipartite Check (2-coloring)
bool isBipartite(vector<vector<int>>& adj) {
int n = adj.size();
vector<int> color(n, -1);
for (int start = 0; start < n; start++) {
if (color[start] != -1) continue;
queue<int> q;
q.push(start);
color[start] = 0;
while (!q.empty()) {
int u = q.front(); q.pop();
for (int v : adj[u]) {
if (color[v] == -1) {
color[v] = color[u] ^ 1;
q.push(v);
} else if (color[v] == color[u]) {
return false; // odd cycle found
}
}
}
}
return true;
}
Graph Coloring via Backtracking (for small graphs)
bool canColor(vector<vector<int>>& adj, vector<int>& colors, int u, int k) {
int n = adj.size();
if (u == n) return true; // all vertices colored
for (int c = 0; c < k; c++) {
// Check if color c is valid for vertex u
bool valid = true;
for (int v : adj[u]) {
if (colors[v] == c) { valid = false; break; }
}
if (valid) {
colors[u] = c;
if (canColor(adj, colors, u + 1, k)) return true;
colors[u] = -1;
}
}
return false;
}
int chromaticNumber(vector<vector<int>>& adj) {
int n = adj.size();
vector<int> colors(n, -1);
for (int k = 1; k <= n; k++) {
fill(colors.begin(), colors.end(), -1);
if (canColor(adj, colors, 0, k)) return k;
}
return n; // worst case: every vertex gets unique color
}
5. Subset Sum
Problem: Given a set of integers S and a target T, is there a subset of S that sums to exactly T?
S = {3, 7, 1, 8, 4}, Target = 11
Subsets that sum to 11:
{3, 8} = 11 ✓
{7, 4} = 11 ✓
{3, 1, 4} = 8 ✗
{3, 7, 1} = 11 ✓
DP Solution: O(n * T) Pseudo-Polynomial
dp[j] = true if we can form sum j using some subset of items seen so far
For each item s in S:
For j from T down to s:
dp[j] = dp[j] || dp[j - s]
Base case: dp[0] = true (empty subset sums to 0)
bool subsetSum(vector<int>& arr, int target) {
vector<bool> dp(target + 1, false);
dp[0] = true;
for (int x : arr) {
for (int j = target; j >= x; j--) {
dp[j] = dp[j] || dp[j - x];
}
}
return dp[target];
}
Bitset Optimization: O(n * T / 64)
bool subsetSumBitset(vector<int>& arr, int target) {
bitset<100001> dp; // fixed size: assumes target <= 100000
dp[0] = 1;
for (int x : arr) {
dp |= (dp << x); // shift left by x and OR
}
return dp[target];
}
Intuition: The bitset represents all achievable sums. Shifting left by x adds x to every achievable sum. OR-ing with the old set keeps previously achievable sums.
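A tiny demonstration of that intuition (the bitset size is chosen arbitrarily): after processing {3, 5}, exactly the sums 0, 3, 5, and 8 are reachable.

```cpp
#include <bitset>
using namespace std;

// dp bit j == 1 means sum j is achievable by some subset of {3, 5}.
bitset<16> achievableSums() {
    bitset<16> dp;
    dp[0] = 1;                    // empty subset
    for (int x : {3, 5})
        dp |= (dp << x);          // every old sum, plus x
    return dp;
}
```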
Partition Problem (Special Case of Subset Sum)
Problem: Can we partition an array into two subsets with equal sum?
bool canPartition(vector<int>& nums) {
int sum = 0;
for (int x : nums) sum += x;
if (sum % 2 != 0) return false;
int target = sum / 2;
return subsetSum(nums, target);
}
6. Vertex Cover
Problem: Find the minimum set of vertices such that every edge has at least one endpoint in the set.
Graph (edges AB, AC, BD, BE, CD):      Vertex Cover = {B, C}
A ─── B ─── E                          B covers edges AB, BD, BE
│     │                                C covers edges AC, CD
│     │
C ─── D
Every edge is "covered" by at least one vertex in {B, C}.
Is this minimum?
Try size 1: No single vertex covers all edges.
So minimum vertex cover has size 2.
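For graphs this small the claim can be confirmed exhaustively. A sketch (helper name mine; uses the GCC/Clang builtin __builtin_popcount) that tries every subset of vertices as a candidate cover:

```cpp
#include <vector>
#include <utility>
#include <algorithm>
using namespace std;

// Exhaustive minimum vertex cover for tiny n: try all 2^n vertex subsets.
int minVertexCoverBrute(int n, const vector<pair<int,int>>& edges) {
    int best = n;                               // taking every vertex always works
    for (int mask = 0; mask < (1 << n); mask++) {
        bool covers = true;
        for (auto [u, v] : edges)
            if (!((mask >> u) & 1) && !((mask >> v) & 1)) { covers = false; break; }
        if (covers) best = min(best, __builtin_popcount(mask));
    }
    return best;
}
```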
Facts:
- NP-complete for general graphs
- Polynomial for bipartite graphs (Konig's theorem: min vertex cover = max matching)
- 2-approximation exists: greedily pick both endpoints of uncovered edges
2-Approximation for Vertex Cover
vector<int> approxVertexCover(int n, vector<pair<int,int>>& edges) {
vector<bool> inCover(n, false);
vector<bool> edgeCovered(edges.size(), false);
for (int i = 0; i < edges.size(); i++) {
if (!edgeCovered[i]) {
auto [u, v] = edges[i];
if (!inCover[u] && !inCover[v]) {
inCover[u] = inCover[v] = true;
// Mark all edges incident to u or v as covered
for (int j = i; j < edges.size(); j++) {
if (edges[j].first == u || edges[j].first == v ||
edges[j].second == u || edges[j].second == v) {
edgeCovered[j] = true;
}
}
}
}
}
vector<int> cover;
for (int i = 0; i < n; i++)
if (inCover[i]) cover.push_back(i);
return cover;
}
Why 2-approximation? Every edge we pick has both endpoints added, but at least one must be in the optimal solution. So we use at most 2x the optimal.
7. Hamiltonian Path/Cycle
Problem: Find a path (or cycle) that visits every vertex exactly once.
Hamiltonian Path via Bitmask DP: O(n^2 * 2^n)
// Does a Hamiltonian path exist? (visit all vertices exactly once)
bool hamiltonianPath(vector<vector<int>>& adj) {
int n = adj.size();
// dp[mask][i] = can we visit exactly the vertices in mask, ending at i?
vector<vector<bool>> dp(1 << n, vector<bool>(n, false));
// Base: single vertex
for (int i = 0; i < n; i++)
dp[1 << i][i] = true;
for (int mask = 1; mask < (1 << n); mask++) {
for (int u = 0; u < n; u++) {
if (!dp[mask][u]) continue;
if (!(mask & (1 << u))) continue;
for (int v : adj[u]) {
if (mask & (1 << v)) continue;
dp[mask | (1 << v)][v] = true;
}
}
}
int ALL = (1 << n) - 1;
for (int i = 0; i < n; i++)
if (dp[ALL][i]) return true;
return false;
}
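If the interviewer also wants the path itself, the same DP can store a predecessor for each state. A sketch along those lines (names mine; returns an empty vector when no Hamiltonian path exists):

```cpp
#include <vector>
using namespace std;

// Recover an actual Hamiltonian path by storing, for each state
// (mask, end vertex), the predecessor vertex on some valid path.
vector<int> hamiltonianPathRecover(const vector<vector<int>>& adj) {
    int n = adj.size();
    vector<vector<int>> par(1 << n, vector<int>(n, -2)); // -2 = unreachable
    for (int i = 0; i < n; i++) par[1 << i][i] = -1;      // -1 = path start
    for (int mask = 1; mask < (1 << n); mask++)
        for (int u = 0; u < n; u++) {
            if (par[mask][u] == -2 || !(mask & (1 << u))) continue;
            for (int v : adj[u]) {
                int nm = mask | (1 << v);
                if ((mask & (1 << v)) || par[nm][v] != -2) continue;
                par[nm][v] = u;                 // first path found to (nm, v)
            }
        }
    int ALL = (1 << n) - 1;
    for (int end = 0; end < n; end++) {
        if (par[ALL][end] == -2) continue;
        vector<int> path;
        int mask = ALL, u = end;
        while (u != -1) {                       // walk predecessors back to start
            path.push_back(u);
            int p = par[mask][u];
            mask ^= (1 << u);
            u = p;
        }
        return vector<int>(path.rbegin(), path.rend()); // start → end order
    }
    return {};
}
```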
8. Clique Problem
Problem: Find the largest complete subgraph (every pair of vertices in the subgraph is connected by an edge).
Graph edges: A─B, A─D, B─C, B─D, C─D
A ─── B
│   / │
│  /  │
D ─── C
Maximum cliques: {A, B, D} and {B, C, D}, both of size 3.
{A, B, D}: A─B ✓, A─D ✓, B─D ✓
{B, C, D}: B─C ✓, B─D ✓, C─D ✓
{A, B, C, D} is NOT a clique (A─C is missing).
Relationship to other problems:
- Clique in graph G = Independent Set in complement of G
- Independent Set = Vertex Cover complement (V \ vertex_cover = independent_set)
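Both equivalences can be sanity-checked on a small graph (helper names mine): build the complement explicitly and confirm that a clique in G is an independent set in the complement.

```cpp
#include <vector>
using namespace std;

// Edge (u,v) is in the complement iff u != v and it is NOT in G.
vector<vector<bool>> complementOf(const vector<vector<bool>>& adj) {
    int n = adj.size();
    vector<vector<bool>> c(n, vector<bool>(n, false));
    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++)
            if (u != v) c[u][v] = !adj[u][v];
    return c;
}

bool isClique(const vector<vector<bool>>& adj, const vector<int>& S) {
    for (size_t i = 0; i < S.size(); i++)
        for (size_t j = i + 1; j < S.size(); j++)
            if (!adj[S[i]][S[j]]) return false; // some pair not connected
    return true;
}

bool isIndependentSet(const vector<vector<bool>>& adj, const vector<int>& S) {
    for (size_t i = 0; i < S.size(); i++)
        for (size_t j = i + 1; j < S.size(); j++)
            if (adj[S[i]][S[j]]) return false;  // some pair connected
    return true;
}
```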
Bron-Kerbosch Algorithm (Finding All Maximal Cliques)
// Bron-Kerbosch with pivot — standard algorithm for maximal cliques
void bronKerbosch(vector<vector<bool>>& adj, vector<int>& R,
vector<int>& P, vector<int>& X,
vector<vector<int>>& cliques) {
if (P.empty() && X.empty()) {
cliques.push_back(R); // R is a maximal clique
return;
}
// Choose pivot vertex from P ∪ X that maximizes |P ∩ N(pivot)|
int pivot = -1, maxDeg = -1;
for (int u : P) {
int deg = 0;
for (int v : P) if (adj[u][v]) deg++;
if (deg > maxDeg) { maxDeg = deg; pivot = u; }
}
for (int u : X) {
int deg = 0;
for (int v : P) if (adj[u][v]) deg++;
if (deg > maxDeg) { maxDeg = deg; pivot = u; }
}
// Iterate over P \ N(pivot)
vector<int> candidates;
for (int v : P)
if (pivot == -1 || !adj[pivot][v]) candidates.push_back(v);
for (int v : candidates) {
// New P: P ∩ N(v)
vector<int> newP, newX;
for (int u : P) if (adj[v][u]) newP.push_back(u);
for (int u : X) if (adj[v][u]) newX.push_back(u);
R.push_back(v);
bronKerbosch(adj, R, newP, newX, cliques);
R.pop_back();
P.erase(find(P.begin(), P.end(), v));
X.push_back(v);
}
}
9. Set Cover
Problem: Given a universe U and a collection of sets S₁, S₂, ..., Sₘ, find the minimum number of sets whose union equals U.
Universe U = {1, 2, 3, 4, 5, 6, 7}
Sets:
S₁ = {1, 2, 3}
S₂ = {2, 4}
S₃ = {3, 4, 5}
S₄ = {4, 5, 6, 7}
S₅ = {6, 7}
Greedy: pick the set covering the most uncovered elements
Pick S₄ = {4,5,6,7} → uncovered = {1,2,3}
Pick S₁ = {1,2,3} → uncovered = {} DONE!
Answer: 2 sets (S₁, S₄) — this happens to be optimal here.
Greedy Set Cover: O(ln(n))-approximation
vector<int> greedySetCover(int n, vector<vector<int>>& sets) {
vector<bool> covered(n + 1, false);
int uncovered = n;
vector<int> chosen;
vector<bool> used(sets.size(), false);
while (uncovered > 0) {
// Find the set that covers the most uncovered elements
int best = -1, bestCount = 0;
for (int i = 0; i < sets.size(); i++) {
if (used[i]) continue;
int count = 0;
for (int elem : sets[i])
if (!covered[elem]) count++;
if (count > bestCount) {
bestCount = count;
best = i;
}
}
if (best == -1) break; // can't cover more
used[best] = true;
chosen.push_back(best);
for (int elem : sets[best]) {
if (!covered[elem]) {
covered[elem] = true;
uncovered--;
}
}
}
return chosen;
}
Approximation ratio: ln(n) + 1 where n = |U|. This is the best possible unless P = NP.
Recognizing NP-Complete Problems in Disguise
The interviewer will never say "solve this NP-complete problem." They disguise it:
Red Flags That Suggest NP-Completeness
- The problem asks for an optimal solution over all subsets or all permutations
- The problem has a "select or don't select" structure for each element
- Small constraints (n ≤ 20) — the problem setter knows it's exponential
- The problem seems like it should be easy but you can't find a polynomial solution
- There's no obvious greedy strategy that works, and DP states blow up
What to Do in an Interview When You Recognize NP-Completeness
Step-by-step response strategy:
1. State it clearly:
"This problem is a variant of [known NP-complete problem], which means there's no known polynomial-time algorithm."
2. Ask about constraints:
- n ≤ 15-20 → bitmask DP, O(2^n * n) or O(2^n * n^2)
- n ≤ 40 → meet-in-the-middle, O(2^(n/2) * n)
- n is large → greedy approximation, or pseudo-polynomial DP if applicable
3. Propose the right approach based on constraints:
n ≤ 10: Brute force / backtracking with pruning
n ≤ 20: Bitmask DP — O(2^n * poly(n))
n ≤ 40: Meet in the middle — O(2^(n/2) * poly(n))
n ≤ 10^3: Pseudo-polynomial DP (if applicable, e.g., knapsack with small W)
n ≤ 10^6: Greedy approximation, heuristics
Practical Solutions for Interview Problems
Meet in the Middle — O(2^(n/2))
Idea: Split the input into two halves. Enumerate all 2^(n/2) subsets of each half. Combine results intelligently.
// Subset Sum with n up to 40
bool subsetSumMeetInMiddle(vector<int>& arr, long long target) {
int n = arr.size();
int half = n / 2;
int rest = n - half;
// Generate all subset sums of first half
vector<long long> firstSums;
for (int mask = 0; mask < (1 << half); mask++) {
long long sum = 0;
for (int i = 0; i < half; i++)
if (mask & (1 << i)) sum += arr[i];
firstSums.push_back(sum);
}
sort(firstSums.begin(), firstSums.end());
// For each subset sum of second half, binary search in first half
for (int mask = 0; mask < (1 << rest); mask++) {
long long sum = 0;
for (int i = 0; i < rest; i++)
if (mask & (1 << i)) sum += arr[half + i];
long long need = target - sum;
if (binary_search(firstSums.begin(), firstSums.end(), need))
return true;
}
return false;
}
Time: O(2^(n/2) * n) for generation + O(2^(n/2) * log(2^(n/2))) for search = O(n * 2^(n/2))
Backtracking with Pruning
For many NP-complete problems, smart pruning makes brute force feasible for moderate inputs.
// Maximum Clique via backtracking with pruning
int maxClique;
void findClique(vector<vector<bool>>& adj, vector<int>& clique,
vector<int>& candidates, int n) {
if (candidates.empty()) {
maxClique = max(maxClique, (int)clique.size());
return;
}
for (int i = 0; i < candidates.size(); i++) {
// Pruning: even if we add all remaining candidates, can we beat best?
if ((int)clique.size() + (int)candidates.size() - i <= maxClique)
return; // bound: can't improve
int v = candidates[i];
// New candidates: only vertices adjacent to v
vector<int> newCandidates;
for (int j = i + 1; j < candidates.size(); j++) {
if (adj[v][candidates[j]])
newCandidates.push_back(candidates[j]);
}
clique.push_back(v);
findClique(adj, clique, newCandidates, n);
clique.pop_back();
}
}
Branch and Bound (Knapsack)
struct Item { int weight, value; double ratio; }; // ratio = (double)value / weight
// NOTE: items must be sorted by ratio in descending order before calling,
// otherwise the fractional bound below is not a valid upper bound.
int bestValue;
void knapsackBnB(vector<Item>& items, int idx, int currWeight,
int currValue, int W) {
if (currWeight > W) return;
bestValue = max(bestValue, currValue);
if (idx == items.size()) return;
// Upper bound: fractional relaxation of remaining items
double bound = currValue;
int remWeight = W - currWeight;
for (int i = idx; i < items.size() && remWeight > 0; i++) {
if (items[i].weight <= remWeight) {
bound += items[i].value;
remWeight -= items[i].weight;
} else {
bound += items[i].ratio * remWeight;
break;
}
}
if (bound <= bestValue) return; // pruning!
// Take item idx
knapsackBnB(items, idx + 1, currWeight + items[idx].weight,
currValue + items[idx].value, W);
// Skip item idx
knapsackBnB(items, idx + 1, currWeight, currValue, W);
}
Summary: Choosing the Right Approach for NP-Complete Problems in Interviews
Key takeaway: The ability to recognize NP-complete problems and immediately pivot to the right approach (based on constraints) is what separates strong candidates from average ones. Never spend an interview trying to find a polynomial algorithm for an NP-complete problem.
Quick Reference: NP-Complete Problems Cheat Sheet
Problem             Exact (small n)                Practical (large n)
SAT / 3-SAT         backtracking                   2-SAT special case is in P (SCC)
TSP                 bitmask DP O(n^2 * 2^n)        Christofides 1.5-approx (metric); nearest neighbor
Knapsack            DP O(nW), pseudo-polynomial    branch and bound; FPTAS
Graph Coloring      backtracking                   k=2 (bipartite check) in P
Subset Sum          DP O(nT); bitset               meet in the middle (n <= 40)
Vertex Cover        brute force over subsets       2-approximation; polynomial on bipartite (Konig)
Hamiltonian Path    bitmask DP O(n^2 * 2^n)        backtracking with pruning
Clique              Bron-Kerbosch                  backtracking with pruning
Set Cover           brute force over subsets       greedy ln(n)-approximation
FPTAS = Fully Polynomial-Time Approximation Scheme: for any epsilon > 0, gets within a (1+epsilon) factor of optimal in time polynomial in n and 1/epsilon.