Minimum Number of Vertices to Reach All Nodes
Problem
Given a directed acyclic graph with n nodes labeled from 0 to n-1 and a list of edges, return the smallest set of vertices from which all nodes in the graph are reachable.
- 1 ≤ n ≤ 10⁵
- 0 ≤ edges.length ≤ min(10⁵, n * (n - 1) / 2)
- edges[i].length == 2
- 0 ≤ from_i, to_i < n
- All pairs (from_i, to_i) are distinct.
Example
n = 6, edges = [[0,1],[0,2],[2,5],[3,4],[4,2]][0,3]The graph has edges from 0 to 1 and 2, from 2 to 5, from 3 to 4, and from 4 to 2. Nodes 0 and 3 have no incoming edges, so starting from these nodes, all other nodes can be reached. The minimal set of vertices to reach all nodes is therefore [0,3].
Approach
Straightforward Solution
A brute-force approach would attempt to start a traversal from every node and check if all nodes are reachable, which is inefficient (O(n*(n+e)) time). This approach is impractical for large graphs.
Core Observation
In a directed acyclic graph, nodes with zero indegree (no incoming edges) cannot be reached from any other node. Therefore, these nodes must be included in the minimal set of vertices to ensure reachability of the entire graph.
Path to Optimal
The key insight is that nodes with incoming edges can be reached from other nodes, so only nodes with zero indegree need to be included in the starting set. By counting indegrees for all nodes, the algorithm identifies these zero-indegree nodes efficiently in O(n + e) time.
Optimal Approach
PreviewInitialize an indegree array of size n with zeros. Iterate over all edges, incrementing the indegree of the destination node…
Full step-by-step walkthrough on Pro →
Want the full reasoning chain?
Unlock the complete walkthrough, line-by-line analysis, and recall drill.
Unlock ProTime
O(n + e)
The algorithm initializes an indegree array of size n and iterates once over all edges to compute indegrees, resulting in linear time relative to nodes and edges.
Space
O(n)
The indegree array requires O(n) auxiliary space to store the count of incoming edges for each node. The output list is at most size n but is not counted as auxiliary space.
Pattern Spotlight
Graph Traversal (Indegree Analysis for Reachability)
In a DAG, nodes with zero indegree are the unique minimal set of starting points needed to reach all nodes, because every other node has at least one incoming edge and is reachable from these zero-indegree nodes.
Solution
| 1 | class Solution:
|
| 2 | def findSmallestSetOfVertices(self, n: int, edges: List[List[int]]) -> List[int]:
|
| 3 | indegree = [0] * n
|
| 4 | for _, y in edges:
|
| 5 | indegree[y] += 1
|
| 6 |
|
| 7 | return [node for node in range(n) if indegree[node] == 0] |
Step-by-Step Solution
Count Incoming Edges for Each Node
| 3 | indegree = [0] * n
|
| 4 | for _, y in edges:
|
| 5 | indegree[y] += 1
|
Objective
To compute the indegree of every node by iterating over all edges and incrementing the count for destination nodes.
Key Insight
Indegree counting identifies nodes that have no incoming edges, which are the only candidates for the minimal set of starting vertices. This step transforms the graph into a simple array of counts, enabling O(1) checks for zero indegree nodes.
Interview Quick-Check
Core Logic
Incrementing the indegree for each destination node captures the exact number of incoming edges, which is critical for identifying zero indegree nodes.
State & Boundaries
The loop iterates over all edges exactly once, ensuring all incoming edges are counted.
Complexity
This step runs in O(e) time, where e is the number of edges.
Collect Nodes with Zero Indegree as Minimal Starting Set
To identify and return all nodes that have zero incoming edges, forming the minimal set of vertices to reach the entire graph.
1 more step with full analysis available on Pro.
Line Analysis
This solution has 2 Critical lines interviewers watch for.
return [node for node in range(n) if indegree[node] == 0]
Return a list of all nodes with zero indegree.
Nodes with zero indegree have no incoming edges and must be included in the minimal set to ensure reachability of all nodes; this line efficiently filters and returns them.
indegree[y] += 1
Increment the indegree count for the destination node of the current edge.
Incrementing the indegree for each destination node captures the exact number of incoming edges, which is critical for identifying zero indegree nodes.
Full line-by-line criticality + rationale for all 4 lines available on Pro.
Test Your Understanding
Why is it sufficient to include only nodes with zero indegree in the minimal set of vertices to reach all nodes?
See the answer with Pro.
Related Problems
Topological Sort pattern
Don't just read it. Drill it.
Reconstruct Minimum Number of Vertices to Reach All Nodes from memory until it sticks. AlgoDrill blanks out key lines and makes you fill them back in, step by step.
Unlock the Minimum Number of Vertices to Reach All Nodes drill