Before we walk through the algorithm, let's see a few fundamental concepts first.
- Source: any node has no incoming edge(with indegree as 0)
- Sink: any node that has only incoming edge(with outdegree as 0)
Then we can say topological sorting starts with one of the sources and ends at one of the sinks.
Now let's see how the algorithm works.
- Initialize 4 variables
- an adjacency list to represent the whole graph
- a map to store all node's indegree
- an array to store all sources(init it with all node's with 0 indegree)
- an array to store all sorted nodes
- Loop through the sources array, for every source
- add it to sorted array
- get all of its children from the adjacency list
- decrement the indegree of each child by 1
- if a child's indegree becomes 0, then add it to the sources array
The indegree of node can be seen as node's number of dependencies. So for every loop, node's dependencies are decremented by 1. When its become 0, which means the node becomes a source and that is the right time to add it to the result sorted array.
From this, we can understand an important feature of topological sorting, which is that we can use it to detect if the graph is a Directed Acyclic Graph(DAG). Because if the graph has a cycle, then some nodes will have cyclic dependencies which make them impossible become 0 dependencies and thus impossible to be added to the result sorted array.
Now let's see an example.
Say we have a graph with 7 nodes with adjacency list as below.
0 => []
1 => []
2 => []
3 => [0, 1, 2]
4 => [1]
5 => [3, 4]
6 => [4, 2]
Let's see the process.
- init result sorted array []
- init indegree map {0:1, 1:2, 2:2, 3:1, 4:2, 5:0, 6:0}
- get all node with degree 0, add them to sorted array [5, 6]
- get [5, 6]'s children, [3, 4], [4, 2]
- decrement indegree by 1, {0:1, 1:2, 2:1, 3:0, 4:0, 5:0, 6:0}
- get new node [3, 4] with indegree 0, add them to sorted array [5, 6, 3, 4]
- get [3, 4]'s children [0, 1, 2], [1]
- decremnet indegree by 1, {0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0}
- get new node [0,1,2] with indegree 0, add them to sorted array [5, 6, 3, 4, 0, 1, 2]
- get [0,1,2]'s children [],[],[], no children, return
Now let's use it to solve the leecode Course Schedule problem.
We know if the graph has no cycle, after topological sort, we can get a result array with all nodes in it. If the graph do has cycles, them some node's indegree will never be 0, so the result array should unable to include all nodes.
We can use this feature to solve the problem. The process is simple, do a topological first and them compare the result length. Let's see some code.
function canFinish(numCourses: number, prerequisites: number[][]): boolean {
// typical topological sort should have an array to store the result
// but in this problem, we only use the length of the result
// so a number variable is enough
let count = 0;
const al = Array.from({ length: numCourses }).map(() => new Array<number>());
const inDegrees = Array.from({ length: numCourses }).map(() => 0);
for (const prerequisite of prerequisites) {
al[prerequisite[0]].push(prerequisite[1]);
inDegrees[prerequisite[1]] += 1;
}
const sources: number[] = [];
inDegrees.forEach((degree, i) => {
if (degree === 0) {
sources.push(i);
}
});
while (true) {
const node = sources.shift();
if (node === undefined) break;
count += 1;
for (const n of al[node]) {
inDegrees[n] -= 1;
if (inDegrees[n] === 0) {
sources.push(n);
}
}
}
return count === numCourses;
};