Canonical Deep-Dive: Course Schedule
Problem: Course Schedule (https://leetcode.com/problems/course-schedule/) The lesson: a directed graph can be ordered if and only if it has no cycle, and there are two clean ways to detect that cycle.
This is a guided walk-through, not an answer key. It explains how to think about topological sort and cycle detection, and shows you two algorithms as relations and procedures in words. It stops short of an assembled, paste-runnable function on purpose. You write the actual Python yourself, in your own work repo. The tutor will not confirm your code; LeetCode’s judge will.
Restated: there are numCourses courses labeled 0 to numCourses - 1. You are given a list of prerequisite pairs; the pair [a, b] means you must take course b before course a. Return True if you can finish every course, and False otherwise. You can finish every course exactly when there is some order that respects all the prerequisites, which is possible if and only if the prerequisite graph has no cycle. So the whole problem reduces to one question: does this directed graph contain a cycle?
Step zero: see the graph
Each course is a node. The pair [a, b] (“b before a”) is a directed edge from b to a: an arrow pointing from the prerequisite to the course that depends on it. With that convention, a valid schedule is a topological order, a linear arrangement of the nodes where every edge points forward. Such an order exists exactly when the graph is a DAG (a directed acyclic graph). If there is a cycle (course X needs Y, Y needs Z, Z needs X), no order can put all three “before” each other, and you can never start.
Build the graph as an adjacency list first: a defaultdict(list) mapping each node to the courses that list it as a prerequisite. Both algorithms below start from that same adjacency list.
Concrete example. numCourses = 3, prerequisites [[1, 0], [2, 1]] means edges 0 -> 1 and 1 -> 2. No cycle, so the answer is True (take 0, then 1, then 2). Now add [0, 2], giving edges 0 -> 1 -> 2 -> 0. That is a cycle, so the answer is False: nothing can go first.
Approach 1: DFS with three colors
The idea is to walk the graph depth-first and watch for a “back edge”, an edge that points to a node you are still in the middle of exploring. A back edge is the signature of a cycle.
To see a back edge you need to distinguish three states for each node, traditionally called colors:
- White (unvisited): you have not started this node.
- Gray (in progress): you have entered this node and are still exploring its descendants. It is on the current DFS path, on the recursion stack right now.
- Black (done): you have fully explored this node and everything reachable from it; it led to no cycle.
The procedure, in words (you write the code):
- Every node starts white.
- To DFS a node: color it gray, then recurse into each of its neighbors.
- If a neighbor is gray, you have found an edge back to a node still on your current path. That is a cycle. Report failure.
- If a neighbor is white, DFS into it; if that call reports a cycle, propagate the failure up.
- A neighbor that is black is already known to be safe; skip it.
- When every neighbor is handled with no cycle found, color the node black and return “no cycle from here”.
- Run this from every still-white node (the graph may be disconnected), and the answer is
Trueonly if no DFS call ever reported a cycle.
Why gray is the whole trick: black means “explored, and provably no cycle through it”, so revisiting a black node is fine and is not a cycle. Gray means “I am still on the path that reached this node”, so an edge to gray closes a loop. Collapsing gray and black into a single visited set, the mistake almost everyone makes first, makes you unable to tell a harmless re-visit from a real cycle, and the detection breaks.
You can represent the colors with a single array (0/1/2) or with two sets (visiting for gray, visited for black); both are fine.
- Time: O(V + E). Each node is colored gray once and black once (so it is entered once), and across the whole run every edge is examined exactly once when you scan its source node’s neighbor list.
- Space: O(V + E). O(V + E) for the adjacency list, O(V) for the color array, and O(V) for the recursion stack in the worst case (a long chain). The dominant term is the graph itself, O(V + E).
Approach 2: Kahn’s algorithm (BFS on in-degrees)
This approach builds a valid order directly, and the cycle reveals itself by what is left over.
The in-degree of a node is how many edges point into it, that is, how many prerequisites it still has outstanding. A node with in-degree zero has no remaining prerequisites, so it can be taken right now.
The procedure, in words (you write the code):
- Compute the in-degree of every node (one pass over all edges, incrementing the target’s count). Use a
defaultdict(int)or an array of lengthnumCourses. - Put every node with in-degree zero into a
deque. These are the courses you can start with. - Repeatedly pop a node from the queue and “take” it (count it as processed). For each neighbor it points to, decrement that neighbor’s in-degree by one (you have now satisfied one of its prerequisites). If a neighbor’s in-degree drops to zero, it is now free, so enqueue it.
- Keep going until the queue is empty.
Now read the result. If you managed to process all numCourses nodes, every node became free at some point, so a full valid order exists and the answer is True. If some nodes were never processed (their in-degree never reached zero), those nodes are caught in a cycle: each is waiting on another that is also waiting, so none can ever be enqueued. A processed-count less than numCourses means a cycle, so the answer is False.
Why it works: removing an in-degree-zero node and decrementing its neighbors is exactly “take a course with no remaining prerequisites and mark that prerequisite satisfied for the courses that needed it”. A DAG always has at least one in-degree-zero node to start (otherwise follow edges backward forever and you find a cycle), and removing it leaves a smaller DAG, so the process drains the whole graph. A cycle has no in-degree-zero node within it, so it can never be entered and stays behind.
- Time: O(V + E). Computing in-degrees scans every edge once, O(E). Each node is enqueued and dequeued at most once, O(V). Each edge is examined exactly once, when its source node is dequeued and you decrement the target, O(E). Total O(V + E).
- Space: O(V + E). O(V + E) for the adjacency list, O(V) for the in-degree table, and O(V) for the queue. The dominant term is the graph, O(V + E).
The two algorithms, side by side
They answer the same question with opposite bookkeeping, and you should be able to say the difference cold:
- DFS three-color explores deep and looks for a back edge (an edge to a gray node still on the current path). The cycle shows up as you hit it during the descent.
- Kahn’s (BFS) peels off nodes that currently have no prerequisites, level by level. The cycle shows up as the nodes you could never peel off.
Same O(V + E) time and O(V + E) space for both. Pick DFS when recursion feels natural and you want the cycle pinpointed during traversal; pick Kahn’s when you also want the topological order as a by-product (the order you dequeued nodes is a valid schedule) or when you want to avoid recursion depth on a deep graph.
The key insight, in one sentence
“Can I finish every course” is the same question as “is this prerequisite graph free of cycles”, and you can answer it either by a DFS that flags an edge back to a node still on the current path (gray) or by repeatedly removing prerequisite-free nodes and checking that none are left stranded. Both visit every node and edge once, so both are O(V + E).
Exact complexity
For both algorithms, with V = numCourses and E = len(prerequisites):
- Time: O(V + E). You build the graph in O(V + E), then touch every node once and every edge once during the traversal. There is no sort and no hidden quadratic factor.
- Space: O(V + E). The adjacency list is O(V + E); the auxiliary structures (color array or in-degree table, plus the recursion stack or the queue) are O(V). The graph representation dominates.
Your work this week
- Build the adjacency list once, then implement both algorithms in the same file in your own work repo (for example
course-schedule.py): one function using DFS three-color, one using Kahn’s. Do not copy a solution; build each from the procedures above. - Run both against the same inputs and confirm they always agree, including the cyclic case (
numCourses = 2, prerequisites[[1, 0], [0, 1]], which isFalse) and an empty-prerequisites case (anynumCourseswith[], which isTrue). Convince yourself they are detecting the same cycle two different ways. - Test the edge cases yourself: no prerequisites at all, a single self-loop (
[[0, 0]], which is a cycle, soFalse), and a disconnected graph where one component has a cycle and another does not. - Commit with a five-question debrief in the message. In answer 3, state the exact O(V + E) complexity above for both algorithms, and say what V and E are for this problem.
When you can explain, from memory and with no AI open, why a gray neighbor in DFS means a cycle, why a leftover node in Kahn’s means a cycle, and why both run in O(V + E), you own this problem. That explanation is exactly what an interviewer will ask for, and topological sort is one of the most frequently asked graph topics.