Week 11: Graphs
Pattern family: Graphs · Time budget: 14 hours · Prerequisites: Weeks 1 through 10 done, with collections.deque and BFS (Week 7), recursion (Week 7), and the visited set idea fresh. · Cold-revisit obligation: Binary Tree Maximum Path Sum, N-Queens.
Overview
A graph is just nodes and the edges between them, and an enormous number of problems are graphs wearing a costume. A grid of land and water is a graph (each cell is a node, neighbors are edges). A list of course prerequisites is a graph (each course is a node, “A before B” is a directed edge). A set of words one letter apart is a graph (each word is a node, an edge joins words that differ by a single character). The skill this week is seeing the graph under the costume, then reaching for one of two traversals: depth-first search (DFS) or breadth-first search (BFS).
Both traversals visit every reachable node exactly once, and both depend on the same non-negotiable rule: a visited set. The instant you forget it, a traversal walks in circles forever, because graphs (unlike trees) can have cycles and can reach the same node by many paths. Most of the work this week is training two moves that fire before any code: first, recognize that a problem is a graph and decide what the nodes and edges are; second, pick DFS or BFS for a reason you can say out loud.
flowchart TD
Q["Read the problem"] --> A{"Is it about connections,<br/>reachability, components,<br/>cycles, or a grid where<br/>cells connect to neighbors?"}
A -->|no| G["Probably a different pattern.<br/>Keep reading the cues."]
A -->|yes| B["It is a graph. Name the nodes<br/>and the edges out loud."]
B --> C{"Do I need the SHORTEST<br/>path or fewest steps in an<br/>unweighted graph?"}
C -->|yes| BFS["Use BFS: a deque, level by level,<br/>first arrival is the shortest."]
C -->|"no, I just need to<br/>reach / mark everything"| DFS["Use DFS or BFS: recursion or a<br/>stack/deque. Either visits all<br/>reachable nodes. Always carry visited."]
Notice: the decision happens before you code, and it has two beats. First, is this a graph at all, and what are the nodes and edges? Second, does the question ask for a shortest path or fewest steps (BFS), or just to reach and mark everything (either works)? Naming the pattern out loud (principle 1) is what makes this diagram fire under pressure.
Warm-Up: Retrieve Before You Begin
Answer from memory, no peeking. This pulls forward the prior-week tools that graphs lean on hardest.
- From Week 7 and the refresher: what does
collections.dequegive you that a plainlistdoes not, and which method does a BFS queue call to take the next node? - From the refresher: how do you build a
defaultdict(list), and why is it the natural structure for an adjacency list? - From Week 7: when you write a recursive DFS, what is the base case that stops it, and what state must travel along so you do not revisit a node?
Check your recall
1. `collections.deque` gives O(1) append and pop at *both* ends; a `list` is O(n) to pop from the front (`pop(0)`), because every other element has to shift down. A BFS queue calls `popleft()` to take the oldest node (first in, first out), which is exactly what processes the graph level by level. 2. `from collections import defaultdict; g = defaultdict(list)`. A missing key auto-creates an empty list, so `g[u].append(v)` works without first checking whether `u` is a key. That is exactly the adjacency-list shape: each node maps to the list of its neighbors, and you can append an edge to any node without a guard clause. 3. A recursive DFS stops when it has no unvisited neighbor to descend into (every neighbor is already visited or out of bounds). The state that must travel along is the `visited` set (or, for a grid, marking the cell in place). Without it, two nodes that point at each other send the recursion back and forth until it overflows the stack.Learning objectives
By the end of this week you can:
- Recognize the graph cue (connections, reachability, components, cycles, or a grid whose cells connect to neighbors) and name the nodes and edges without being told.
- Choose BFS for shortest-path-or-fewest-steps in an unweighted graph and DFS or BFS for reach-and-mark, and justify the choice in one sentence.
- Construct a DFS (recursive or stack-based) and a BFS (deque-based) that each carry a
visitedset and terminate on any graph, including one with cycles. - Implement topological-sort cycle detection two ways: DFS three-color and Kahn’s algorithm on in-degrees.
- Analyze the time and space complexity of a traversal as O(V + E), and say what V and E are for the problem in front of you.
- Defend every line of each solution you commit, from memory, with no AI open, including why the
visitedset is mandatory.
Recognition cue
“Connections,” “reach,” “components,” “cycles,” or a grid-shaped problem that is really about cells touching their neighbors, almost always means a graph traversal. The surface phrases are “is there a path from A to B”, “how many separate groups / islands / clusters”, “can all of this be ordered / scheduled”, “does this contain a cycle”, and “fewest steps / shortest transformation”. The deeper tell is that the data is a set of things plus relationships between them, even when no one used the word “graph”. The moment you catch that, stop and name the nodes and the edges before you write anything.
Core concepts to internalize this week
- Three ways a graph is handed to you. As an adjacency list (a
dictordefaultdict(list)mapping each node to its neighbors), the default and almost always what you want to build. As an adjacency matrix (a 2-D table wherem[i][j]is 1 if there is an edge), compact for dense graphs but O(V^2) space. Or implicit, where the graph is never spelled out and you generate neighbors on the fly: a grid cell’s neighbors are the four cells around it; a word’s neighbors are the words one letter away. Recognizing an implicit graph is half the battle on grid and word problems. - DFS vs BFS, and how to choose. DFS goes as deep as it can before backtracking; it is natural as recursion and is the right tool for “reach everything”, “count components”, and cycle detection. BFS fans out level by level using a
deque; it is the right tool when you need the shortest path or the fewest steps in an unweighted graph, because the first time BFS reaches a node it has reached it by a shortest path. If the problem says “fewest”, “shortest”, or “minimum number of moves” and edges are unweighted, reach for BFS. - The
visitedset is mandatory. Trees have no cycles, so Week 7 traversals could skip the bookkeeping. Graphs can have cycles and can reach a node by many paths, so every traversal must mark a node the first time it is seen and never process it again. Mark a node visited at the moment you enqueue or descend into it, not when you pop it, or the same node can be enqueued many times before it is processed. - Connected components. To count separate groups, loop over every node; each time you find an unvisited one, start a fresh traversal from it (that traversal floods one whole component and marks it), and add one to your count. Number of Islands is exactly this with a grid as the graph.
- Cycle detection and topological sort. A directed graph can be linearized into an order that respects every edge (every prerequisite before the thing that needs it) if and only if it has no cycle. There are two standard ways to detect the cycle, and the canonical deep-dive teaches both: DFS with three colors (a back edge to a node still on the current path means a cycle) and Kahn’s algorithm (repeatedly remove nodes with no remaining prerequisites; if any node is left over, a cycle trapped it).
- The “precompute neighbor patterns” trick. In Word Ladder the implicit edges (words one letter apart) are expensive to find by comparing every pair of words. Instead, precompute a map from a wildcard pattern like
h*tto all words matching it (hot,hit); two words are neighbors exactly when they share a pattern. This turns neighbor lookup from “scan all words” into a dictionary hit and is the difference between a slow and a fast solution.
Common misconception. “A traversal naturally stops on its own, so I can skip the
visitedset on simple graphs.” Reality. A tree traversal stops because a tree has no way back; a graph traversal does not. Two nodes that reference each other, or any cycle, send avisited-free DFS bouncing between them until the recursion stack overflows, and avisited-free BFS enqueues the same nodes forever. Thevisitedset is not an optimization; it is what makes graph traversal terminate. State that out loud every time: “I mark visited so I never process a node twice.”
Common misconception. “Adjacency matrix and adjacency list are interchangeable; pick either.” Reality. They have different costs. A matrix is O(V^2) space and answers “is there an edge u to v” in O(1), which suits dense graphs. A list is O(V + E) space and lets you iterate a node’s neighbors in time proportional to how many it has, which suits the sparse graphs that almost every interview problem hands you. Defaulting to an adjacency list (
defaultdict(list)) is the right habit; reach for a matrix only when the graph is dense or you need O(1) edge tests.
Python idioms this week
You are not yet expected to be fluent, so here is the toolbox this week uses. If any is unfamiliar, open the Python refresher before Monday.
collections.defaultdict(list)to build an adjacency list without guard clauses (graph[u].append(v)).collections.dequefor the BFS queue:appendto enqueue,popleftto dequeue (O(1) at both ends).- A
visitedset (or marking a grid cell in place) so each node is processed once;(row, col)tuples are the usual way to store grid cells in a set. - Direction tuples for grid neighbors:
for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):, thennr, nc = r + dr, c + dcwith a bounds check. - Recursion for DFS, with the recursion-depth limit in mind (a large grid can need an explicit stack instead).
collections.defaultdict(int)for in-degree counts in Kahn’s algorithm.
These are tools you are allowed to know cold in an interview. The pattern (seeing the graph, choosing the traversal, knowing why visited is required) is the part you build yourself.
Pattern Card requirement
Before you solve the canonical problem, write your Pattern Card at .tutor/pattern-cards/graphs.md. It must contain:
- The recognition cue, in your own words.
- The algorithmic skeleton in pseudocode (a generic traversal with a
visitedset, plus a one-line note on when you swap DFS for BFS). - Time and space complexity, general form (state it as O(V + E) and say what V and E are).
- Three edge cases (an empty graph, a single node with no edges, and a disconnected graph or a graph with a cycle are good starts).
- One war story (a bug you actually hit this week); fill this in at the end of the week.
The first four sections must be present before you start the canonical deep-dive. The tutor will probe the card; it will not write any of it for you.
Canonical deep-dive
Problem: Course Schedule (https://leetcode.com/problems/course-schedule/)
Course Schedule is this course’s home for topological sort and cycle detection, and it is unusual in that you will learn two algorithms for the same question: DFS three-color coloring and Kahn’s algorithm on in-degrees. The full guided walk-through, with both algorithms spelled out as relations and the exact O(V + E) complexity, is in canonical/course-schedule.md. Work it there. You will implement both yourself, in the same file in your own work repo, confirm they agree, and write a short debrief. The walk-through is deliberately not an answer key: it shows you how to think, and stops short of code you could paste.
Practice set
Four problems, in this order. Each has its own folder under practice/ with the problem link, a complexity target for you to fill in, and the debrief template. Do not skip ahead.
- Number of Islands (
practice/number-of-islands/). A grid as an implicit graph; count connected components by flooding each one. The cleanest first graph problem. - Clone Graph (
practice/clone-graph/). Traverse a graph while building a parallel copy; adictfrom original node to its clone is the whole trick. A taste of the design-flavored problems. - Pacific Atlantic Water Flow (
practice/pacific-atlantic-water-flow/). Two multi-source traversals, one from each ocean, then intersect. Reversing the direction of “flow” is the insight. - Word Ladder (
practice/word-ladder/). Shortest transformation as a BFS over an implicit word graph; precompute neighbor patterns so it runs fast.
Each problem follows the five-beat rhythm.
The five-beat rhythm (every problem)
- Pattern Card is written (you did this Monday).
- Name the pattern aloud and write your approach as a plain-English comment before any code.
- Struggle floor: 25 minutes unaided.
- Hint ladder if needed: six rungs, one per ask.
- Debrief: the five questions, in your commit message, before the next problem.
Reflect
Ten minutes in your notebook, in writing:
- Explain it back: describe to a peer who has only done trees so far why a graph traversal needs a
visitedset but a tree traversal does not, in three sentences. - Connect: Number of Islands and Course Schedule are both graph problems, yet one counts components and the other detects a cycle. What does each use for its nodes and edges, and which traversal fits each?
- Monitor: which is still fuzzy, deciding DFS vs BFS, or knowing when a problem is a graph at all? Write the one sentence that would clear it up.
Knowledge Check
Answer from memory first. Items marked ⟲ are spaced callbacks to earlier material.
- What words or input shapes in a problem statement should make you reach for a graph traversal, and what two things do you name first once you spot one?
- When do you choose BFS over DFS, and what property of BFS makes it correct for that case?
- State the two ways to detect a cycle in a directed graph that the canonical teaches, in one sentence each.
- ⟲ From Week 7 and the refresher: why is
collections.dequethe right structure for a BFS queue, and what is wrong with usinglist.pop(0)instead? - ⟲ From Week 9 (Backtracking): backtracking also explores choices recursively, like DFS. What does a graph DFS carry that a from-scratch backtracking search may not, and why?
Answer key
1. "Connections," "reach," "components," "cycles," "can this be ordered / scheduled," "fewest steps / shortest transformation," and any grid where cells connect to their neighbors. The two things you name first are the nodes and the edges (what is a node, and what makes two nodes adjacent). 2. Choose BFS when you need the shortest path or the fewest steps in an *unweighted* graph. BFS explores level by level, so the first time it reaches a node it has done so by a path of the fewest edges; that first-arrival-is-shortest property is what makes it correct. (For weighted graphs you need Dijkstra, which is Week 12.) 3. DFS three-color: color nodes white (unseen), gray (on the current path), black (done); if DFS ever reaches a gray node, that is a back edge to a node still being explored, which means a cycle. Kahn's algorithm: repeatedly remove nodes whose in-degree is zero and decrement their neighbors; if you cannot remove all nodes (some keep a positive in-degree), the leftovers form a cycle. 4. `deque` gives O(1) `popleft()` from the front; BFS dequeues from the front every step, so the queue operation stays O(1). `list.pop(0)` is O(n) because every remaining element shifts down one slot, which turns the whole BFS into O(V^2) or worse on the queue operations alone. 5. A graph DFS carries a `visited` set so it never re-enters a node, because a graph can reach the same node by many paths and can contain cycles. A pure permutation/subset backtracking search over distinct positions does not always need one (each choice is a fresh position), but the moment the search graph can revisit a state, it needs the same guard. The shared idea: track what you have already entered so the recursion terminates.Cold revisit
This Friday, before any new work, re-solve two prior problems cold: no notes, no prior code open, no AI, the standard 25-minute floor on each. The full instructions are in revisit.md.
- Binary Tree Maximum Path Sum (Week 7, Trees). A blind re-solve. Do not look at which pattern it is until you have named it yourself; recognizing it cold is the point. This is the return-vs-record skill from Week 7, the hardest tree concept, and you have not re-tested it since.
- N-Queens (Week 9, Backtracking). A blind re-solve, same rules.
Neither is a graph problem, and that is deliberate: the cold revisit trains recognition across the whole course, not just this week’s pattern. Debrief each at the end and log the outcome in .tutor/revisit-log.md.
How to know you are done with this week
- Pattern Card complete at
.tutor/pattern-cards/graphs.md, including the war-story field. - Canonical deep-dive done: Course Schedule implemented yourself both ways (DFS three-color and Kahn’s), confirmed to agree, with your short debrief (in your work repo).
- Four practice problems solved, each with a five-question debrief in its commit message.
.tutor/revisit-log.mdupdated by the tutor with this week’s problems and the two cold revisits, with dates..tutor/progress.mdupdated to “Week 11 complete; ready for Week 12.”
If any of the above is missing, the week is not done. The tutor will not advance you to Week 12 until all five are present.
Resources
Free, no account required:
- [Article] Graph (abstract data type), the idea: https://en.wikipedia.org/wiki/Graph_(abstract_data_type)
- [Article] Breadth-first search: https://en.wikipedia.org/wiki/Breadth-first_search
- [Article] Depth-first search: https://en.wikipedia.org/wiki/Depth-first_search
- [Article] Topological sorting (DFS and Kahn’s): https://en.wikipedia.org/wiki/Topological_sorting
- [Docs] Python
collections(deque,defaultdict): https://docs.python.org/3/library/collections.html