Week 10: Tries and Intervals

Pattern family: Tries and Intervals (two sub-patterns) · Time budget: 13 hours · Prerequisites: Weeks 1 through 9 done, with backtracking and the grid mark/recurse/unmark rhythm (Week 9), recursion (Week 7), and dict / defaultdict plus sorting with key= from the refresher fresh. · Cold-revisit obligation: Reorder List, plus three problems of your choice from Weeks 5 to 8.

Overview

This is a deliberately lighter week with two distinct, smaller patterns, so you get two recognition reflexes for the price of one. They share nothing algorithmically; what they share is that each is a compact tool you reach for the moment you see its cue, and each one is asked about often enough that it earns its own slot.

A trie (prefix tree) is the structure to reach for when a problem hands you many strings and asks prefix questions: does any word start with this, is this exact word present, find every word matching a pattern. A trie stores the words as a tree of characters where shared prefixes share a path, so a prefix query walks one path instead of scanning every word. The whole idea is that the shape of the data (characters branching off shared prefixes) is the index.

Intervals are the tool when each input element is a pair [start, end] and the question is about how those ranges relate: merge the ones that touch, insert a new one and re-merge, count how many overlap, or fit them into the fewest rooms. The whole idea is almost always the same first move: sort by start, then sweep left to right deciding for each interval whether it extends the one you are building or begins a new one.

Most of the work this week is training two moves that fire before any code. First, when you see many strings and the word “prefix”, “starts with”, “wildcard”, or “all words on a board”, reach for a trie. Second, when you see input shaped as [start, end] and a question about overlap, reach for “sort by start, then sweep”.

flowchart TD
    Q["Read the problem"] --> A{"Is the input MANY STRINGS<br/>with a prefix / wildcard /<br/>autocomplete question,<br/>OR a list of [start, end] pairs<br/>asking about overlap?"}
    A -->|"many strings,<br/>prefix or wildcard"| T["Trie: a tree of characters,<br/>shared prefixes share a path.<br/>Each node is a dict of children<br/>plus an end-of-word flag."]
    A -->|"input is [start, end],<br/>merge or count overlaps"| I["Intervals: sort by start,<br/>then sweep left to right,<br/>extending or pushing each one."]
    A -->|neither| G["Probably a different pattern.<br/>Keep reading the cues."]

Notice: the decision happens before you code, and the very first question splits the week in two. The shape of the input (strings with prefixes vs [start, end] pairs) tells you which sub-pattern you are in. Naming the pattern out loud (principle 1) is what makes this diagram fire under pressure.

Warm-Up: Retrieve Before You Begin

Answer from memory, no peeking. This pulls forward the prior-week tools that both sub-patterns lean on hardest.

  1. From Week 9 and the refresher: in a backtracking search on a grid, what are the three beats of the “mark, recurse, unmark” rhythm, and why does forgetting to unmark corrupt later branches?
  2. From the refresher: what does d.setdefault(key, {}) return, and how does it let you walk-or-create a path of nested dicts one character at a time?
  3. From the refresher: how does sorted(intervals, key=lambda x: x[0]) order a list of [start, end] pairs, and why is sorting by start the natural first move for an overlap question?
Check your recall 1. The three beats are: mark the current cell as visited (or overwrite it with a sentinel), recurse into its neighbors with the mark in place, then unmark it (restore the cell) on the way back out. If you forget to unmark, the cell stays "used" after you leave it, so a later branch that should be free to step on it is blocked, and you get wrong answers, not just slow ones. The unmark restores the state to exactly what it was before you entered, which is what makes the single shared board correct across branches. This rhythm returns this week inside Word Search II. 2. `d.setdefault(key, {})` returns the value at `key` if it already exists, and otherwise inserts `{}` (a fresh empty dict) at `key` and returns that. So `node = node.setdefault(ch, {})` either steps into the existing child for character `ch` or creates an empty child and steps into it. Chaining that over the characters of a word walks down the trie, building any missing nodes as you go, which is exactly how you insert a word. 3. `sorted(intervals, key=lambda x: x[0])` returns a new list of the pairs ordered by their first element (the start), smallest start first, leaving the original list untouched. Sorting by start is the natural first move because once the intervals are in start order, any interval that overlaps a previous one must overlap the most recent one you kept, so a single left-to-right sweep is enough to decide every merge. Without the sort you would have to compare every pair.

Learning objectives

By the end of this week you can:

  • Recognize the trie cue (many strings, prefix or wildcard queries) and the interval cue ([start, end] input, an overlap question) and reach for the right one of the two without being told.
  • Construct a trie as nodes of (children map plus end-of-word flag), supporting insert, exact search, and prefix search, and extend search to a . wildcard with a small DFS.
  • Apply a trie as a pruning device on a grid search, abandoning a path the instant the prefix leaves the trie.
  • Construct the “sort by start, then sweep” solution for merging intervals, and adapt it to insert-and-merge.
  • Analyze the time and space complexity of each: a trie operation as O(length of the word), an interval sweep as O(n log n) dominated by the sort, and the Word Search II search by its trie-bounded DFS.
  • Defend every line of each solution you commit, from memory, with no AI open, including why the sort precedes the sweep and what the end-of-word flag is for.

Recognition cue (Tries)

“Many strings, and the question is about prefixes” almost always means a trie. The surface phrases are “starts with”, “prefix”, “autocomplete”, “wildcard / . matches any character”, “dictionary of words”, and “find all words on a board”. The deeper tell is that you would otherwise scan a whole list of words for every query, and the words share prefixes you could exploit. The moment you notice you are about to compare a query against every word one character at a time, ask whether a trie would let you walk a single shared path instead.

Recognition cue (Intervals)

“Each input element is [start, end], and the question is about how the ranges relate” almost always means the interval pattern. The surface phrases are “merge overlapping”, “insert an interval”, “how many meetings overlap”, “minimum number of rooms / platforms”, and “do any of these conflict”. The deeper tell is that the answer depends on which ranges touch or contain each other, which you cannot see until they are in order. The moment you see [start, end] pairs and an overlap question, your first move is “sort by start”, before you write anything else.

Core concepts to internalize this week

The two sub-patterns are independent, so the concepts below are split into a Tries half and an Intervals half. Internalize each half on its own terms.

Tries

  • A node is a children map plus an end-of-word flag. The cleanest trie node is a dict mapping each next character to its child node, plus a boolean (or a sentinel key) that marks “a word ends here”. The flag is not optional: without it you cannot tell a stored word ("app") from a mere prefix of a longer stored word ("apple"). Insert walks the characters, creating missing children; exact search walks them and then checks the flag; prefix search walks them and does not check the flag.
  • Insert, exact search, and prefix search are the same walk. All three step character by character from the root. Insert creates a child when one is missing; the two searches return false the moment a character is missing. The only difference between exact search and prefix search is the last step: exact search additionally requires the end-of-word flag at the node you land on; prefix search is satisfied just by arriving.
  • Wildcard search is a small DFS. When a query character can be . (match any), a single forward walk no longer works, because . could match several children. At a ., you branch: try every child and succeed if any branch succeeds. That is a depth-first search over the trie, with the query index as the depth. A concrete (non-.) character still narrows to one child, so only the . positions actually fan out.
  • A trie is a pruning tool for grid searches. The week’s canonical, Word Search II, uses a trie not as a lookup table but as a way to prune. You build a trie of all the words you are hunting, then walk the board following trie edges; the instant the path of characters you have spelled is not a prefix in the trie, you stop, because no word can possibly continue that way. The trie turns “search the board once per word” into “search the board once, guided”.

Intervals

  • Sort by start (occasionally by end). Almost every interval problem begins by sorting the intervals by their start value. Once sorted, overlaps become local: an interval can only overlap intervals that started at or before it, so comparing each interval to the one you most recently kept is enough. A few problems (counting non-overlapping intervals to remove) sort by end instead; knowing which key to sort by is part of reading the problem.
  • The extend-or-push sweep. After sorting, sweep left to right holding a “current” interval (the last one in your output). For each next interval, either it overlaps the current one (its start is at or before the current end), in which case you extend the current end to the larger of the two ends, or it does not, in which case you push it as a new current interval. That single decision, repeated, is the whole merge.
  • “Overlap” includes touching, and the comparison is <=. Two intervals [1, 4] and [4, 5] share only the single point 4, and for these problems that counts as overlapping: they merge into [1, 5]. The test is next.start <= current.end, not <. Getting that boundary right (whether touching counts) is the single most common interval bug, so decide it from the problem statement and state it out loud.
  • Insert is merge with a known landing spot. To insert one new interval into an already-sorted, already-merged list, you do not re-sort. You sweep once in three phases: copy all intervals that end before the new one starts (they are untouched), then absorb every interval that overlaps the new one by widening the new one’s start and end, then copy the rest. The output stays sorted and merged. It is the same overlap test, organized around the one interval you are placing.
  • A min-heap of end times counts what is active at once. For “minimum meeting rooms” style questions, you sort by start and keep a min-heap of the end times of meetings still in progress; when the next meeting starts after the earliest end, that room frees up. The heap of end times is the interval pattern’s third common shape, after merge and insert. You drill this shape this week on Car Pooling (practice problem 5), where the question is whether a fixed capacity can hold every interval that is active at the same time.

Common misconception (Tries). “A trie node just needs its children; the structure of the tree already says where words are.” Reality. Without an explicit end-of-word flag, you cannot distinguish a stored word from a prefix of a longer stored word. If you insert "apple" and then search for "app", the path for "app" exists (it is the start of "apple"), so a flag-free trie would wrongly report "app" as a stored word. The node must carry a boolean (or a sentinel child key like "#") that is set only at the end of an inserted word. Exact search checks that flag; prefix search ignores it. State the two halves: a node is its children map and an end marker.

Common misconception (Intervals). “I can sweep the intervals in the order they are given and merge the ones that overlap.” Reality. Overlaps are only local after sorting. In an unsorted list, two intervals that overlap can sit far apart with non-overlapping intervals between them, so a single left-to-right sweep misses merges. Sort by start first; then any overlap is with the most recently kept interval, and one sweep is correct. The discipline is always “sort, then sweep”, in that order, and a touching pair like [1, 4] and [4, 5] merges because the overlap test is <=, not <.

Python idioms this week

You are not yet expected to be fluent, so here is the toolbox this week uses. If any is unfamiliar, open the Python refresher before Monday. The sections you want are dictionaries and setdefault (Section 5), defaultdict (Section 7), sorting with key= (Section 3), recursion for the wildcard and grid DFS (Section 10), tuples as grid coordinates (Section 4), and heapq if you explore room-counting (Section 8).

  • A nested dict for trie nodes (node = node.setdefault(ch, {}) to walk-or-create), or a small node class with a children dict and an is_end boolean. collections.defaultdict is an alternative for the children map.
  • A sentinel key such as "#" to mark end-of-word inside a dict node (just make sure it cannot collide with a real character), or an explicit boolean flag if you use a node class.
  • Recursion for the . wildcard search (branch over all children at a .) and for the Word Search II grid DFS (the Week 9 mark/recurse/unmark rhythm, now guided by trie edges).
  • sorted(intervals, key=lambda x: x[0]) to order intervals by start; key=lambda x: x[1] on the rare problem that sorts by end.
  • List operations to build the output: append a new interval, or update the end of the last appended interval (out[-1][1] = max(out[-1][1], end)).
  • heapq.heappush / heappop for a min-heap of end times if you go after the room-counting variant.

These are tools you are allowed to know cold in an interview. The pattern (which structure the problem wants, why the sort precedes the sweep, what the end-of-word flag is for) is the part you build yourself.

Pattern Card requirement

Because this week has two distinct patterns, you write two Pattern Cards before you solve the canonical problem: one at .tutor/pattern-cards/tries.md and one at .tutor/pattern-cards/intervals.md. Each must contain:

  • The recognition cue for that sub-pattern, in your own words.
  • The algorithmic skeleton in pseudocode (for tries: insert / search / prefix-search as the shared walk, plus the wildcard branch; for intervals: sort, then the extend-or-push sweep).
  • Time and space complexity, general form (for a trie: per-operation O(word length), space O(total characters stored); for intervals: O(n log n) dominated by the sort, O(n) output).
  • Three edge cases (for tries: an empty trie, the empty string, and a stored word that is a prefix of another are good starts; for intervals: an empty list, a single interval, and intervals that merely touch are good starts).
  • One war story (a bug you actually hit this week); fill this in at the end of the week.

The first four sections of both cards must be present before you start the canonical deep-dive. The tutor will probe both cards; it will not write any of either for you.

Canonical deep-dive

Problem: Word Search II (https://leetcode.com/problems/word-search-ii/)

Word Search II is this course’s home for the fusion of two patterns: the trie from this week and the backtracking grid search from Week 9. You build a trie of the words you are hunting, then DFS the board following trie edges, pruning a path the instant its prefix leaves the trie. The full guided walk-through, with the trie-as-pruning insight, the mark/recurse/unmark rhythm carried over from Week 9, and the exact complexity, is in canonical/word-search-ii.md. Work it there. You will implement it yourself, in your own work repo, and write a short debrief. The walk-through is deliberately not an answer key: it shows you how to think, and stops short of code you could paste.

Practice set

Five problems, the two tries first, then three intervals. Each has its own folder under practice/ with the problem link, a complexity target for you to fill in, and the debrief template. Do not skip ahead.

Practice set (Tries)

  1. Implement Trie (Prefix Tree) (practice/implement-trie/). A design problem: build the Trie class with insert, search, and startsWith. This is where the node-as-(children plus end-flag) and the shared walk become muscle memory.
  2. Design Add and Search Words (practice/design-add-and-search-words/). A design problem: build the WordDictionary class with addWord and a search that supports the . wildcard. This is where exact search grows into the small DFS that branches at every ..

Practice set (Intervals)

  1. Merge Intervals (practice/merge-intervals/). The base case of the pattern: sort by start, then sweep, extending or pushing. The <= boundary (touching counts) lives here.
  2. Insert Interval (practice/insert-interval/). The same overlap logic organized around one new interval, in three phases (before, overlapping, after), without re-sorting.
  3. Car Pooling (practice/car-pooling/). The interval pattern’s third shape: fit-into-capacity over time. The question is no longer how the ranges combine but how many are active at once, and whether that peak ever breaks a fixed limit. This is where the cue you recognize (the heap-of-end-times shape) finally becomes a reflex you have implemented.

Each problem follows the five-beat rhythm.

The five-beat rhythm (every problem)

  1. Pattern Card is written (you did this Monday, both of them).
  2. Name the pattern aloud and write your approach as a plain-English comment before any code.
  3. Struggle floor: 25 minutes unaided.
  4. Hint ladder if needed: six rungs, one per ask.
  5. Debrief: the five questions, in your commit message, before the next problem.

Reflect

Ten minutes in your notebook, in writing:

  • Explain it back: describe to a peer who has only done arrays and backtracking so far what a trie is and why its end-of-word flag is necessary, in three sentences; then describe the “sort, then sweep” interval move in two sentences.
  • Connect: Word Search II uses a trie and the Week 9 grid search together. In one sentence, what does the trie contribute that plain backtracking on the board does not?
  • Monitor: which of the two patterns is still fuzzier about when to reach for it? Write the one sentence that would clear it up.

Knowledge Check

Answer from memory first. Items marked ⟲ are spaced callbacks to earlier material.

  1. What input shape or phrase should make you reach for a trie, and what input shape or phrase should make you reach for the interval pattern? Give one of each.
  2. In a trie, what is the end-of-word flag for, and what concretely goes wrong if you omit it?
  3. For Merge Intervals, what is the first move, what is the per-interval decision in the sweep, and why does a touching pair like [1, 4] and [4, 5] merge?
  4. ⟲ From Week 9 (Backtracking): Word Search II reuses the grid mark/recurse/unmark rhythm. What does the trie add on top of that rhythm, and why does it make the search faster?
  5. ⟲ From the refresher: why is d.setdefault(ch, {}) a convenient way to build a trie one character at a time, and what does it return when ch is already a child?
Answer key 1. A trie: many strings with a prefix, wildcard, or autocomplete question (for example "does any word start with this", "find all words on a board"). The interval pattern: input shaped as `[start, end]` pairs with an overlap question (for example "merge overlapping intervals", "minimum number of rooms"). 2. The end-of-word flag marks the nodes where an inserted word actually ends, so exact search can distinguish a stored word from a mere prefix of a longer stored word. If you omit it, searching for `"app"` after inserting only `"apple"` wrongly returns true, because the path for `"app"` exists as the start of `"apple"`. Exact search checks the flag; prefix search does not. 3. The first move is to sort the intervals by start. The per-interval decision is: if this interval's start is at or before the current (last kept) interval's end, extend that end to the larger of the two ends; otherwise push this interval as a new current one. `[1, 4]` and `[4, 5]` merge because the overlap test is `<=` (touching counts), and `4 <= 4` is true, so they combine into `[1, 5]`. 4. The mark/recurse/unmark rhythm walks the board exploring paths and restoring each cell on the way out. The trie adds a guide: at each step you follow only trie edges, so the moment the characters you have spelled are not a prefix of any target word, you stop and back out. Without the trie you would re-search the board once per word and explore long dead paths; with it you search once and prune every path that cannot spell a word. 5. `d.setdefault(ch, {})` returns the existing child node for `ch` if there is one, and otherwise inserts a fresh empty dict at `ch` and returns it. So `node = node.setdefault(ch, {})` walks into the child for `ch`, creating it if missing, which lets you descend the trie and build any absent nodes in a single expression per character. When `ch` is already a child, it returns that existing child unchanged (it does not overwrite it).

Cold revisit

This Friday, before any new work, re-solve four prior problems cold: no notes, no prior code open, no AI, the standard 25-minute floor on each. This week’s obligation is larger than usual (one fixed problem plus three of your choice) because the week itself is lighter. The full instructions are in revisit.md.

  1. Reorder List (Week 6, Linked List). A blind re-solve. Do not look at which pattern it is until you have named it yourself; recognizing it cold is the point.
  2. Three problems of your choice from Weeks 5 to 8. Pick the ones you feel least sure of, blind re-solve each under the same rules.

None of these is a trie or interval problem, and that is deliberate: the cold revisit trains recognition across the whole course, not just this week’s patterns. Debrief each at the end and log the outcome in .tutor/revisit-log.md.

How to know you are done with this week

  • Both Pattern Cards complete: .tutor/pattern-cards/tries.md and .tutor/pattern-cards/intervals.md, each including the war-story field.
  • Canonical deep-dive done: Word Search II implemented yourself (a trie of the words plus a trie-guided grid DFS with a correct unmark), plus your short debrief (in your work repo).
  • Five practice problems solved, each with a five-question debrief in its commit message.
  • .tutor/revisit-log.md updated by the tutor with this week’s problems and the four cold revisits, with dates.
  • .tutor/progress.md updated to “Week 10 complete; ready for Week 11.”

If any of the above is missing, the week is not done. The tutor will not advance you to Week 11 until all five are present.

Resources

Free, no account required:

  • [Article] Trie (prefix tree), the idea: https://en.wikipedia.org/wiki/Trie
  • [Article] Interval scheduling: https://en.wikipedia.org/wiki/Interval_scheduling
  • [Docs] Python dict and dict.setdefault: https://docs.python.org/3/library/stdtypes.html#dict
  • [Docs] Python collections (defaultdict): https://docs.python.org/3/library/collections.html
  • [Docs] Python heapq (for the room-counting variant): https://docs.python.org/3/library/heapq.html

Table of contents