Week 5: Binary Search

Pattern family: Binary Search · Time budget: 14 hours · Prerequisites: Weeks 1 through 4 done, with the array and dict reflexes from Week 1 and the careful boundary reasoning from the monotonic-stack work in Week 4 fresh. · Cold-revisit obligation: Top K Frequent Elements, Daily Temperatures.

Overview

Binary search is the pattern that turns O(n) into O(log n) by throwing away half of the search space at every step. The textbook version (find a target in a sorted array) is the easy half of this week, and you probably already half-know it. The hard half, and the real upgrade, is learning to see a binary search where there is no sorted array in front of you at all.

The deeper idea is this: binary search does not need a sorted array. It needs a sorted space and a yes-or-no question whose answer flips exactly once as you move through that space. If you can write a predicate feasible(x) that is False for small x and True for all x at or above some threshold (or the other way around), then the thresholds are sorted even if your input is not, and you can binary search for the boundary. That reframing, called “binary search on the answer”, is the single most useful move you will learn in this course, and most of this week trains the reflex that fires before any code: when you catch yourself thinking “I could check whether a given value works, and bigger values only make it easier”, stop and binary search the smallest value that works.

flowchart TD
    Q["Read the problem"] --> A{"Is the input sorted,<br/>or can I sort it cheaply?"}
    A -->|yes| H["Classic binary search:<br/>halve the index range,<br/>O(log n)."]
    A -->|no| B{"Can I write feasible(x):<br/>a yes/no test that flips<br/>once as x increases?"}
    B -->|yes| C["Binary search on the ANSWER:<br/>search the value space for the<br/>smallest x where feasible(x) is True."]
    B -->|no| G["Probably a different pattern.<br/>Keep reading the cues."]

Notice: the decision happens before you code, and it has two beats. First, is there a sorted array (search the index)? Second, even without one, is there a monotonic predicate (search the answer)? Naming which of the two you are doing out loud (principle 1) is what keeps your boundaries straight under pressure.

Warm-Up: Retrieve Before You Begin

Answer from memory, no peeking. This pulls forward the Week 0 refresher and the boundary reasoning that this week leans on hardest.

From the refresher: why must the midpoint be mid = (lo + hi) // 2 and never (lo + hi) / 2?
From the refresher: on a sorted list a, what is the difference between bisect.bisect_left(a, x) and bisect.bisect_right(a, x), and what question does each one answer?
From the refresher: what is float('inf'), and why is it a safe value to compare against any real number you might see in the input?

Check your recall

1. `/` always produces a `float` in Python, and a `float` cannot index a list. `mid = (lo + hi) / 2` gives something like `2.5`, and `nums[2.5]` raises `TypeError`. Floor division `//` keeps `mid` an `int`, so `(lo + hi) // 2` is the only correct form for an index. This is the most common off-by-something bug in the whole pattern, and it is not even an off-by-one; it is a type error. 2. `bisect_left(a, x)` returns the leftmost index where `x` could be inserted to keep `a` sorted, which is the index of the first element greater than or equal to `x`. `bisect_right(a, x)` returns the rightmost such index, the first element strictly greater than `x`. So `bisect_left` answers "where does `x` start (or where would it go)?" and `bisect_right` answers "where does the run of `x` end?". On a list with no duplicates of `x` they differ by zero or one; on a list with several copies of `x` the gap between them is exactly the count of `x`. 3. `float('inf')` is positive infinity: a float that compares as larger than every finite number. `float('-inf')` is its negative twin, smaller than everything. They are the standard sentinels for "no value here yet" on the boundaries of a search, so that a comparison like `Aleft <= Bright` simply works at the edges of an array without a special case.

Learning objectives

By the end of this week you can:

Recognize both binary-search cues: a sorted (or sortable) input, and a monotonic feasible(x) predicate over an answer space that is not itself a sorted array.
Construct a correct classic binary search with boundaries you can defend, and state which of the standard templates (inclusive lo <= hi, or the half-open lo < hi) you used and why.
Distinguish bisect_left from bisect_right and pick the right one for “first index at or above x” versus “first index past x”.
Apply binary search to a rotated sorted array by deciding at each step which half is sorted, and to two sorted arrays by searching for the correct partition.
Construct a “binary search on the answer” solution: define the answer space, write the monotonic feasible predicate, and search for its boundary.
Analyze the time and space complexity of each, naming the log factor and what it is the log of (the array length, or the size of the answer range).
Defend every line of each solution you commit, from memory, with no AI open, including why your loop terminates and why your boundaries are correct.

Recognition cue

“The input is sorted” is the obvious cue, and you should reach for binary search on sight of a sorted array and a lookup. The harder, more valuable cue is this: the answer space itself can be ordered, even when the input is not. If you can write a function feasible(x) that returns a yes-or-no answer, and that answer flips exactly once as x increases (everything below a threshold is False, everything at or above it is True, or the mirror image), then the smallest (or largest) feasible x is findable by binary search. The surface phrases that should make you test for this are “minimum capacity / speed / size such that …”, “smallest value that lets us …”, “largest value for which we can still …”. The moment you notice that checking one candidate answer is easy and that bigger candidates can only help, you are looking at binary search on the answer.

Core concepts to internalize this week

The invariant and the loop shape. Binary search maintains a range [lo, hi] (or [lo, hi)) that is guaranteed to still contain the answer. Every iteration shrinks it. There are two templates worth owning: the inclusive while lo <= hi form that returns from inside the loop (best for “find this exact target”), and the half-open while lo < hi form that converges lo and hi onto a single boundary index (best for “find the first position where a condition becomes true”). Pick one form per problem and keep it consistent; mixing them is how boundaries rot.
bisect_left vs bisect_right. These are the standard library’s two binary searches, and they encode the boundary choice for you: bisect_left finds the first index at or above x, bisect_right the first index strictly past x. Knowing which you need is half of the boundary battle, and writing a feasible-style search by hand is the same decision in disguise.
Rotated sorted arrays. A sorted array rotated at an unknown pivot is no longer globally sorted, but at any midpoint, at least one of the two halves [lo, mid] and [mid, hi] is still sorted. Compare nums[mid] against an endpoint to decide which half is the clean, sorted one, check whether the target lies inside that clean half’s range, and recurse into the half that must contain it. The same “which half is sorted” decision finds the minimum (the pivot) when there is no target at all.
Binary search on a continuous or large answer space. When the thing you are searching is not array indices but candidate answers (an eating speed, a capacity, a number of days), the search space can be huge or even continuous. You still halve it the same way. The cost becomes O(log(range)) iterations, each iteration paying whatever the feasible check costs.
The partition trick for two sorted arrays. The median of two sorted arrays splits the combined data into a left half and a right half of known sizes. You do not merge; you binary search for where to cut the smaller array so that everything on the left is no larger than everything on the right. That cut determines the cut in the other array by simple arithmetic, which is why the search is over the smaller array and the cost is O(log(min(m, n))).

Common misconception. “Binary search is for sorted arrays, so if the input is not sorted, the pattern does not apply.” Reality. Binary search needs a monotonic structure, not literally a sorted array. The answer space of “what is the smallest speed that finishes in time?” is monotonic (slow speeds fail, fast speeds succeed, and the flip happens once) even though the list of banana piles is in no particular order. Sorting the input is one way to create monotonicity; a feasible(x) predicate is the other, and it is the one that unlocks the hard problems. Reciting “binary search means sorted array” is exactly what keeps people from seeing the second half of this pattern.

Common misconception. “The boundary details are fiddly, so I will just write mid = (lo + hi) / 2, scan a bit, and adjust until the examples pass.” Reality. The two bugs that sink binary search are both avoidable by rule, not by fiddling. First, the midpoint is (lo + hi) // 2, integer floor division, never /2, which produces a float that cannot index a list. Second, every off-by-one comes from being sloppy about whether your range is inclusive [lo, hi] or half-open [lo, hi) and whether you move past mid (lo = mid + 1) or keep it (hi = mid). Decide the template up front, write the invariant in a comment (“the answer is always in [lo, hi]”), and the boundaries follow. “Adjust until the examples pass” produces code that works on the two given cases and fails on the empty array, the single element, or the target at the very edge.

Heavy concept ahead. “Binary search on the answer” is the load-bearing idea of this week, and the upgrade that separates people who can only do textbook binary search from people who reach for it on problems that never mention sorting. Internalize it as a four-step recipe, in this order, every time:

Identify the answer space. What single number are you choosing? Its smallest and largest possible values are your lo and hi. (For Koko, the eating speed, from 1 to max(piles).)

Write feasible(x). A function that returns True if answer x is good enough (achieves the goal within the constraints) and False otherwise. (For Koko, “can Koko finish all the bananas within h hours if she eats at speed x?”)

Confirm monotonicity. The whole method is valid only if feasible flips exactly once: all False then all True (or all True then all False). Say out loud why it is monotonic. (For Koko, a faster speed never takes more hours, so once a speed is fast enough, every faster speed is too.)

Binary search the boundary. Search the answer space for the smallest x with feasible(x) == True (or the largest with feasible(x) == True, depending on the problem). Use the half-open template that converges on the boundary. Write these four lines down, in words, for the canonical problem and for any practice problem where the input is not a plain sorted array. Skipping step 3 (asserting monotonicity without checking it) is the single most common way these solutions come out wrong.

Python idioms this week

You are not yet expected to be fluent, so here is the toolbox this week uses. If any is unfamiliar, open the Python refresher before Monday.

The integer midpoint mid = (lo + hi) // 2, with // floor division so mid stays an int. Never (lo + hi) / 2.
bisect.bisect_left(a, x) and bisect.bisect_right(a, x) for the standard-library binary search, when a problem is cleanly “find the insertion point in this sorted list”. (Knowing these exist also tells you when not to hand-roll a search.)
float('inf') and float('-inf') as sentinels on the edges of a partition, so the boundary comparisons work without special-casing the ends of an array.
math.ceil(a / b) for “how many full buckets do I need”, which is the per-pile hours in the canonical problem. (Equivalently (a + b - 1) // b with integer arithmetic; know both.)
max(...) over a list to set the upper bound of an answer space (hi = max(piles)).

These are tools you are allowed to know cold in an interview. The pattern (the invariant, which half to keep, the feasible predicate and why it is monotonic) is the part you build yourself.

Pattern Card requirement

Before you solve the canonical problem, write your Pattern Card at .tutor/pattern-cards/binary-search.md. It must contain:

The recognition cue, in your own words, covering both the sorted-input cue and the monotonic-predicate (“search the answer”) cue.
The algorithmic skeleton in pseudocode: both a classic index search and the four-step “binary search on the answer” recipe.
Time and space complexity, general form (the log factor, and what it is the log of: array length versus answer range).
Three edge cases (empty input, a single element, and the target at the very first or very last position are good starts).
One war story (a bug you actually hit this week, very likely a boundary or a midpoint); fill this in at the end of the week.

The first four sections must be present before you start the canonical deep-dive. The tutor will probe the card; it will not write any of it for you.

Canonical deep-dive

Problem: Koko Eating Bananas (https://leetcode.com/problems/koko-eating-bananas/)

Koko Eating Bananas is this course’s home for “binary search on the answer”, the week’s key upgrade. There is no sorted array in the problem at all; the search space is the eating speed, and the trick is seeing that “can Koko finish in time at speed x?” is a monotonic predicate you can binary search. The full guided walk-through, with the answer space, the feasible predicate, the monotonicity argument, and the exact complexity, is in canonical/koko-eating-bananas.md. Work it there. You will implement it yourself, in your own work repo, and write a short debrief. The walk-through is deliberately not an answer key: it shows you how to think, and stops short of code you could paste.

Practice set

Four problems, in this order. Each has its own folder under practice/ with the problem link, a complexity target for you to fill in, and the debrief template. Do not skip ahead. The order is deliberate: it walks you from the textbook search, through the rotated-array variations, to the hardest partition search in the set.

Binary Search (practice/binary-search/). The textbook version: find a target in a sorted array. Get your template and your boundaries clean here, because everything else this week builds on them.
Search in Rotated Sorted Array (practice/search-in-rotated-sorted-array/). The array is sorted then rotated at an unknown pivot. Decide at each step which half is sorted, and search the half that must hold the target.
Find Minimum in Rotated Sorted Array (practice/find-minimum-in-rotated-sorted-array/). The same “which half is sorted” decision, but now there is no target: you are searching for the pivot, the one element smaller than its predecessor.
Median of Two Sorted Arrays (practice/median-of-two-sorted-arrays/). The hardest in the set. Do not merge the arrays; binary search the smaller one for the correct partition. Expect to spend real time here.

Each problem follows the five-beat rhythm.

The five-beat rhythm (every problem)

Pattern Card is written (you did this Monday).
Name the pattern aloud and write your approach as a plain-English comment before any code.
Struggle floor: 25 minutes unaided.
Hint ladder if needed: six rungs, one per ask.
Debrief: the five questions, in your commit message, before the next problem.

Reflect

Ten minutes in your notebook, in writing:

Explain it back: describe “binary search on the answer” to a peer who has only ever seen binary search on a sorted array, in four sentences (one per step of the recipe).
Connect: the classic search finds the first index at or above a target, and Koko finds the smallest feasible speed. In what sense are these the same search, just over a different space?
Monitor: which is still fuzzier for you, getting the boundaries right in the classic search, or proving a predicate is monotonic? Write the one sentence that would steady it.

Knowledge Check

Answer from memory first. Items marked ⟲ are spaced callbacks to earlier material.

State the two cues that should make you reach for binary search, and give a one-line example of each.
In a rotated sorted array, after computing mid, how do you decide which half is sorted, and why does that decision always work?
List the four steps of “binary search on the answer”, and say which step is the one people skip.
⟲ From Week 0: why is mid = (lo + hi) // 2 correct and (lo + hi) / 2 a bug, and what is the difference between bisect_left and bisect_right?
⟲ From Week 1: Top K Frequent Elements can be solved with bucket sort in O(n). Why would you not reach for binary search there, even though it sounds like “find the k largest”?

Answer key

1. Cue one: the input is sorted (or cheaply sortable) and you are looking something up, for example "find `target` in a sorted array" in O(log n). Cue two: there is a monotonic `feasible(x)` predicate over an answer space, for example "smallest eating speed that finishes the bananas in `h` hours", even though the piles are unsorted. 2. Compare `nums[mid]` to an endpoint, for example `nums[lo]`. If `nums[lo] <= nums[mid]`, the left half `[lo, mid]` is sorted; otherwise the right half `[mid, hi]` is sorted. It always works because a rotation has at most one "drop" point, so the drop falls in only one of the two halves, and the half without the drop is fully sorted. You then check whether the target lies within the sorted half's value range and move into the half that must contain it. 3. (1) Identify the answer space and its `lo`/`hi`. (2) Write `feasible(x)`, a yes-or-no test. (3) Confirm `feasible` is monotonic (flips exactly once) and say why. (4) Binary search the boundary (smallest or largest feasible `x`). Step 3 is the one people skip; asserting monotonicity without checking it is the usual source of a wrong answer. 4. `/` always returns a `float`, and `nums[2.5]` raises `TypeError`; `//` keeps `mid` an `int`, so `(lo + hi) // 2` is the only correct index form. `bisect_left(a, x)` returns the first index at or above `x` (before equal elements); `bisect_right(a, x)` returns the first index strictly past `x` (after equal elements). 5. Top K Frequent is a counting-then-bucketing problem: you tally frequencies in a `dict`, then bucket items by frequency into an array indexed by count, which is O(n). There is no sorted array and no monotonic `feasible(x)` to search; "k largest" here is about *ordering by count*, not about locating a threshold in an ordered space, so binary search does not fit and bucket sort beats it.

Cold revisit

This Friday, before any new work, re-solve two prior problems cold: no notes, no prior code open, no AI, the standard 25-minute floor on each. The first solve taught the pattern; this blind re-solve is what makes it stick. Full instructions are in revisit.md.

Top K Frequent Elements (Week 1, Arrays and Hashing). A blind re-solve. Do not look at which pattern it is until you have named it yourself; recognizing it cold is the point.
Daily Temperatures (Week 4, Stacks). A blind re-solve, same rules.

Neither is a binary-search problem, and that is deliberate: the cold revisit trains recognition across the whole course, not just this week’s pattern. In fact, noticing that neither one is binary search (Top K is counting plus bucketing, Daily Temperatures is a monotonic stack) is itself part of the exercise. Debrief each at the end and log the outcome in .tutor/revisit-log.md.

How to know you are done with this week

Pattern Card complete at .tutor/pattern-cards/binary-search.md, including the war-story field, and covering both cues.
Canonical deep-dive done: Koko Eating Bananas implemented yourself with an explicit feasible predicate, plus your short debrief (in your work repo).
Four practice problems solved, each with a five-question debrief in its commit message.
.tutor/revisit-log.md updated by the tutor with this week’s problems and the two cold revisits, with dates.
.tutor/progress.md updated to “Week 5 complete; ready for Week 6.”

If any of the above is missing, the week is not done. The tutor will not advance you to Week 6 until all five are present.

Resources

Free, no account required:

[Article] Binary search algorithm, the idea: https://en.wikipedia.org/wiki/Binary_search_algorithm
[Docs] Python bisect: https://docs.python.org/3/library/bisect.html
[Docs] Python math.ceil: https://docs.python.org/3/library/math.html#math.ceil