Summary
Many problems can be solved with trial and error. Computers are good at this drudgery, but they are limited by their own power. We will explore ways to use recursion to reduce large problems to many smaller ones.
The Mixtape
The cassette tape democratized music by letting anyone record/copy music with a small device. A popular use was to make a mixtape, copying songs from many sources. However, each tape had a limited capacity, so we want to have as much music as possible. Our goal is to choose a set that comes closest to filling a tape.
Our naïve approach might look as follows:
song = list(songs.keys())[0] # pick an arbitrary song
duration = songs[song]
return {song} | mixtape(
{k: v for k, v in songs.items() if k != song},
target_duration - duration,
)This is a recursive set of songs with a set for the subproblem of filling the remaining duration. To make the dict of songs remaining, we used a dictionary comprehension. Since target_duration is decreasing, our natural base case might be:
if target_duration == 0:
return set()However, this code fails, since songs is empty once we reach the base case. Hence, we add:
if not songs:
return NoneWe now also need to change our recursive case, since we no longer act under the assumption that our recursive result is a set:
recursive_result = mixtape(
{k: v for k, v in songs.items() if k != song},
target_duration - duration,
)
if recursive_result:
return {song} | recursive_result
else:
return NoneThis code doesn’t explore all possibilities. Our recursive call always assumes the current song is part of the solution, so our only possible solutions include the first song. We need to backtrack, or consider both the case that the song is and isn’t part of the solution.
if target_duration == 0:
return set()
if not songs:
return None
song = list(songs.keys())[0]
songs_rest = {k: v for k, v in songs.items() if k != song}
duration = songs[song]
# if the first song is part of a solution, we can fill the remaining
# space on the tape with the other songs
recursive_result1 = mixtape(songs_rest, target_duration - duration)
if recursive_result1 is not None:
return {song} | recursive_result1
# if the first song is not part of the solution, we should try to fill
# the whole duration with the other songs
recursive_result2 = mixtape(songs_rest, target_duration)
if recursive_result2 is not None:
return recursive_result2
# if there is no solution with the first song, and no solution without
# it, then there must be no solution. it's only now that we can safely
# return None to indicate failure.
return NoneAn Alternative Approach
Notice we could also mix recursion and iteration. Rather than recursing on the “exclude song” case, we can just iterate over which song to include.
We could also create a flood-fill-esque approach where we accumulate in the set so_far as we go:
def mixtape(songs, target_duration, so_far=frozenset()):
so_far_duration = sum(songs[s] for s in so_far)
if so_far_duration == target_duration:
return _____ # < -- what should go here?
if so_far_duration > target_duration:
return None
for song, duration in songs.items():
recursive_result = mixtape(songs, target_duration, so_far | {song})
if recursive_result is not None:
return recursive_result
return NoneAside, Default Arguments
In this version, so_far is a default argument. That means if the caller only passes songs and target_duration, Python automatically uses the default value for so_far. Here, that means the original call starts with an empty frozenset(), while the recursive calls pass an updated value explicitly.
This is a useful pattern in recursion. It lets us keep helper state in the function signature without forcing the original caller to provide it. The caller just asks for a mixtape of a certain duration, and the recursive process manages so_far internally.
There is one Python-specific trap here: default argument values are created once, when the function is defined, not each time the function is called. If that default value is mutable and the function changes it, later calls can accidentally share state from earlier ones.
That is why this example uses frozenset() instead of set(). Even though this code is building new sets with so_far | {song} rather than mutating so_far directly, using an immutable default is still the safer habit.
Sudoku
https://py.mit.edu/_static/spring26/readings/backtracking/sudoku.svg
We will try to solve sudoku as follows:
- Find a
0in the board. - Try filling it with one of the digits
1through9. - Recursively try to solve that updated board. If it works, we are done.
- If it does not work, try the next digit.
- If none of the digits work, then this path fails, so we return
None.
Here is code that follows that strategy:
def solve_sudoku(board):
"""
Given a sudoku board (as a list-of-lists of numbers, where 0
represents an empty square), return a solved version of the puzzle.
"""
for row in range(9):
for col in range(9):
if board[row][col] != 0:
continue
# Found a 0; try filling it
for trial in range(1, 10):
new_board = [[trial if (r, c) == (row, col) else board[r][c]
for c in range(9)]
for r in range(9)]
if valid_board(new_board):
result = solve_sudoku(new_board)
if result is not None:
return result
return None
return boardThis works, but it is a bit wasteful. When we find an empty square, we try every digit and only later ask valid_board whether that choice was obviously impossible. A person solving sudoku usually does the opposite: look at the row, column, and subgrid first, rule out impossible digits, and only then try the remaining ones.
To do that, it helps to write a few utility functions:
values_in_row(board, r)should return the nonzero values already present in rowr.values_in_column(board, c)should return the nonzero values already present in columnc.values_in_subgrid(board, sr, sc)should return the nonzero values already present in the3 times 3subgrid at subgrid rowsrand subgrid columnsc.
One possible implementation is:
def values_in_row(board, r):
return {value for value in board[r] if value != 0}
def values_in_column(board, c):
return {board[r][c] for r in range(9) if board[r][c] != 0}
def values_in_subgrid(board, sr, sc):
start_row = 3 * sr
start_col = 3 * sc
return {
board[r][c]
for r in range(start_row, start_row + 3)
for c in range(start_col, start_col + 3)
if board[r][c] != 0
}These helper functions let us narrow the search before we recurse. That is the same general idea as base cases in other recursive problems: use the structure of the problem to avoid unnecessary work as early as possible.
Improving Efficiency, Mutation
We can push this a bit further. So far, every recursive call has built a fresh board. That is safer, but it also means we keep copying nested lists. Another option is to reuse the same board and mutate it in place.
That looks like this:
def solve_sudoku(board):
"""
Given a sudoku board (as a list-of-lists of numbers, where 0
represents an empty square), return a solved version of the puzzle.
"""
for row in range(9):
for col in range(9):
if board[row][col] != 0:
continue
# Found a 0; try filling it
for trial in valid_moves(board, row, col):
board[row][col] = trial
result = solve_sudoku(board)
if result is not None:
return result
board[row][col] = 0
return None
return boardThis is very similar to the version above, but there is one critical new line:
board[row][col] = 0That line undoes the mutation if a recursive path fails. Without it, later recursive calls would keep seeing the failed trial value and would act as if that number had always belonged in that square. Once we start mutating shared state, we also need to explicitly un-mutate it when we backtrack.
This is faster, but it is also riskier. Reusing the same board avoids repeated copying, but mutation makes aliasing bugs much easier to introduce and much harder to debug. In many cases, the simpler immutable version is still the better tradeoff.
For these examples, though, mutating the board does give another noticeable speedup:
| board | original code | only valid moves | mutate instead of copy |
|---|---|---|---|
| grid1 | 0.125 | 0.014 | 0.004 |
| grid2 | 1.297 | 0.123 | 0.044 |
| grid3 | 1.545 | 0.149 | 0.050 |