LeetCode HARD 37 Sudoku Solver Summary

Generated by Codex with GPT-5

Quick facts

Difficulty: HARD
Problem: Sudoku Solver
Main tags: Array, Hash Table, Backtracking, Matrix

What the problem is really asking

This problem is a constraint satisfaction problem disguised as a puzzle.

The board is almost complete already. The task is to fill each . with a digit so that three rules are always true:

each row contains 1 through 9 exactly once
each column contains 1 through 9 exactly once
each 3 x 3 box contains 1 through 9 exactly once

So the core challenge is not “try numbers until something works.” The real challenge is to keep track of which digits are still legal at each empty cell and then search in a way that cuts off bad branches quickly.

Why naive brute force wastes work

A direct brute-force solution would look at the first empty cell, try digits 1 to 9, scan the row, column, and box each time to see if the placement is legal, then recurse.

That is correct, but it repeats the same checks again and again. A digit that is obviously illegal in a row gets rediscovered as illegal every time that cell is revisited. Once several empty cells are involved, the search tree becomes enormous.

That repeated work suggests two improvements:

Keep row, column, and box constraints in dedicated data structures so legality checks are constant time.
Do not pick the next empty cell arbitrarily. Pick the one with the fewest legal digits, because that is where a contradiction will show up fastest.

The practical optimal approach

The best practical interview solution is backtracking with constraint tracking.

Maintain three collections:

row_digits[row]: digits already used in that row
column_digits[col]: digits already used in that column
box_digits[box]: digits already used in that 3 x 3 box

Then repeatedly do this:

Find the empty cell with the smallest candidate set.
Try each legal digit for that cell.
Update the row, column, and box state immediately.
Recurse.
If the branch fails, undo the move and try the next digit.

That “most constrained cell first” rule is the key optimization. If a cell has zero candidates, the branch is impossible right away. If a cell has one candidate, the search is almost forced. Both cases reduce the branching factor dramatically.

How to derive the solution in an interview

A clean interview derivation usually goes like this:

Start with plain backtracking, because the board has to be completed one empty cell at a time.
Notice that the expensive part is repeatedly checking whether a digit is legal.
Replace repeated scans with row, column, and box state so “can I place this digit here?” becomes a quick lookup.
Notice that choosing the next empty cell arbitrarily is wasteful, because some cells are much more constrained than others.
Add the heuristic “solve the hardest empty cell first,” which usually prunes the tree far earlier.

That gives a solution that is still exponential in the worst case, but is fast enough in practice for standard Sudoku inputs and is the expected approach for this problem.

Why the algorithm is correct

The algorithm maintains an important invariant: every recursive call starts from a board state that already satisfies all Sudoku constraints for every filled cell.

When the solver places a digit, it only chooses from the legal candidates for that specific row, column, and box. So the invariant stays true after each placement. If a branch reaches a state where some empty cell has no legal candidates, that branch cannot lead to a valid solution and can be abandoned safely.

Because the search tries every legal candidate whenever there is a genuine choice, it explores all valid completions that extend the current partial board. Therefore, if a solution exists, the search will eventually find it. If the recursive call returns success, the board is fully filled and still satisfies every constraint, so it is a correct Sudoku solution.

Best approach

For LeetCode, the strongest general answer is:

Backtracking
Constraint tracking for rows, columns, and boxes
A “minimum remaining values” choice for the next empty cell

Complexity is still exponential in the number of empty cells, often written as O(9^E) in the worst case for E empty cells. The important point is that the bookkeeping and cell-ordering heuristic prune the search so aggressively that the solver is practical. Extra space is O(E) for recursion and empty-cell bookkeeping, plus the fixed row, column, and box sets.

Python solution

The implementation below solves the board in place. It keeps explicit row, column, and box state, and it chooses the next empty cell with the fewest candidates so contradictions are detected early.

from typing import List, Optional, Set, Tuple


Cell = Tuple[int, int]


class SudokuBacktrackingSolver:
    """Solve a standard 9x9 Sudoku board in place."""

    ALL_DIGITS = tuple("123456789")

    def __init__(self, board: List[List[str]]) -> None:
        self.board = board
        self.row_digits: List[Set[str]] = [set() for _ in range(9)]
        self.column_digits: List[Set[str]] = [set() for _ in range(9)]
        self.box_digits: List[Set[str]] = [set() for _ in range(9)]
        self.empty_cells: List[Cell] = []
        self._index_starting_board()

    def _box_index(self, row: int, col: int) -> int:
        return (row // 3) * 3 + (col // 3)

    def _index_starting_board(self) -> None:
        """Load the initial constraints and remember every empty cell."""
        for row in range(9):
            for col in range(9):
                value = self.board[row][col]
                if value == ".":
                    self.empty_cells.append((row, col))
                    continue

                self._place_digit(row, col, value)

    def _place_digit(self, row: int, col: int, digit: str) -> None:
        """Place one digit and update all constraint trackers."""
        box_index = self._box_index(row, col)

        if (
            digit in self.row_digits[row]
            or digit in self.column_digits[col]
            or digit in self.box_digits[box_index]
        ):
            raise ValueError("Board contains conflicting digits.")

        self.board[row][col] = digit
        self.row_digits[row].add(digit)
        self.column_digits[col].add(digit)
        self.box_digits[box_index].add(digit)

    def _remove_digit(self, row: int, col: int, digit: str) -> None:
        """Undo one placement during backtracking."""
        box_index = self._box_index(row, col)
        self.board[row][col] = "."
        self.row_digits[row].remove(digit)
        self.column_digits[col].remove(digit)
        self.box_digits[box_index].remove(digit)

    def _candidates_for(self, row: int, col: int) -> List[str]:
        """Return every digit that is still legal for this cell."""
        used_digits = (
            self.row_digits[row]
            | self.column_digits[col]
            | self.box_digits[self._box_index(row, col)]
        )
        return [digit for digit in self.ALL_DIGITS if digit not in used_digits]

    def _choose_next_cell(self) -> Optional[Tuple[int, int, List[str]]]:
        """
        Pick the empty cell with the fewest legal candidates.
        Returning an empty candidate list lets the caller fail fast.
        """
        best_choice: Optional[Tuple[int, int, List[str]]] = None

        for row, col in self.empty_cells:
            if self.board[row][col] != ".":
                continue

            candidates = self._candidates_for(row, col)
            if not candidates:
                return row, col, []

            if best_choice is None or len(candidates) < len(best_choice[2]):
                best_choice = (row, col, candidates)
                if len(candidates) == 1:
                    break

        return best_choice

    def solve(self) -> bool:
        """Return True once the board has been solved in place."""
        next_choice = self._choose_next_cell()
        if next_choice is None:
            return True

        row, col, candidates = next_choice
        for digit in candidates:
            self._place_digit(row, col, digit)
            if self.solve():
                return True
            self._remove_digit(row, col, digit)

        return False


class Solution:
    def solveSudoku(self, board: List[List[str]]) -> None:
        solver = SudokuBacktrackingSolver(board)
        if not solver.solve():
            raise ValueError("Sudoku board has no valid solution.")

Interview follow-ups

How would you make this faster if raw performance mattered more than readability?

A common upgrade is to replace Python set objects with bitmasks. Each row, column, and box can store a 9-bit integer where bit k says whether digit k + 1 is already used. Candidate computation becomes a few bitwise operations, which reduces constant factors substantially. The search strategy does not change: it is still backtracking with constraint tracking and usually the same “fewest candidates first” heuristic. The tradeoff is readability. Bitmask code is faster but harder to derive and explain under interview pressure, so the set-based version is often the better first answer unless the interviewer explicitly pushes for optimization.

What if the puzzle might have no solution or more than one solution?

To detect that, the solver should not stop at the first valid completion. Instead, it should continue the search and count how many full solutions exist, usually with an early exit once the count exceeds 1. If the count is 0, the puzzle is unsatisfiable. If the count is 1, the solution is unique. If the count is greater than 1, the puzzle is ambiguous. The backtracking framework stays the same, but the recursion returns a count or updates a shared counter instead of returning immediately on the first success. The tradeoff is extra search time, because uniqueness checking explores more of the tree than “find any one solution.”

How would you generalize this to an `n^2 x n^2` Sudoku board?

The same structure still works. Rows, columns, and sub-boxes still define the constraints, so the algorithm still tracks which symbols are already used and still fills the board with backtracking. The main changes are mechanical: the digit alphabet becomes a larger symbol set, the board size becomes n^2, and the box index formula uses n instead of 3. Complexity grows quickly as the board gets larger because both the branching factor and the depth of the search increase. So the approach generalizes cleanly, but performance pressure becomes stronger, which makes heuristics and compact constraint representations even more valuable.

Why is choosing the most constrained empty cell better than scanning left to right?

Because it exposes failure earlier. If the next empty cell has only one legal digit, the branch is nearly forced. If it has zero legal digits, the branch is impossible and can be abandoned immediately. A left-to-right search may spend a lot of time filling easy cells first and only discover the contradiction much later. The most-constrained-cell heuristic does not change the worst-case exponential nature of the problem, but in practice it reduces the search tree dramatically, which is exactly the kind of improvement interviewers want to see on a hard backtracking question.