Generated by Codex with GPT-5
Quick facts
- Difficulty:
MEDIUM - Problem: Valid Sudoku
- Main tags:
Array,Hash Table,Matrix
What the problem is really asking
This problem is not asking to solve the Sudoku board.
It is only asking whether the current partially filled board is still legal. That means every filled digit must satisfy three rules at the same time:
- it has not already appeared in the same row
- it has not already appeared in the same column
- it has not already appeared in the same
3 x 3box
Empty cells, written as ., do not matter.
So the real task is to scan the board once and answer one question for each digit:
“Have I already seen this digit in any place that would make it illegal here?”
The cleanest way to derive the solution
Start from the most direct interpretation of the rules.
For every filled cell, the algorithm needs to remember which digits have already been seen in:
- each row
- each column
- each
3 x 3box
That immediately suggests three tracking structures:
rows[row_index]columns[column_index]boxes[box_index]
Each one stores the digits that have appeared so far.
Then the scan becomes simple:
- skip
.cells - compute which row, column, and box the digit belongs to
- if the digit is already present in any of those trackers, the board is invalid
- otherwise record it in all three places and continue
That is already the optimal interview solution in asymptotic terms. The board is visited once, and each lookup or insert is constant time on average.
Best approaches
The most practical solution uses three arrays of sets. It is easy to explain, easy to
code correctly, and runs in O(n^2) time for an n x n board.
A slightly more optimized variant replaces sets with bitmasks. The idea is the same:
instead of storing "5" in a set, it turns on bit 5 inside an integer mask for the
row, column, and box. That saves some constant-factor overhead, but it is a little less
readable. In most interviews, the set-based solution is the better default unless the
interviewer explicitly asks for tighter constant factors.
Why the box index formula works
The only part that can feel non-obvious is mapping a cell to one of the nine 3 x 3
boxes.
The trick is to group rows and columns in chunks of three:
- rows
0-2belong to the top band - rows
3-5belong to the middle band - rows
6-8belong to the bottom band
The same idea applies to columns. So the box number is:
(row_index // 3) * 3 + (column_index // 3)
That converts a two-dimensional box position into one index from 0 to 8.
Python solution
from typing import List, Set
class Solution:
BOARD_SIZE = 9
BOX_SIZE = 3
EMPTY_CELL = "."
def isValidSudoku(self, board: List[List[str]]) -> bool:
rows_seen: List[Set[str]] = [set() for _ in range(self.BOARD_SIZE)]
columns_seen: List[Set[str]] = [set() for _ in range(self.BOARD_SIZE)]
boxes_seen: List[Set[str]] = [set() for _ in range(self.BOARD_SIZE)]
for row_index in range(self.BOARD_SIZE):
for column_index in range(self.BOARD_SIZE):
value = board[row_index][column_index]
if value == self.EMPTY_CELL:
continue
box_index = self._box_index(
row_index=row_index,
column_index=column_index,
)
if self._creates_duplicate(
value=value,
row_seen=rows_seen[row_index],
column_seen=columns_seen[column_index],
box_seen=boxes_seen[box_index],
):
return False
rows_seen[row_index].add(value)
columns_seen[column_index].add(value)
boxes_seen[box_index].add(value)
return True
def _box_index(self, row_index: int, column_index: int) -> int:
return (row_index // self.BOX_SIZE) * self.BOX_SIZE + (
column_index // self.BOX_SIZE
)
@staticmethod
def _creates_duplicate(
value: str,
row_seen: Set[str],
column_seen: Set[str],
box_seen: Set[str],
) -> bool:
return (
value in row_seen
or value in column_seen
or value in box_seen
)This implementation keeps the logic very direct: every digit is checked against the three Sudoku constraints exactly once, and the helper methods isolate the only two pieces of reusable logic.
For the fixed 9 x 9 LeetCode board, both time and space are effectively constant. If
the board is treated as a generalized n x n board, the algorithm is O(n^2) time and
O(n^2) space.
Interview follow-ups
Can this be done with less memory?
Yes. The standard follow-up is to replace the sets with integer bitmasks. Each row,
column, and box keeps one integer where bit 1 means digit 1 has been seen, bit 2
means digit 2 has been seen, and so on. For each cell, the algorithm computes the
digit’s bit and checks whether that bit is already set in the row, column, or box mask.
If it is, the board is invalid; otherwise the bit is turned on in all three masks.
Why it works: it is exactly the same bookkeeping as the set-based solution, just stored
more compactly. Membership testing becomes a bitwise AND, and insertion becomes a
bitwise OR.
Complexity tradeoff: the asymptotic complexity stays the same, but the constant factors improve. The downside is readability. For most interviews, sets are easier to explain and less error-prone, while bitmasks are a good optimization follow-up.
What if the board were updated many times and each update needed to be validated quickly?
In that version, it is better to keep persistent row, column, and box state instead of rebuilding it from scratch after every edit. An initialization pass can build those trackers once. Then when a cell changes, the algorithm removes the old digit from its row, column, and box counts, checks whether the new digit would conflict, and inserts it if the move is legal.
Why it works: the Sudoku constraints are local to one row, one column, and one box. When one cell changes, only those three structures need to be updated.
Complexity tradeoff: the one-time setup still costs O(n^2), but each later edit can be
validated in O(1) time. The tradeoff is more state to maintain carefully, especially
when removing digits or supporting undo operations.
What if the interviewer generalizes the puzzle to an n x n board instead of 9 x 9?
The same strategy still works as long as the board is divided into equal square boxes.
Instead of hardcoding 9 and 3, compute board_size = len(board) and
box_size = int(sqrt(board_size)). Then build row, column, and box trackers sized to
that board. The box index formula stays the same shape:
(row_index // box_size) * box_size + (column_index // box_size)
Why it works: the legality rule has not changed. A symbol is still invalid exactly when it repeats within its row, column, or box, so the same tracking idea scales naturally.
Complexity tradeoff: for a generalized board_size x board_size board, validation is
O(board_size^2) time. Space is also O(board_size^2) if trackers are stored
explicitly. The main extra care is validating that the box structure is actually square
and compatible with the board dimensions.
Takeaway
The shortest path to the answer is to stop thinking about Sudoku as a puzzle and start thinking about it as a duplicate-detection problem across three overlapping views of the same cell: row, column, and box.
Once that framing is clear, the solution becomes a straightforward single pass.