Loading content...
Every Sudoku placement must pass a gauntlet of constraints. Before we commit to placing a '7' in a cell, we must verify: Is there already a 7 in this row? In this column? In this 3×3 box? If any answer is 'yes', the placement is invalid. This seemingly simple check—repeated millions of times in a hard puzzle—is the guardian of correctness in our solver.
Validity checking is where the elegance of theory meets the demands of performance. A naïve implementation scans rows, columns, and boxes on every placement, performing 27 comparisons. A sophisticated implementation uses clever data structures to answer in O(1) time. The difference matters: in a solver exploring millions of nodes, shaving microseconds per check translates to seconds or minutes saved overall.
This page explores validity checking from first principles to optimized implementations. You'll understand: the three types of constraints that must be checked, multiple implementation strategies with different performance characteristics, data structures that enable O(1) validation, incremental checking during propagation, and common pitfalls that lead to incorrect solvers.
Sudoku's constraints are deceptively simple. Every complete valid Sudoku must satisfy three types of uniqueness constraints, all of which are instantiations of the same rule: no duplicates within a unit.
The three unit types:
This gives us 27 constraints total: 9 rows × 1 constraint + 9 columns × 1 constraint + 9 boxes × 1 constraint = 27.
When checking if a value can be placed at position (row, col), we must verify that the value doesn't already exist in:
123456789101112131415161718192021
0 1 2 3 4 5 6 7 8 ┌───────────┬───────────┬───────────┐ 0 │ . . . │ . . C │ . . . │ C = Column constraint 1 │ . . . │ . . C │ . . . │ 2 │ . . . │ . . C │ . . . │ ├───────────┼───────────┼───────────┤ 3 │ . . . │ B B X │ . . . │ B = Box constraint 4 │ R R R │ R R [*] │ R R R │ R = Row constraint 5 │ . . . │ B B X │ . . . │ [*] = Cell being checked ├───────────┼───────────┼───────────┤ X = Both Box and Column 6 │ . . . │ . . C │ . . . │ 7 │ . . . │ . . C │ . . . │ 8 │ . . . │ . . C │ . . . │ └───────────┴───────────┴───────────┘ Cell (4,5) is constrained by: - Row 4: 8 other cells (R) - Column 5: 8 other cells (C) - Box (3,3): 8 other cells (B), but 4 overlap with row/column Total unique constraining cells: 8 + 8 + 4 = 20 peersEvery cell has exactly 20 peer cells: 8 in its row, 8 in its column, and 4 additional cells in its box (the other 4 box cells are already counted in the row or column). When placing a value, we only need to verify the value doesn't appear among these 20 peers.
The most straightforward approach to validity checking examines the relevant cells directly from the board. While not optimal, this approach is important to understand as a baseline and for its clarity.
12345678910111213141516171819202122232425262728293031323334353637
def is_valid_placement_naive(board, row, col, value): """ Check if placing 'value' at position (row, col) is valid. Performs three separate scans: 1. Row scan: Check all cells in the same row 2. Column scan: Check all cells in the same column 3. Box scan: Check all cells in the same 3×3 box Time Complexity: O(9 + 9 + 9) = O(27) = O(1) per check Space Complexity: O(1) """ # Check row: does 'value' already exist in this row? for c in range(9): if c != col and board[row][c] == value: return False # Check column: does 'value' already exist in this column? for r in range(9): if r != row and board[r][col] == value: return False # Check 3×3 box: does 'value' already exist in this box? # First, find the top-left corner of the box box_row = (row // 3) * 3 # 0, 3, or 6 box_col = (col // 3) * 3 # 0, 3, or 6 for r in range(box_row, box_row + 3): for c in range(box_col, box_col + 3): if (r != row or c != col) and board[r][c] == value: return False return True # Passed all checks # Usage example:# if is_valid_placement_naive(board, 4, 5, 7):# board[4][5] = 7 # Safe to placeAnalysis of the naïve approach:
Time complexity: O(9 + 9 + 9) = O(27) per check. While this is technically O(1) for fixed-size Sudoku, the constant factor of 27 comparisons per placement matters when we make millions of placements.
Space complexity: O(1)—no additional data structures needed.
Pros:
Cons:
For educational purposes, prototyping, or very easy puzzles, the naïve approach is perfectly fine. Its clarity makes it excellent for understanding and debugging. However, for competitive or production solvers, we need better approaches.
The key insight for efficient validity checking is precomputation: maintain data structures that track which values are present in each row, column, and box. Checking validity then becomes a simple lookup rather than a scan.
The approach:
Maintain three sets of tracking structures:
row_used[r]: set of values already used in row rcol_used[c]: set of values already used in column cbox_used[b]: set of values already used in box bA placement is valid if and only if the value is not in any of the three relevant sets.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
class SudokuSolverWithSets: """ Sudoku solver using set-based validity checking. Achieves O(1) validity checks through auxiliary data structures. """ def __init__(self, puzzle): self.board = [row[:] for row in puzzle] # Tracking structures: which values are used in each unit self.row_used = [set() for _ in range(9)] self.col_used = [set() for _ in range(9)] self.box_used = [set() for _ in range(9)] # Initialize from initial puzzle state for r in range(9): for c in range(9): if self.board[r][c] != 0: value = self.board[r][c] self.row_used[r].add(value) self.col_used[c].add(value) self.box_used[self._box_index(r, c)].add(value) def _box_index(self, row, col): """Convert (row, col) to box index 0-8.""" return (row // 3) * 3 + (col // 3) def is_valid(self, row, col, value): """ O(1) validity check using precomputed sets. Simply check membership in three sets. """ box = self._box_index(row, col) return (value not in self.row_used[row] and value not in self.col_used[col] and value not in self.box_used[box]) def place(self, row, col, value): """Place a value and update tracking structures.""" self.board[row][col] = value box = self._box_index(row, col) self.row_used[row].add(value) self.col_used[col].add(value) self.box_used[box].add(value) def remove(self, row, col, value): """Remove a value (backtrack) and update tracking structures.""" self.board[row][col] = 0 box = self._box_index(row, col) self.row_used[row].discard(value) self.col_used[col].discard(value) self.box_used[box].discard(value)Analysis:
Time complexity:
is_valid(): O(1) — three set membership testsplace(): O(1) — three set insertionsremove(): O(1) — three set deletionsSpace complexity: O(27 × 9) = O(243) = O(1) for fixed Sudoku — 27 sets, each holding at most 9 values.
Correctness invariant:
The tracking structures must always accurately reflect the board state. This means:
place() must add the value to all three relevant setsremove() must remove the value from all three relevant setsA common bug is forgetting to update one of the three structures, leading to incorrect validity results.
The board and the tracking structures must always be synchronized. If you modify the board directly without updating the sets, validity checking will give wrong answers. Always use the place() and remove() methods, never board[r][c] = value directly.
For the fastest possible validity checking, we can replace sets with bitmasks. A 16-bit integer can represent the presence/absence of all 9 Sudoku digits (using bits 1-9, ignoring bit 0). This approach is:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980
class SudokuSolverBitset: """ Ultra-fast Sudoku solver using bitset-based validity checking. Each row/column/box is represented as a 16-bit bitmask. Bit i is set if digit i is present (we use bits 1-9). """ def __init__(self, puzzle): self.board = [row[:] for row in puzzle] # Bitmasks: bit i is set if digit i is used self.row_mask = [0] * 9 self.col_mask = [0] * 9 self.box_mask = [0] * 9 # Initialize from puzzle for r in range(9): for c in range(9): if self.board[r][c] != 0: self._set_bit(r, c, self.board[r][c]) def _box_index(self, row, col): return (row // 3) * 3 + (col // 3) def _set_bit(self, row, col, value): """Set the bit for 'value' in row, column, and box masks.""" bit = 1 << value # Bit position for this value self.row_mask[row] |= bit self.col_mask[col] |= bit self.box_mask[self._box_index(row, col)] |= bit def _clear_bit(self, row, col, value): """Clear the bit for 'value' in row, column, and box masks.""" bit = ~(1 << value) # Inverted bit mask self.row_mask[row] &= bit self.col_mask[col] &= bit self.box_mask[self._box_index(row, col)] &= bit def is_valid(self, row, col, value): """ O(1) validity check using bitwise operations. Check if the bit for 'value' is set in any of the three masks. If any bit is set, the value is already used (invalid). """ bit = 1 << value box = self._box_index(row, col) # Combine masks with OR, check if value's bit is set combined = self.row_mask[row] | self.col_mask[col] | self.box_mask[box] return (combined & bit) == 0 def get_valid_values(self, row, col): """ Return all valid values for a cell as a bitmask. Extremely efficient: single bitwise computation. """ box = self._box_index(row, col) # Used values in row, column, and box used = self.row_mask[row] | self.col_mask[col] | self.box_mask[box] # Valid values are those NOT used (complement, masked to bits 1-9) valid_mask = (~used) & 0b1111111110 # Bits 1-9 return valid_mask def count_valid_values(self, row, col): """Count valid values using population count (popcount).""" valid_mask = self.get_valid_values(row, col) return bin(valid_mask).count('1') def place(self, row, col, value): self.board[row][col] = value self._set_bit(row, col, value) def remove(self, row, col, value): self.board[row][col] = 0 self._clear_bit(row, col, value)| Method | Time per Check | Memory | Implementation Complexity |
|---|---|---|---|
| Naïve Scan | ~27 comparisons | 0 extra | Very simple |
| Set-Based | 3 hash lookups | ~243 set entries | Simple |
| Bitset-Based | 3 bitwise ops | 27 integers | Moderate |
| Bitset Combined | 1-2 bitwise ops | 27 integers | Moderate |
Modern CPUs have a native POPCNT instruction that counts set bits in a single cycle. In high-performance solvers, count_valid_values() becomes a single CPU instruction, making MRV computation extremely fast. Python's bin(x).count('1') is slower; use gmpy2.popcount() or similar for production code.
So far, we've discussed checking validity before placement. An alternative paradigm is incremental checking: maintain invariants that guarantee the board is always valid by only allowing valid placements. This subtly changes how we structure the solver.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697
class SudokuSolverDomains: """ Solver using domain-based incremental validity. Key insight: Instead of checking validity at placement time, we maintain domains such that every value in a domain is guaranteed valid. Validity is enforced through propagation. """ def __init__(self, puzzle): self.board = [row[:] for row in puzzle] # Each cell's domain: set of valid values # Initialized to {1..9}, then constrained by given values self.domains = [[set(range(1, 10)) for _ in range(9)] for _ in range(9)] # Initialize: propagate all given clues for r in range(9): for c in range(9): if puzzle[r][c] != 0: value = puzzle[r][c] self.domains[r][c] = {value} # Singleton domain self._propagate_constraint(r, c, value) def _propagate_constraint(self, row, col, value): """ Remove 'value' from domains of all peers. Called after placing 'value' at (row, col). """ peers = self._get_peers(row, col) for peer_r, peer_c in peers: self.domains[peer_r][peer_c].discard(value) def _get_peers(self, row, col): """Get all 20 peers of (row, col).""" peers = [] # Row peers for c in range(9): if c != col: peers.append((row, c)) # Column peers for r in range(9): if r != row: peers.append((r, col)) # Box peers (not already in row or column) box_r, box_c = (row // 3) * 3, (col // 3) * 3 for r in range(box_r, box_r + 3): for c in range(box_c, box_c + 3): if r != row and c != col: peers.append((r, c)) return peers def solve(self): # Find cell with smallest non-singleton domain best_cell = None min_size = 10 for r in range(9): for c in range(9): if self.board[r][c] == 0: size = len(self.domains[r][c]) if size == 0: return False # Domain empty = contradiction if size < min_size: min_size = size best_cell = (r, c) if best_cell is None: return True # All cells filled = solved row, col = best_cell # Try each value in domain - ALL are guaranteed valid! for value in list(self.domains[row][col]): # Save state saved_board = [r[:] for r in self.board] saved_domains = [[d.copy() for d in row] for row in self.domains] # Place and propagate self.board[row][col] = value self.domains[row][col] = {value} self._propagate_constraint(row, col, value) # Recurse - note: no validity check needed! if self.solve(): return True # Backtrack: restore state self.board = saved_board self.domains = saved_domains return FalseTrade-offs of incremental checking:
Pros:
Cons:
When to use which:
For simple solvers, explicit validity checks (especially bitset-based) are straightforward and fast. For sophisticated solvers with extensive constraint propagation, domain-based approaches are more natural because propagation maintains domains anyway—validity comes 'for free' as a byproduct.
Sometimes we need to validate an entire board—perhaps to check a proposed solution, verify initial puzzle validity, or validate user input. This requires checking all 27 constraints comprehensively.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091
def validate_sudoku_complete(board): """ Validate a complete Sudoku board. Returns (is_valid, error_message) tuple. Checks: 1. All cells contain values 1-9 (completeness) 2. Each row contains 1-9 exactly once 3. Each column contains 1-9 exactly once 4. Each 3×3 box contains 1-9 exactly once """ # Check cell values for r in range(9): for c in range(9): if board[r][c] not in range(1, 10): return False, f"Invalid value {board[r][c]} at ({r}, {c})" # Check rows for r in range(9): seen = set() for c in range(9): val = board[r][c] if val in seen: return False, f"Duplicate {val} in row {r}" seen.add(val) # Check columns for c in range(9): seen = set() for r in range(9): val = board[r][c] if val in seen: return False, f"Duplicate {val} in column {c}" seen.add(val) # Check boxes for box_r in range(3): for box_c in range(3): seen = set() for r in range(box_r * 3, box_r * 3 + 3): for c in range(box_c * 3, box_c * 3 + 3): val = board[r][c] if val in seen: return False, f"Duplicate {val} in box ({box_r}, {box_c})" seen.add(val) return True, "Valid complete Sudoku" def validate_sudoku_partial(board): """ Validate a partially filled Sudoku board. Empty cells (0) are allowed but filled cells must not violate constraints. """ # Check rows (ignoring zeros) for r in range(9): seen = set() for c in range(9): val = board[r][c] if val == 0: continue if val not in range(1, 10): return False, f"Invalid value {val} at ({r}, {c})" if val in seen: return False, f"Duplicate {val} in row {r}" seen.add(val) # Check columns (ignoring zeros) for c in range(9): seen = set() for r in range(9): val = board[r][c] if val == 0: continue if val in seen: return False, f"Duplicate {val} in column {c}" seen.add(val) # Check boxes (ignoring zeros) for box_r in range(3): for box_c in range(3): seen = set() for r in range(box_r * 3, box_r * 3 + 3): for c in range(box_c * 3, box_c * 3 + 3): val = board[r][c] if val == 0: continue if val in seen: return False, f"Duplicate {val} in box ({box_r}, {box_c})" seen.add(val) return True, "Valid partial Sudoku"Validation tells you if a configuration is valid—it doesn't tell you if it's solvable or has a unique solution. A valid partial board might have zero solutions (painted into a corner) or multiple solutions (underspecified). These are separate questions requiring different analysis.
Validity checking seems simple, but subtle bugs can cause incorrect solver behavior. Here are the most common pitfalls and how to avoid them:
1234567891011121314151617181920212223242526272829303132333435363738394041424344
def test_validity_checker(): """ Comprehensive tests for validity checking implementation. Run these before trusting your solver! """ # Empty board: everything should be valid empty = [[0] * 9 for _ in range(9)] for r in range(9): for c in range(9): for v in range(1, 10): assert is_valid(empty, r, c, v), f"Empty board: ({r},{c})={v} should be valid" # Single value: only same row/col/box should be affected board = [[0] * 9 for _ in range(9)] board[4][4] = 5 # Center cell # Same value in same row should be invalid assert not is_valid(board, 4, 0, 5), "Same row should be invalid" assert not is_valid(board, 4, 8, 5), "Same row should be invalid" # Same value in same column should be invalid assert not is_valid(board, 0, 4, 5), "Same column should be invalid" assert not is_valid(board, 8, 4, 5), "Same column should be invalid" # Same value in same box should be invalid assert not is_valid(board, 3, 3, 5), "Same box should be invalid" assert not is_valid(board, 5, 5, 5), "Same box should be invalid" # Different value should be valid (no conflict) assert is_valid(board, 4, 0, 7), "Different value should be valid" # Outside row/col/box with same value should be valid assert is_valid(board, 0, 0, 5), "Outside scope should be valid" assert is_valid(board, 8, 8, 5), "Outside scope should be valid" # Box boundary tests board2 = [[0] * 9 for _ in range(9)] board2[2][2] = 9 # Bottom-right of top-left box assert not is_valid(board2, 0, 0, 9), "Top-left of same box should be invalid" assert is_valid(board2, 3, 0, 9), "Different box (below) should be valid" assert is_valid(board2, 0, 3, 9), "Different box (right) should be valid" print("All validity tests passed!")A bug in validity checking can cause your solver to either accept invalid placements (producing wrong 'solutions') or reject valid placements (failing to find solutions that exist). Always run comprehensive tests before using your solver for anything important.
We've explored validity checking from basic concepts to highly optimized implementations. Let's consolidate the key insights:
With constraint satisfaction, cell-by-cell exploration, and validity checking mastered, we have a working Sudoku solver. The final page focuses on optimization strategies—techniques that transform a working solver into a lightning-fast one, capable of solving even the hardest puzzles in milliseconds.
The role of validity checking:
Think of validity checking as the immune system of your solver. It prevents invalid configurations from propagating, ensuring that every partial solution explored is potentially completable. Without robust validity checking, the search would waste time exploring impossible paths. With it, every step moves purposefully toward a solution.