Comments in code exist in a paradox. Too many comments clutter the code, repeat what's already evident, and become lies when the code changes but comments don't. Too few comments leave readers bewildered by clever tricks, subtle edge case handling, and mathematical derivations that took you hours to figure out.
The real question isn't whether to comment, but what to comment.
Algorithmic code is especially susceptible to this paradox. A single line implementing a bit manipulation trick might embody thirty minutes of whiteboard derivation. A boundary condition check might prevent an off-by-one error that took three debugging sessions to discover. Without comments, these insights are lost. With the wrong comments, you're just adding noise.
This page teaches you to comment strategically—adding value where clarity is needed while trusting your code (and naming) to speak for itself elsewhere.
By the end of this page, you will understand the distinction between what and why comments, master the types of algorithmic logic that demand documentation, and develop the judgment to add comments that future readers (including yourself) will thank you for.
The most important principle in commenting is this: Code tells you what is happening; comments should tell you why.
Comments that describe what the code does are usually redundant—that information is already in the code. Comments that explain why the code does something provide information that cannot be derived from the code alone.
What comments (usually bad):
```python
# Bad: Repeats what code clearly says
count = 0  # Initialize count to zero
count += 1  # Increment count by one
return result  # Return the result

# Bad: Describes the obvious
for item in items:  # Loop through items
    if item > threshold:  # If item is greater than threshold
        selected.append(item)  # Add item to selected list
```

Why comments (valuable):
```python
# Good: Explains WHY this specific value
count = 0  # Start from scratch; previous counts are invalidated by buffer flush

# Good: Explains the reasoning
count += 1  # Account for the implicit root node not in the tree

# Good: Explains the non-obvious purpose
for item in items:  # Filter early to avoid O(n²) blowup in downstream processing
    if item > threshold:
        selected.append(item)
```

Before writing a comment, ask: 'Does this comment contain information not present in the code?' If you could delete the comment and still understand everything it said by reading the code, the comment is noise. If the comment reveals intention, history, or rationale not in the code, it's valuable.
The exception: Comments that serve as section headers or navigation aids can describe what while still adding value by structuring the reading experience.
```python
def complex_algorithm(data):
    # ===== PHASE 1: PREPROCESSING =====
    # ... preprocessing code ...

    # ===== PHASE 2: CORE COMPUTATION =====
    # ... main algorithm ...

    # ===== PHASE 3: RESULT EXTRACTION =====
    # ... extract results ...
```

Certain patterns in algorithmic code are inherently non-obvious. Learning to recognize these patterns helps you know when comments are essential.
Type 1: Non-obvious boundary conditions
Boundary conditions in algorithms often seem arbitrary without explanation:
```python
# Without comment: Why n-1? Why not n?
for i in range(n - 1):
    relax_all_edges(graph)

# With comment: Purpose is clear
# Bellman-Ford: n-1 iterations are sufficient because a shortest path
# has at most n-1 edges
for i in range(n - 1):
    relax_all_edges(graph)
```
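To see that comment doing its job in context, here is a minimal runnable Bellman-Ford sketch (the function name, edge-list format, and sample graph below are mine, not from the page):

```python
def bellman_ford(n: int, edges: list[tuple[int, int, int]], source: int) -> list[float]:
    """Single-source shortest paths; edges are (u, v, weight) triples."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    # n-1 rounds suffice: a shortest path visits each node at most once,
    # so it contains at most n-1 edges
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```

For example, `bellman_ford(3, [(0, 1, 4), (0, 2, 1), (2, 1, 2)], 0)` returns `[0, 3, 1]`: the path 0→2→1 beats the direct edge 0→1.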
```python
# Without comment: Mysterious +1
table_size = 2 * n + 1

# With comment: Reasoning documented
# Segment tree needs 2*n-1 nodes, but we round up to the next power of 2
# Adding 1 accounts for 1-based indexing
table_size = 2 * n + 1
```

Type 2: Clever bit manipulation tricks
Bit manipulation often relies on properties that aren't self-evident:
```python
# Without comment: Magic incantation
n = n & (n - 1)

# With comment: Trick explained
# Clear the rightmost set bit: subtracting 1 flips all bits up to and
# including the rightmost 1; ANDing removes exactly that bit
n = n & (n - 1)
```
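As a small self-contained illustration, the same trick drives Kernighan's bit-counting loop (the function name here is mine):

```python
def count_set_bits(n: int) -> int:
    """Count the 1-bits in n using Kernighan's method."""
    count = 0
    while n:
        # n & (n - 1) clears exactly the lowest set bit, so this loop
        # runs once per set bit rather than once per bit position
        n &= n - 1
        count += 1
    return count
```

For example, `count_set_bits(0b10110)` returns 3.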
```python
# Without comment: What is this checking?
if n & (n - 1) == 0 and n != 0:
    return True

# With comment: Property documented
# A power of 2 has exactly one set bit; (n-1) has all bits below it set
# ANDing gives 0 only for powers of 2. Special-case 0 to avoid a false positive.
if n & (n - 1) == 0 and n != 0:
    return True
```

Type 3: Mathematical derivations or formulas
When code implements a mathematical formula, the derivation is often non-trivial:
```python
# Without comment: Where does this come from?
ways = math.comb(n + k - 1, k - 1)

# With comment: Math explained
# Stars and bars theorem: distributing n identical items into k distinct bins
# is equivalent to placing k-1 dividers among n+k-1 positions
# Number of ways = C(n + k - 1, k - 1)
ways = math.comb(n + k - 1, k - 1)
```
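One way to keep such a comment honest is a brute-force cross-check in a test. A sketch of how that might look (function names are mine):

```python
import math
from itertools import product

def stars_and_bars(n: int, k: int) -> int:
    # C(n + k - 1, k - 1): place k-1 dividers among n + k - 1 positions
    return math.comb(n + k - 1, k - 1)

def brute_force_count(n: int, k: int) -> int:
    # Enumerate every k-tuple of non-negative counts summing to n
    # (feasible for small inputs only; verification use)
    return sum(1 for bins in product(range(n + 1), repeat=k) if sum(bins) == n)
```

Both agree that there are 21 ways to distribute 5 items into 3 bins.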
```python
# Without comment: Why this form?
mid = left + (right - left) // 2

# With comment: Subtle but important
# Using (left + right) // 2 risks integer overflow for large indices:
# (left + right) could exceed max int. This formulation avoids it.
mid = left + (right - left) // 2
```

Type 4: Performance optimizations
Code that sacrifices readability for performance needs justification:
```python
# Without comment: Why this strange structure?
seen = [False] * (max_val + 1)
for num in nums:
    seen[num] = True

# With comment: Performance rationale
# Use array instead of set for O(1) lookup with better cache locality
# Benchmarked 3x faster for n > 10000; acceptable since max_val bounded by 10^6
seen = [False] * (max_val + 1)
for num in nums:
    seen[num] = True
```

If you spent significant time deriving, proving, or debugging something, that effort represents information. Capture it in a comment. Future maintainers (and future you) will benefit from knowing why something is the way it is.
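When a comment cites a benchmark, it helps to keep the measurement reproducible. A rough sketch of how such a comparison might be run (names and sizes are mine; absolute numbers vary by machine):

```python
import timeit

NUMS = list(range(50_000))
MAX_VAL = 50_000

def mark_with_set() -> bool:
    seen = set()
    for num in NUMS:
        seen.add(num)
    return NUMS[-1] in seen

def mark_with_array() -> bool:
    seen = [False] * (MAX_VAL + 1)
    for num in NUMS:
        seen[num] = True
    return seen[NUMS[-1]]

# Compare, e.g.:
# timeit.timeit(mark_with_set, number=50)
# timeit.timeit(mark_with_array, number=50)
```

Recording the benchmark setup alongside the claim lets a future maintainer rerun it before "simplifying" the optimization away.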
Algorithmic correctness often depends on invariants—properties that remain true throughout execution—and preconditions—assumptions that must hold when code is entered. Documenting these is critical for understanding and maintaining algorithms.
Loop invariants:
A loop invariant is a condition that is true before each iteration of a loop. Stating it helps readers understand why the loop is correct.
```python
def binary_search(arr: list[int], target: int) -> int:
    """Find index of target in sorted array, or -1 if not found."""
    left, right = 0, len(arr) - 1

    # INVARIANT: If target exists in arr, it is in arr[left:right+1]
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1  # INVARIANT maintained: target not in arr[:mid+1]
        else:
            right = mid - 1  # INVARIANT maintained: target not in arr[mid:]

    # INVARIANT implies: target not in arr (since left > right means empty range)
    return -1
```

Preconditions and postconditions:
Documenting what must be true when entering and what is guaranteed when exiting prevents misuse and simplifies debugging:
```python
def partition(arr: list[int], low: int, high: int) -> int:
    """
    Partition arr[low:high+1] around a pivot element.

    PRECONDITIONS:
    - 0 <= low <= high < len(arr)
    - arr has at least one element in range [low, high]

    POSTCONDITIONS:
    - Returns pivot_index where low <= pivot_index <= high
    - All elements arr[low:pivot_index] <= arr[pivot_index]
    - All elements arr[pivot_index+1:high+1] > arr[pivot_index]
    - Elements are permuted, not added or removed
    """
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1
```

DP state definitions:
In dynamic programming, the meaning of the DP table entries is crucial and often non-obvious. Always document it:
```python
def knapsack_01(weights: list[int], values: list[int], capacity: int) -> int:
    """
    Solve 0/1 knapsack: maximize value with given weight capacity.

    DP STATE DEFINITION:
    dp[i][w] = maximum value achievable using items 0..i-1
               with knapsack capacity w (total weight at most w)

    BASE CASE:
    dp[0][w] = 0 for all w (no items means no value)

    TRANSITION:
    dp[i][w] = max(
        dp[i-1][w],                              # Don't take item i-1
        dp[i-1][w - weights[i-1]] + values[i-1]  # Take item i-1 (if it fits)
    )

    ANSWER: dp[n][capacity]
    """
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for w in range(capacity + 1):
            # Don't take item i-1
            dp[i][w] = dp[i - 1][w]

            # Take item i-1 if it fits
            item_weight = weights[i - 1]
            if item_weight <= w:
                take_value = dp[i - 1][w - item_weight] + values[i - 1]
                dp[i][w] = max(dp[i][w], take_value)

    return dp[n][capacity]
```

When documenting invariants, think of it as an informal correctness proof: 'If this invariant holds and the loop terminates, then the result is correct.' This structured thinking catches bugs during development and documents reasoning for future readers.
Some of the most valuable comments are those that explain why code handles specific edge cases or was changed to fix a bug. Without these comments, future developers (including yourself) may 'clean up' the code by removing apparently unnecessary complexity—and reintroducing the bug.
Edge case documentation:
```python
def find_median_sorted_arrays(nums1: list[int], nums2: list[int]) -> float:
    # Ensure nums1 is the smaller array to minimize binary search range
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1

    m, n = len(nums1), len(nums2)

    # EDGE CASE: Both arrays empty - no median exists
    if m + n == 0:
        raise ValueError("Cannot find median of empty arrays")

    # EDGE CASE: One array empty - median is from the other array
    # Must handle before binary search, which assumes both non-empty
    if m == 0:
        mid = n // 2
        if n % 2 == 1:
            return float(nums2[mid])
        return (nums2[mid - 1] + nums2[mid]) / 2.0

    # ... main algorithm for non-empty arrays
```

Bug fix documentation:
When you fix a bug, document what went wrong so the 'fix' isn't removed later:
```python
def longest_palindromic_substring(s: str) -> str:
    if len(s) < 2:
        return s

    # BUG FIX: Previously used range(len(s) - 1), which missed single-character
    # palindromes at the end. Example: "abc" would miss 'c' as a potential center.
    for i in range(len(s)):
        # Expand around center
        pass


def count_inversions(arr: list[int]) -> int:
    # BUG FIX: Must copy array before sort, otherwise modifies input
    # Original code sorted arr in-place, causing side effects for callers
    working_copy = arr.copy()
    return merge_sort_and_count(working_copy, 0, len(working_copy))
```

Reference to issue trackers or discussions:
For complex bugs, link to external resources:
```python
def parse_integer(s: str) -> int:
    # Handle integer overflow for 32-bit systems
    # See: https://github.com/company/repo/issues/1234
    # TL;DR: Python ints are arbitrary precision, but we need to match
    # 32-bit C behavior for compatibility
    MAX_INT = 2**31 - 1
    MIN_INT = -(2**31)
    result = int(s)
    return max(MIN_INT, min(MAX_INT, result))
```

Before removing 'unnecessary' code, understand why it was added. A comment explaining 'this handles the edge case where X' is a fence: it warns future developers that removing the code will break X. Without the comment, the fence seems pointless and gets removed.
Algorithms don't emerge from a vacuum. They come from papers, textbooks, blog posts, and discussions. Documenting sources provides valuable context and enables deeper understanding.
Citing algorithm sources:
```python
def compute_lps_array(pattern: str) -> list[int]:
    """
    Compute the Longest Proper Prefix Suffix (LPS) array for the KMP algorithm.

    Reference: Knuth, Morris, and Pratt (1977),
    "Fast Pattern Matching in Strings", SIAM J. Comput.

    The LPS array enables O(n+m) pattern matching by avoiding
    redundant comparisons when a mismatch occurs.
    """
    m = len(pattern)
    lps = [0] * m
    length = 0  # Length of previous longest proper prefix suffix
    i = 1

    while i < m:
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length > 0:
            # Key insight from KMP: fall back to the shorter prefix
            length = lps[length - 1]
        else:
            lps[i] = 0
            i += 1

    return lps
```

Linking to visualizations and tutorials:
```python
def union_find_with_path_compression(n: int):
    """
    Union-Find (Disjoint Set Union) with path compression and union by rank.

    Visualization: https://visualgo.net/en/ufds
    Detailed explanation:
    https://cp-algorithms.com/data_structures/disjoint_set_union.html

    Time complexity:
    - Nearly O(1) amortized per operation (inverse Ackermann function)
    - Proof: Tarjan, R. E. (1975), "Efficiency of a Good But Not
      Linear Set Union Algorithm"
    """
    parent = list(range(n))
    rank = [0] * n

    def find(x: int) -> int:
        # Path compression: point directly to root
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(x: int, y: int) -> bool:
        root_x, root_y = find(x), find(y)
        if root_x == root_y:
            return False
        # Union by rank: attach smaller tree under larger
        if rank[root_x] < rank[root_y]:
            parent[root_x] = root_y
        elif rank[root_x] > rank[root_y]:
            parent[root_y] = root_x
        else:
            parent[root_y] = root_x
            rank[root_x] += 1
        return True

    return find, union
```

Noting deviations from canonical implementations:
```python
def dijkstra_modified(graph, source, target):
    """
    Modified Dijkstra's algorithm with early termination.

    Standard Dijkstra computes shortest paths to ALL nodes.
    This version terminates when the target is reached, which is
    sufficient for single-target queries and potentially faster.

    NOTE: This optimization is only valid because all edge weights
    are non-negative. With negative weights, early termination
    could miss shorter paths discovered later.

    See: Cormen et al., "Introduction to Algorithms", Chapter 24
    """
    pass
```

References enable future readers to understand the algorithm deeply without reverse-engineering your code. If they need to modify the algorithm, they can consult the original source. If they suspect a bug, they can compare against the canonical version.
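The docstring is the point of the example above, but for concreteness, here is one way the early-exit variant might look as runnable code (the function name, graph representation, and sample graph are mine):

```python
import heapq

def dijkstra_early_exit(graph: dict[int, list[tuple[int, int]]],
                        source: int, target: int) -> float:
    """graph maps node -> list of (neighbor, weight); weights must be non-negative."""
    dist = {source: 0}
    pq = [(0, source)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == target:
            # Early termination: valid only with non-negative weights,
            # since no later pop can improve on d
            return d
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")
```

For example, `dijkstra_early_exit({0: [(1, 1), (2, 4)], 1: [(2, 1)]}, 0, 2)` returns 2: the two-hop path 0→1→2 beats the direct edge.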
Complexity analysis is essential for algorithmic code. Documenting complexity helps users understand performance characteristics and helps maintainers verify that changes don't degrade efficiency.
Standard complexity documentation:
```python
def merge_sort(arr: list[int]) -> list[int]:
    """
    Sort array using merge sort algorithm.

    Time Complexity: O(n log n)
    - log n levels of recursion (halving each time)
    - O(n) work at each level (merging)

    Space Complexity: O(n)
    - O(n) for auxiliary arrays during merge
    - O(log n) for recursion stack (dominated by O(n))

    Stability: Stable (equal elements maintain relative order)
    """
    pass
```

Explaining non-obvious complexities:
When complexity isn't immediately apparent from the code, explain the analysis:
```python
import bisect

def longest_increasing_subsequence(nums: list[int]) -> int:
    """
    Find length of longest strictly increasing subsequence.

    Time Complexity: O(n log n)
    - O(n) to iterate through nums
    - O(log n) binary search per element

    NOTE: Despite having a nested binary search inside a loop, the total
    work is O(n log n), not O(n * something worse). Each binary search
    is O(log k) where k <= n, and we do n such searches.

    Space Complexity: O(n) - tails array stores at most n elements
    """
    tails = []
    for num in nums:
        pos = bisect.bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)
        else:
            tails[pos] = num
    return len(tails)
```
```python
def two_pointer_dedup(arr: list[int]) -> int:
    """
    Remove duplicates from sorted array in-place; return the new length.

    Time Complexity: O(n) — single pass through array
    Space Complexity: O(1) — no extra space beyond input

    COMPLEXITY NOTE: Although we mutate the array, we do so in-place
    using two pointers. We use the convention that the input itself
    doesn't count as extra space, so this is O(1), not O(n).
    """
    if not arr:
        return 0
    write = 1  # Next position to write a unique element
    for read in range(1, len(arr)):
        if arr[read] != arr[write - 1]:
            arr[write] = arr[read]
            write += 1
    return write
```

Amortized complexity explanation:
```python
class DynamicArray:
    """
    Array with automatic resizing.

    Append Operation:
    - Worst case: O(n) when resizing required
    - Amortized: O(1) per operation

    AMORTIZATION EXPLANATION:
    Doubling strategy means resize happens after 1, 2, 4, 8, ... operations.
    Total resize cost for n insertions: 1 + 2 + 4 + ... + n/2 + n ≈ 2n = O(n)
    Spread over n operations: O(n)/n = O(1) amortized.

    IMPORTANT: If we resized by adding a constant instead of doubling,
    amortized cost would be O(n), not O(1).
    """
    pass
```

Document best, average, and worst case if they differ significantly. 'O(n) average, O(n²) worst case (when all elements hash to the same bucket)' is more informative than just 'O(n²)'.
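The doubling analysis above can be checked with a tiny runnable sketch (the class and attribute names here are mine):

```python
class GrowableArray:
    """Minimal doubling array that counts its resizes."""

    def __init__(self) -> None:
        self._buf: list = [None]  # Start with capacity 1
        self._size = 0
        self.resizes = 0

    def append(self, value) -> None:
        if self._size == len(self._buf):
            # Double the capacity: cost is O(current size), but it happens
            # only when size reaches a power of 2
            self._buf.extend([None] * len(self._buf))
            self.resizes += 1
        self._buf[self._size] = value
        self._size += 1
```

After 1000 appends, resizes occur at sizes 1, 2, 4, ..., 512: just 10 resizes total, matching the O(1) amortized claim.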
Comments can become lies. When code changes but comments don't, comments become actively harmful—they mislead readers into believing things that are no longer true.
The trust erosion problem:
When developers encounter outdated comments, they lose trust in all comments. This is worse than having no comments at all, because now they can't trust any documentation.
Strategies for comment maintenance:
```python
# DANGEROUS: Comment describes implementation that may change
# We use quicksort with median-of-three pivot selection
sorted_data = optimized_sort(data)

# SAFER: Comment describes behavior, not implementation
# Sort is stable and handles edge cases (empty, single element)
sorted_data = optimized_sort(data)
```

Signs of stale comments:
A misleading comment is worse than no comment. Someone debugging will trust the comment, form incorrect hypotheses, and waste hours. The brief effort of updating or deleting comments pays enormous dividends in preventing such waste.
How you write comments matters as much as what you write. Consistent formatting improves scannability and professionalism.
Inline comments:
```python
# Good: Brief, explains non-obvious aspect
mid = left + (right - left) // 2  # Avoids overflow vs (left+right)//2

# Bad: Too long for inline
mid = left + (right - left) // 2  # This formula is used instead of the more intuitive (left + right) // 2 because adding left and right together could cause integer overflow in languages like C++ or Java when both values are close to the maximum integer value

# Good: Long explanation above, not inline
# Overflow prevention: (left + right) could exceed max int for large indices.
# This formulation is mathematically equivalent but safe.
mid = left + (right - left) // 2
```

Block comments:
```python
# Good: Structured block comment
# ============================================
# PHASE 2: BUILD THE SUFFIX ARRAY
# ============================================
# Suffix arrays enable O(m log n) pattern matching after O(n log n)
# preprocessing. We use the DC3 algorithm (Kärkkäinen & Sanders, 2003)
# for O(n) construction.
#
# Key insight: Recursively sort suffixes at positions divisible by 3,
# then use those to deduce order of remaining suffixes.
# ============================================

# Bad: Wall of text
# Now we're going to build a suffix array which is an array of integers giving the starting positions of suffixes of a string in lexicographical order and we use the DC3 algorithm which was published by Kärkkäinen and Sanders in 2003 and achieves O(n) time complexity by recursively sorting suffixes and the key insight is that we sort positions divisible by 3 first
```

Docstrings:
```python
def find_kth_largest(nums: list[int], k: int) -> int:
    """
    Find the k-th largest element in an unsorted array.

    Args:
        nums: List of integers (may contain duplicates)
        k: Position from largest (1-indexed, so k=1 means largest)

    Returns:
        The k-th largest element

    Raises:
        ValueError: If k < 1 or k > len(nums)

    Example:
        >>> find_kth_largest([3, 2, 1, 5, 6, 4], 2)
        5

    Time: O(n) average using Quickselect, O(n²) worst case
    Space: O(1) excluding recursion stack

    Note: For guaranteed O(n log n), use a heap-based approach instead.
    """
    pass
```

Follow your team's or language's conventions. Python has PEP 257 for docstrings. JavaScript has JSDoc. Consistency within a codebase trumps any particular style. If the codebase uses a format, match it.
We've explored the art of commenting non-obvious logic. The key principles:

- Comment the why, not the what: the code already says what it does.
- Document the patterns that demand it: boundary conditions, bit tricks, mathematical derivations, and performance optimizations.
- State invariants, preconditions, postconditions, and DP state definitions as informal correctness proofs.
- Record edge cases and bug fixes so the 'fix' isn't cleaned away later.
- Cite sources and note deviations from canonical implementations.
- Document complexity, including amortized and non-obvious cases.
- Maintain comments: an outdated comment is worse than none.
- Format consistently, following your codebase's conventions.
What's next:
With readable structure, meaningful names, well-extracted helpers, and strategic comments, your code is nearly production-ready. The final page addresses the culmination: code review readiness—preparing your algorithmic code to withstand scrutiny and pass review efficiently.
You now understand when and how to comment algorithmic code effectively. Strategic commenting illuminates the non-obvious without cluttering the obvious. Next, we'll prepare code for the ultimate test: peer review.