Comments in code exist in a paradox. Too many comments clutter the code, repeat what's already evident, and become lies when the code changes but comments don't. Too few comments leave readers bewildered by clever tricks, subtle edge case handling, and mathematical derivations that took you hours to figure out.
The real question isn't whether to comment, but what to comment.
Algorithmic code is especially susceptible to this paradox. A single line implementing a bit manipulation trick might embody thirty minutes of whiteboard derivation. A boundary condition check might prevent an off-by-one error that took three debugging sessions to discover. Without comments, these insights are lost. With the wrong comments, you're just adding noise.
This page teaches you to comment strategically—adding value where clarity is needed while trusting your code (and naming) to speak for itself elsewhere.
By the end of this page, you will understand the distinction between what and why comments, master the types of algorithmic logic that demand documentation, and develop the judgment to add comments that future readers (including yourself) will thank you for.
The most important principle in commenting is this: Code tells you what is happening; comments should tell you why.
Comments that describe what the code does are usually redundant—that information is already in the code. Comments that explain why the code does something provide information that cannot be derived from the code alone.
What comments (usually bad):
```python
# Bad: Repeats what code clearly says
count = 0  # Initialize count to zero
count += 1  # Increment count by one
return result  # Return the result

# Bad: Describes the obvious
for item in items:  # Loop through items
    if item > threshold:  # If item is greater than threshold
        selected.append(item)  # Add item to selected list
```

Why comments (valuable):
```python
# Good: Explains WHY this specific value
count = 0  # Start from scratch; previous counts are invalidated by buffer flush

# Good: Explains the reasoning
count += 1  # Account for the implicit root node not in the tree

# Good: Explains the non-obvious purpose
for item in items:  # Filter early to avoid O(n²) blowup in downstream processing
    if item > threshold:
        selected.append(item)
```

Before writing a comment, ask: 'Does this comment contain information not present in the code?' If you could delete the comment and still understand everything it said by reading the code, the comment is noise. If the comment reveals intention, history, or rationale not in the code, it's valuable.
The exception: Comments that serve as section headers or navigation aids can describe what while still adding value by structuring the reading experience.
```python
def complex_algorithm(data):
    # ===== PHASE 1: PREPROCESSING =====
    # ... preprocessing code ...

    # ===== PHASE 2: CORE COMPUTATION =====
    # ... main algorithm ...

    # ===== PHASE 3: RESULT EXTRACTION =====
    # ... extract results ...
```

Certain patterns in algorithmic code are inherently non-obvious. Learning to recognize these patterns helps you know when comments are essential.
Type 1: Non-obvious boundary conditions
Boundary conditions in algorithms often seem arbitrary without explanation:
```python
# Without comment: Why n-1? Why not n?
for i in range(n - 1):
    relax_all_edges(graph)

# With comment: Purpose is clear
# Bellman-Ford: n-1 iterations are sufficient because a shortest path
# has at most n-1 edges
for i in range(n - 1):
    relax_all_edges(graph)
```
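To see that comment doing its job in context, here is a minimal runnable Bellman-Ford sketch (the function name, edge-list format, and sample graph below are mine, not from the page):

```python
def bellman_ford(n: int, edges: list[tuple[int, int, int]], source: int) -> list[float]:
    """Single-source shortest paths; edges are (u, v, weight) triples."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    # n-1 rounds suffice: a shortest path visits each node at most once,
    # so it contains at most n-1 edges
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
```

For example, `bellman_ford(3, [(0, 1, 4), (0, 2, 1), (2, 1, 2)], 0)` returns `[0, 3, 1]`: the path 0→2→1 beats the direct edge 0→1.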
```python
# Without comment: Mysterious +1
table_size = 2 * n + 1

# With comment: Reasoning documented
# Segment tree needs 2*n-1 nodes, but we round up to the next power of 2
# Adding 1 accounts for 1-based indexing
table_size = 2 * n + 1
```

Type 2: Clever bit manipulation tricks
Bit manipulation often relies on properties that aren't self-evident:
```python
# Without comment: Magic incantation
n = n & (n - 1)

# With comment: Trick explained
# Clear the rightmost set bit: subtracting 1 flips all bits up to and
# including the rightmost 1; ANDing removes exactly that bit
n = n & (n - 1)
```
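As a small self-contained illustration, the same trick drives Kernighan's bit-counting loop (the function name here is mine):

```python
def count_set_bits(n: int) -> int:
    """Count the 1-bits in n using Kernighan's method."""
    count = 0
    while n:
        # n & (n - 1) clears exactly the lowest set bit, so this loop
        # runs once per set bit rather than once per bit position
        n &= n - 1
        count += 1
    return count
```

For example, `count_set_bits(0b10110)` returns 3.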
```python
# Without comment: What is this checking?
if n & (n - 1) == 0 and n != 0:
    return True

# With comment: Property documented
# A power of 2 has exactly one set bit; (n-1) has all bits below it set
# ANDing gives 0 only for powers of 2. Special-case 0 to avoid a false positive.
if n & (n - 1) == 0 and n != 0:
    return True
```

Type 3: Mathematical derivations or formulas
When code implements a mathematical formula, the derivation is often non-trivial:
```python
# Without comment: Where does this come from?
ways = math.comb(n + k - 1, k - 1)

# With comment: Math explained
# Stars and bars theorem: distributing n identical items into k distinct bins
# is equivalent to placing k-1 dividers among n+k-1 positions
# Number of ways = C(n + k - 1, k - 1)
ways = math.comb(n + k - 1, k - 1)
```
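One way to keep such a comment honest is a brute-force cross-check in a test. A sketch of how that might look (function names are mine):

```python
import math
from itertools import product

def stars_and_bars(n: int, k: int) -> int:
    # C(n + k - 1, k - 1): place k-1 dividers among n + k - 1 positions
    return math.comb(n + k - 1, k - 1)

def brute_force_count(n: int, k: int) -> int:
    # Enumerate every k-tuple of non-negative counts summing to n
    # (feasible for small inputs only; verification use)
    return sum(1 for bins in product(range(n + 1), repeat=k) if sum(bins) == n)
```

Both agree that there are 21 ways to distribute 5 items into 3 bins.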
```python
# Without comment: Why this form?
mid = left + (right - left) // 2

# With comment: Subtle but important
# Using (left + right) // 2 risks integer overflow for large indices:
# (left + right) could exceed max int. This formulation avoids it.
mid = left + (right - left) // 2
```

Type 4: Performance optimizations
Code that sacrifices readability for performance needs justification:
```python
# Without comment: Why this strange structure?
seen = [False] * (max_val + 1)
for num in nums:
    seen[num] = True

# With comment: Performance rationale
# Use array instead of set for O(1) lookup with better cache locality
# Benchmarked 3x faster for n > 10000; acceptable since max_val bounded by 10^6
seen = [False] * (max_val + 1)
for num in nums:
    seen[num] = True
```

If you spent significant time deriving, proving, or debugging something, that effort represents information. Capture it in a comment. Future maintainers (and future you) will benefit from knowing why something is the way it is.
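When a comment cites a benchmark, it helps to keep the measurement reproducible. A rough sketch of how such a comparison might be run (names and sizes are mine; absolute numbers vary by machine):

```python
import timeit

NUMS = list(range(50_000))
MAX_VAL = 50_000

def mark_with_set() -> bool:
    seen = set()
    for num in NUMS:
        seen.add(num)
    return NUMS[-1] in seen

def mark_with_array() -> bool:
    seen = [False] * (MAX_VAL + 1)
    for num in NUMS:
        seen[num] = True
    return seen[NUMS[-1]]

# Compare, e.g.:
# timeit.timeit(mark_with_set, number=50)
# timeit.timeit(mark_with_array, number=50)
```

Recording the benchmark setup alongside the claim lets a future maintainer rerun it before "simplifying" the optimization away.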
Algorithmic correctness often depends on invariants—properties that remain true throughout execution—and preconditions—assumptions that must hold when code is entered. Documenting these is critical for understanding and maintaining algorithms.
Loop invariants:
A loop invariant is a condition that is true before each iteration of a loop. Stating it helps readers understand why the loop is correct.
```python
def binary_search(arr: list[int], target: int) -> int:
    """Find index of target in sorted array, or -1 if not found."""
    left, right = 0, len(arr) - 1

    # INVARIANT: If target exists in arr, it is in arr[left:right+1]
    while left <= right:
        mid = left + (right - left) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1  # INVARIANT maintained: target not in arr[:mid+1]
        else:
            right = mid - 1  # INVARIANT maintained: target not in arr[mid:]

    # INVARIANT implies: target not in arr (since left > right means empty range)
    return -1
```

Preconditions and postconditions:
Documenting what must be true when entering and what is guaranteed when exiting prevents misuse and simplifies debugging:
```python
def partition(arr: list[int], low: int, high: int) -> int:
    """
    Partition arr[low:high+1] around a pivot element.

    PRECONDITIONS:
    - 0 <= low <= high < len(arr)
    - arr has at least one element in range [low, high]

    POSTCONDITIONS:
    - Returns pivot_index where low <= pivot_index <= high
    - All elements arr[low:pivot_index] <= arr[pivot_index]
    - All elements arr[pivot_index+1:high+1] > arr[pivot_index]
    - Elements are permuted, not added or removed
    """
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1
```

DP state definitions:
In dynamic programming, the meaning of the DP table entries is crucial and often non-obvious. Always document it:
```python
def knapsack_01(weights: list[int], values: list[int], capacity: int) -> int:
    """
    Solve 0/1 knapsack: maximize value with given weight capacity.

    DP STATE DEFINITION:
    dp[i][w] = maximum value achievable using items 0..i-1
               with knapsack capacity w (total weight at most w)

    BASE CASE:
    dp[0][w] = 0 for all w (no items means no value)

    TRANSITION:
    dp[i][w] = max(
        dp[i-1][w],                              # Don't take item i-1
        dp[i-1][w - weights[i-1]] + values[i-1]  # Take item i-1 (if it fits)
    )

    ANSWER: dp[n][capacity]
    """
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for w in range(capacity + 1):
            # Don't take item i-1
            dp[i][w] = dp[i - 1][w]

            # Take item i-1 if it fits
            item_weight = weights[i - 1]
            if item_weight <= w:
                take_value = dp[i - 1][w - item_weight] + values[i - 1]
                dp[i][w] = max(dp[i][w], take_value)

    return dp[n][capacity]
```

When documenting invariants, think of it as an informal correctness proof: 'If this invariant holds and the loop terminates, then the result is correct.' This structured thinking catches bugs during development and documents reasoning for future readers.
Some of the most valuable comments are those that explain why code handles specific edge cases or was changed to fix a bug. Without these comments, future developers (including yourself) may 'clean up' the code by removing apparently unnecessary complexity—and reintroducing the bug.
Edge case documentation:
```python
def find_median_sorted_arrays(nums1: list[int], nums2: list[int]) -> float:
    # Ensure nums1 is the smaller array to minimize binary search range
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1

    m, n = len(nums1), len(nums2)

    # EDGE CASE: Both arrays empty - no median exists
    if m + n == 0:
        raise ValueError("Cannot find median of empty arrays")

    # EDGE CASE: One array empty - median is from the other array
    # Must handle before binary search, which assumes both non-empty
    if m == 0:
        mid = n // 2
        if n % 2 == 1:
            return float(nums2[mid])
        return (nums2[mid - 1] + nums2[mid]) / 2.0

    # ... main algorithm for non-empty arrays
```

Bug fix documentation:
When you fix a bug, document what went wrong so the 'fix' isn't removed later:
```python
def longest_palindromic_substring(s: str) -> str:
    if len(s) < 2:
        return s

    # BUG FIX: Previously used range(len(s) - 1), which missed single-character
    # palindromes at the end. Example: "abc" would miss 'c' as a potential center.
    for i in range(len(s)):
        # Expand around center
        pass


def count_inversions(arr: list[int]) -> int:
    # BUG FIX: Must copy array before sort, otherwise modifies input
    # Original code sorted arr in-place, causing side effects for callers
    working_copy = arr.copy()
    return merge_sort_and_count(working_copy, 0, len(working_copy))
```

Reference to issue trackers or discussions:
For complex bugs, link to external resources:
```python
def parse_integer(s: str) -> int:
    # Handle integer overflow for 32-bit systems
    # See: https://github.com/company/repo/issues/1234
    # TL;DR: Python ints are arbitrary precision, but we need to match
    # 32-bit C behavior for compatibility
    MAX_INT = 2**31 - 1
    MIN_INT = -(2**31)
    result = int(s)
    return max(MIN_INT, min(MAX_INT, result))
```

Before removing 'unnecessary' code, understand why it was added. A comment explaining 'this handles the edge case where X' is a fence: it warns future developers that removing the code will break X. Without the comment, the fence seems pointless and gets removed.
Algorithms don't emerge from a vacuum. They come from papers, textbooks, blog posts, and discussions. Documenting sources provides valuable context and enables deeper understanding.
Citing algorithm sources:
```python
def compute_lps_array(pattern: str) -> list[int]:
    """
    Compute the Longest Proper Prefix Suffix (LPS) array for the KMP algorithm.

    Reference: Knuth, Morris, and Pratt (1977),
    "Fast Pattern Matching in Strings", SIAM J. Comput.

    The LPS array enables O(n+m) pattern matching by avoiding
    redundant comparisons when a mismatch occurs.
    """
    m = len(pattern)
    lps = [0] * m
    length = 0  # Length of previous longest proper prefix suffix
    i = 1

    while i < m:
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length > 0:
            # Key insight from KMP: fall back to the shorter prefix
            length = lps[length - 1]
        else:
            lps[i] = 0
            i += 1

    return lps
```

Linking to visualizations and tutorials:
```python
def union_find_with_path_compression(n: int):
    """
    Union-Find (Disjoint Set Union) with path compression and union by rank.

    Visualization: https://visualgo.net/en/ufds
    Detailed explanation:
    https://cp-algorithms.com/data_structures/disjoint_set_union.html

    Time complexity:
    - Nearly O(1) amortized per operation (inverse Ackermann function)
    - Proof: Tarjan, R. E. (1975), "Efficiency of a Good But Not
      Linear Set Union Algorithm"
    """
    parent = list(range(n))
    rank = [0] * n

    def find(x: int) -> int:
        # Path compression: point directly to root
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(x: int, y: int) -> bool:
        root_x, root_y = find(x), find(y)
        if root_x == root_y:
            return False
        # Union by rank: attach smaller tree under larger
        if rank[root_x] < rank[root_y]:
            parent[root_x] = root_y
        elif rank[root_x] > rank[root_y]:
            parent[root_y] = root_x
        else:
            parent[root_y] = root_x
            rank[root_x] += 1
        return True

    return find, union
```

Noting deviations from canonical implementations:
```python
def dijkstra_modified(graph, source, target):
    """
    Modified Dijkstra's algorithm with early termination.

    Standard Dijkstra computes shortest paths to ALL nodes.
    This version terminates when the target is reached, which is
    sufficient for single-target queries and potentially faster.

    NOTE: This optimization is only valid because all edge weights
    are non-negative. With negative weights, early termination
    could miss shorter paths discovered later.

    See: Cormen et al., "Introduction to Algorithms", Chapter 24
    """
    pass
```

References enable future readers to understand the algorithm deeply without reverse-engineering your code. If they need to modify the algorithm, they can consult the original source. If they suspect a bug, they can compare against the canonical version.
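The docstring is the point of the example above, but for concreteness, here is one way the early-exit variant might look as runnable code (the function name, graph representation, and sample graph are mine):

```python
import heapq

def dijkstra_early_exit(graph: dict[int, list[tuple[int, int]]],
                        source: int, target: int) -> float:
    """graph maps node -> list of (neighbor, weight); weights must be non-negative."""
    dist = {source: 0}
    pq = [(0, source)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == target:
            # Early termination: valid only with non-negative weights,
            # since no later pop can improve on d
            return d
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")
```

For example, `dijkstra_early_exit({0: [(1, 1), (2, 4)], 1: [(2, 1)]}, 0, 2)` returns 2: the two-hop path 0→1→2 beats the direct edge.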
Complexity analysis is essential for algorithmic code. Documenting complexity helps users understand performance characteristics and helps maintainers verify that changes don't degrade efficiency.
Standard complexity documentation:
```python
def merge_sort(arr: list[int]) -> list[int]:
    """
    Sort array using merge sort algorithm.

    Time Complexity: O(n log n)
    - log n levels of recursion (halving each time)
    - O(n) work at each level (merging)

    Space Complexity: O(n)
    - O(n) for auxiliary arrays during merge
    - O(log n) for recursion stack (dominated by O(n))

    Stability: Stable (equal elements maintain relative order)
    """
    pass
```

Explaining non-obvious complexities:
When complexity isn't immediately apparent from the code, explain the analysis:
```python
import bisect

def longest_increasing_subsequence(nums: list[int]) -> int:
    """
    Find length of longest strictly increasing subsequence.

    Time Complexity: O(n log n)
    - O(n) to iterate through nums
    - O(log n) binary search per element

    NOTE: Despite having a nested binary search inside a loop, the total
    work is O(n log n), not O(n * something worse). Each binary search
    is O(log k) where k <= n, and we do n such searches.

    Space Complexity: O(n) - tails array stores at most n elements
    """
    tails = []
    for num in nums:
        pos = bisect.bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)
        else:
            tails[pos] = num
    return len(tails)
```
```python
def two_pointer_dedup(arr: list[int]) -> int:
    """
    Remove duplicates from sorted array in-place; return the new length.

    Time Complexity: O(n) — single pass through array
    Space Complexity: O(1) — no extra space beyond input

    COMPLEXITY NOTE: Although we mutate the array, we do so in-place
    using two pointers. We use the convention that the input itself
    doesn't count as extra space, so this is O(1), not O(n).
    """
    if not arr:
        return 0
    write = 1  # Next position to write a unique element
    for read in range(1, len(arr)):
        if arr[read] != arr[write - 1]:
            arr[write] = arr[read]
            write += 1
    return write
```

Amortized complexity explanation:
```python
class DynamicArray:
    """
    Array with automatic resizing.

    Append Operation:
    - Worst case: O(n) when resizing required
    - Amortized: O(1) per operation

    AMORTIZATION EXPLANATION:
    Doubling strategy means resize happens after 1, 2, 4, 8, ... operations.
    Total resize cost for n insertions: 1 + 2 + 4 + ... + n/2 + n ≈ 2n = O(n)
    Spread over n operations: O(n)/n = O(1) amortized.

    IMPORTANT: If we resized by adding a constant instead of doubling,
    amortized cost would be O(n), not O(1).
    """
    pass
```

Document best, average, and worst case if they differ significantly. 'O(n) average, O(n²) worst case (when all elements hash to the same bucket)' is more informative than just 'O(n²)'.
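The doubling analysis above can be checked with a tiny runnable sketch (the class and attribute names here are mine):

```python
class GrowableArray:
    """Minimal doubling array that counts its resizes."""

    def __init__(self) -> None:
        self._buf: list = [None]  # Start with capacity 1
        self._size = 0
        self.resizes = 0

    def append(self, value) -> None:
        if self._size == len(self._buf):
            # Double the capacity: cost is O(current size), but it happens
            # only when size reaches a power of 2
            self._buf.extend([None] * len(self._buf))
            self.resizes += 1
        self._buf[self._size] = value
        self._size += 1
```

After 1000 appends, resizes occur at sizes 1, 2, 4, ..., 512: just 10 resizes total, matching the O(1) amortized claim.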
Comments can become lies. When code changes but comments don't, comments become actively harmful—they mislead readers into believing things that are no longer true.
The trust erosion problem:
When developers encounter outdated comments, they lose trust in all comments. This is worse than having no comments at all, because now they can't trust any documentation.
Strategies for comment maintenance:
```python
# DANGEROUS: Comment describes implementation that may change
# We use quicksort with median-of-three pivot selection
sorted_data = optimized_sort(data)

# SAFER: Comment describes behavior, not implementation
# Sort is stable and handles edge cases (empty, single element)
sorted_data = optimized_sort(data)
```

Signs of stale comments:
A misleading comment is worse than no comment. Someone debugging will trust the comment, form incorrect hypotheses, and waste hours. The brief effort of updating or deleting comments pays enormous dividends in preventing such waste.
How you write comments matters as much as what you write. Consistent formatting improves scannability and professionalism.
Inline comments:
```python
# Good: Brief, explains non-obvious aspect
mid = left + (right - left) // 2  # Avoids overflow vs (left+right)//2

# Bad: Too long for inline
mid = left + (right - left) // 2  # This formula is used instead of the more intuitive (left + right) // 2 because adding left and right together could cause integer overflow in languages like C++ or Java when both values are close to the maximum integer value

# Good: Long explanation above, not inline
# Overflow prevention: (left + right) could exceed max int for large indices.
# This formulation is mathematically equivalent but safe.
mid = left + (right - left) // 2
```

Block comments:
```python
# Good: Structured block comment
# ============================================
# PHASE 2: BUILD THE SUFFIX ARRAY
# ============================================
# Suffix arrays enable O(m log n) pattern matching after O(n log n)
# preprocessing. We use the DC3 algorithm (Kärkkäinen & Sanders, 2003)
# for O(n) construction.
#
# Key insight: Recursively sort suffixes at positions divisible by 3,
# then use those to deduce order of remaining suffixes.
# ============================================

# Bad: Wall of text
# Now we're going to build a suffix array which is an array of integers giving the starting positions of suffixes of a string in lexicographical order and we use the DC3 algorithm which was published by Kärkkäinen and Sanders in 2003 and achieves O(n) time complexity by recursively sorting suffixes and the key insight is that we sort positions divisible by 3 first
```

Docstrings:
```python
def find_kth_largest(nums: list[int], k: int) -> int:
    """
    Find the k-th largest element in an unsorted array.

    Args:
        nums: List of integers (may contain duplicates)
        k: Position from largest (1-indexed, so k=1 means largest)

    Returns:
        The k-th largest element

    Raises:
        ValueError: If k < 1 or k > len(nums)

    Example:
        >>> find_kth_largest([3, 2, 1, 5, 6, 4], 2)
        5

    Time: O(n) average using Quickselect, O(n²) worst case
    Space: O(1) excluding recursion stack

    Note: For guaranteed O(n log n), use a heap-based approach instead.
    """
    pass
```

Follow your team's or language's conventions. Python has PEP 257 for docstrings. JavaScript has JSDoc. Consistency within a codebase trumps any particular style. If the codebase uses a format, match it.
We've explored the art of commenting non-obvious logic. The key principles:

- Comment the why, not the what: the code already says what it does.
- Document the patterns that demand it: boundary conditions, bit tricks, mathematical derivations, and performance optimizations.
- State invariants, preconditions, postconditions, and DP state definitions as informal correctness proofs.
- Record edge cases and bug fixes so the 'fix' isn't cleaned away later.
- Cite sources and note deviations from canonical implementations.
- Document complexity, including amortized and non-obvious cases.
- Maintain comments: an outdated comment is worse than none.
- Format consistently, following your codebase's conventions.
What's next:
With readable structure, meaningful names, well-extracted helpers, and strategic comments, your code is nearly production-ready. The final page addresses the culmination: code review readiness—preparing your algorithmic code to withstand scrutiny and pass review efficiently.
You now understand when and how to comment algorithmic code effectively. Strategic commenting illuminates the non-obvious without cluttering the obvious. Next, we'll prepare code for the ultimate test: peer review.