Design Validation - Learning Module

Edge Case Handling

Where Normal Ends and Weird Begins

Most systems work fine for 'normal' inputs. The trouble begins at the edges—where inputs are unexpectedly large or small, where timing is unusual, where users behave in ways no one anticipated, where data arrives in sequences that 'shouldn't happen.'

Edge cases are where bugs hide. They're the inputs and conditions that slip through typical testing because they're rare, unusual, or unexpected. Yet in production, at scale, 'rare' events happen constantly: a one-in-a-million bug occurs about 1,000 times per day in a system handling a billion requests per day.

Edge case handling during design validation is about systematically identifying these boundary conditions and ensuring the design addresses them explicitly. It's the difference between a system that works in demos and a system that survives production.

What You Will Master

By the end of this page, you will understand how to systematically identify and handle edge cases in system designs. You'll learn to analyze boundary conditions, exceptional data patterns, race conditions, timing issues, and the techniques principal engineers use to ensure systems behave correctly when inputs are weird, timing is wrong, or users do unexpected things.

The Edge Case Taxonomy

Edge cases cluster into recognizable categories. Understanding these categories helps you systematically search for them rather than hoping to stumble upon them.

The Six Categories of Edge Cases

Edge Case Classification
Category	Description	Examples	Design Questions
Boundary values	Inputs at the limits of valid ranges	0, MAX_INT, empty string, exactly-at-limit	What happens at zero? At maximum? Just over the limit?
Empty/null data	Missing or absent values	Null fields, empty lists, missing records	Can every field be null? What does empty mean?
Large-scale data	Unexpectedly large inputs	Million-item lists, GB-sized files, viral content	Is there a size limit? What happens when exceeded?
Concurrent access	Multiple actors affecting same resources	Race conditions, double-submit, simultaneous edits	What happens if two requests arrive at once?
Temporal anomalies	Time-related edge conditions	Clock skew, leap seconds, timezone boundaries	What if clocks are wrong? What about time zones?
Exceptional sequences	Unusual ordering of operations	Out-of-order events, repeated requests, skipped steps	What if steps happen out of order?

Why Edge Cases Matter at Design Time

Many edge cases can't be fixed after the fact—they require architectural changes:

•Handling billion-record tables requires partitioning designed upfront
•Preventing race conditions requires proper locking strategies in the data model
•Processing out-of-order events requires idempotent operations and proper state machines
•Supporting leap seconds requires decisions about timestamp representation

These aren't bugs you can patch—they're architectural properties that must be designed in from the start.

The Production Reality

In production, every edge case eventually occurs. Users paste megabytes of text into 'name' fields. Clocks drift. Networks deliver messages out of order. Systems restart mid-transaction. Designing without considering edges is designing for a fantasy environment.

Boundary Value Analysis

Boundary values are where validation breaks. They're the transition points between valid and invalid, between one behavior and another. Testing at these boundaries reveals off-by-one errors, overflow conditions, and incorrect limit checking.

The Boundary Points

For any numeric or sized constraint, test at:

•Minimum valid: The smallest acceptable value
•Below minimum: Just under the minimum (should be rejected)
•Maximum valid: The largest acceptable value
•Above maximum: Just over the maximum (should be rejected)
•Zero: Often a special case
•One: Often a special case (single item vs. collection)
•Power-of-two boundaries: Where data structures resize

Boundary Analysis Example: Order Quantity
Boundary	Value	Expected Behavior	Design Consideration
Below minimum	0	Reject: cannot order zero items	Validation message, what about cart display?
Minimum valid	1	Accept: minimum order	Is pricing different for single items?
Typical	3	Accept: normal order	Standard processing path
Near maximum	99	Accept: large order	Inventory check, stock availability
Maximum valid	100	Accept: maximum single order	Why this limit? Is it configurable?
Above maximum	101	Reject: over order limit	Clear error message, suggest multiple orders?
Way over	1,000,000	Reject: obvious abuse	Should this trigger fraud detection?
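
The table above can be turned into an executable check. The following is a minimal sketch, assuming the limits 1 and 100 from the table; the names `validateOrderQuantity`, `MIN_QUANTITY`, and `MAX_QUANTITY` are hypothetical, and a real system would likely load the limits from configuration.

```typescript
// Hypothetical limits taken from the table above.
const MIN_QUANTITY = 1;
const MAX_QUANTITY = 100;

type QuantityResult =
  | { ok: true; quantity: number }
  | { ok: false; reason: string };

function validateOrderQuantity(input: number): QuantityResult {
  // Reject NaN, Infinity, and fractional values before the range checks.
  if (!Number.isInteger(input)) {
    return { ok: false, reason: "quantity must be a whole number" };
  }
  if (input < MIN_QUANTITY) {
    return { ok: false, reason: `quantity must be at least ${MIN_QUANTITY}` };
  }
  if (input > MAX_QUANTITY) {
    return { ok: false, reason: `quantity must be at most ${MAX_QUANTITY}` };
  }
  return { ok: true, quantity: input };
}

// Each row of the boundary table, plus inputs the table does not list.
const boundaryCases: Array<[number, boolean]> = [
  [0, false], [1, true], [3, true], [99, true], [100, true],
  [101, false], [1000000, false], [-1, false], [1.5, false], [NaN, false],
];
for (const [value, expected] of boundaryCases) {
  const actual = validateOrderQuantity(value).ok;
  console.log(value, actual === expected ? "as expected" : "MISMATCH");
}
```

Note that the non-integer and negative cases are not in the table; they are included here because a numeric field that accepts 1 will eventually receive 1.5, -1, and NaN as well.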

String and Text Boundaries

Strings have their own boundary conditions that often cause more trouble than numeric limits:

String Edge Cases

•Empty string — Is "" valid? Different from null?
•Whitespace only — Is " " valid? Should it be trimmed?
•Maximum length — What's the limit? What happens at limit+1?
•Very long strings — Megabytes of text pasted in a 'name' field
•Unicode edge cases — Multi-byte characters, RTL text, emoji, ZWJ sequences
•Null bytes — Can corrupt C-based systems
•Line breaks — Single-line field with \n pasted
•Special characters — SQL injection, XSS, format string attacks
•Non-printable characters — Control characters, escape sequences
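
Several of these checks can be combined into one validator for a single-line text field. This is a sketch under stated assumptions: the 100-character limit, the choice to count code points, and the name `validateDisplayName` are all hypothetical. It does not address injection attacks, which are better handled by parameterized queries and output encoding than by input filtering.

```typescript
const MAX_NAME_LENGTH = 100; // hypothetical limit, counted in code points

type NameResult =
  | { ok: true; value: string }
  | { ok: false; reason: string };

function validateDisplayName(raw: string | null | undefined): NameResult {
  if (raw === null || raw === undefined) {
    return { ok: false, reason: "name is required" };
  }
  // Normalize so visually identical strings compare equal, then trim.
  const value = raw.normalize("NFC").trim();
  if (value.length === 0) {
    return { ok: false, reason: "name is empty or whitespace only" };
  }
  // C0 and C1 control characters cover null bytes, embedded line breaks,
  // and escape sequences.
  if (/[\u0000-\u001f\u007f-\u009f]/.test(value)) {
    return { ok: false, reason: "name contains control characters" };
  }
  // Count code points rather than UTF-16 units, so one emoji counts once.
  const codePoints = Array.from(value).length;
  if (codePoints > MAX_NAME_LENGTH) {
    return { ok: false, reason: `name exceeds ${MAX_NAME_LENGTH} characters` };
  }
  return { ok: true, value };
}
```

Counting code points is itself a design decision: a ZWJ emoji sequence is one visible glyph but several code points, so a limit on grapheme clusters or on encoded bytes may suit some fields better.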

The Zalgo Text Test

Try rendering 'Ẑ̴̧̛a̸͕̾l̸̘̓ǵ̷͎o̷̺̍' in your system. Unicode combining characters can create strings that are technically valid but render incorrectly, exceed expected display widths, or crash rendering engines. If your system accepts user input, it will eventually receive Zalgo text.
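
One possible mitigation is to cap the number of consecutive combining marks. The sketch below is illustrative only: the cap of 2, the function name `limitCombiningMarks`, and the choice to match only the common combining-diacritic blocks (rather than every Unicode mark) are assumptions, and a cap this low may be too strict for some scripts.

```typescript
// Hypothetical policy: keep at most this many consecutive combining marks.
const MAX_COMBINING_RUN = 2;

// Simplification: matches the main combining-diacritic blocks only.
const COMBINING_MARK = /[\u0300-\u036f\u1ab0-\u1aff\u1dc0-\u1dff\u20d0-\u20ff\ufe20-\ufe2f]/;

function limitCombiningMarks(input: string, maxRun: number = MAX_COMBINING_RUN): string {
  // NFD splits precomposed characters into base + marks so runs can be counted.
  const decomposed = input.normalize("NFD");
  let run = 0;
  let output = "";
  for (let i = 0; i < decomposed.length; i++) {
    const unit = decomposed.charAt(i);
    if (COMBINING_MARK.test(unit)) {
      run += 1;
      if (run > maxRun) continue; // drop marks beyond the allowed run
    } else {
      run = 0;
    }
    output += unit;
  }
  // Recompose so ordinary accented text is stored in its usual form.
  return output.normalize("NFC");
}
```

Ordinary accented text such as "café" passes through unchanged, while a base letter followed by ten stacked marks keeps only the first two.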

Empty and Null Handling

The billion-dollar mistake—null references—manifests in system design as ambiguous handling of missing data. Every piece of data in your system can potentially be absent, and the design must specify what that means.

The Null Spectrum

Types of 'Missing' Data
Type	Meaning	Example	Design Decision
Null	Value unknown or not applicable	User phone number not provided	How to display? Filter in queries?
Empty string	Value explicitly set to nothing	User cleared their bio	Different from null? How to distinguish?
Empty collection	Collection with zero items	User has no orders yet	Different from null collection? Display message?
Default value	Value was never set, using default	Account uses default settings	Is there a sentinel value problem?
Tombstone	Value was deleted	User deleted their profile picture	Soft delete vs. hard delete logic
Not yet loaded	Value exists but hasn't been fetched	Lazy-loaded relationship	Loading indicators, error handling
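
One way to keep these meanings from collapsing into a single `null` is to model them as distinct cases in the type system. The sketch below assumes a hypothetical `FieldState` type and `describeBio` function; the display strings are placeholders.

```typescript
// Each kind of "missing" from the table gets its own case.
type FieldState<T> =
  | { kind: "notLoaded" }                 // exists, but not fetched yet
  | { kind: "unset" }                     // never provided (null)
  | { kind: "defaulted"; value: T }       // never set, default applied
  | { kind: "deleted"; deletedAt: Date }  // tombstone
  | { kind: "present"; value: T };        // explicitly set, possibly empty

function describeBio(bio: FieldState<string>): string {
  // The compiler checks that every case is handled.
  switch (bio.kind) {
    case "notLoaded":
      return "Loading...";
    case "unset":
      return "No bio yet";
    case "defaulted":
      return bio.value;
    case "deleted":
      return "Bio removed";
    case "present":
      return bio.value === "" ? "(bio cleared)" : bio.value;
  }
}

console.log(describeBio({ kind: "unset" }));                // No bio yet
console.log(describeBio({ kind: "present", value: "" }));   // (bio cleared)
```

Whether a design needs all five cases is a judgment call; the point is that each distinction the product cares about should be representable rather than inferred.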

Empty Collection Edge Cases

Empty collections cause subtle bugs that are easy to miss during design review:

empty-collection-edge-cases.ts
TypeScript
// Empty collection edge cases in system operations
 
// ❌ Division by zero when calculating average
function calculateAverageOrderValue(orders: Order[]): number {
  const total = orders.reduce((sum, o) => sum + o.value, 0);
  return total / orders.length; // NaN when orders is empty!
}
 
// ✅ Handle empty case explicitly
function calculateAverageOrderValueSafe(orders: Order[]): number | null {
  if (orders.length === 0) return null;
  const total = orders.reduce((sum, o) => sum + o.value, 0);
  return total / orders.length;
}
 
// ❌ Assumes at least one item
function getTopRatedProduct(products: Product[]): Product {
  return products.sort((a, b) => b.rating - a.rating)[0]; // undefined when empty!
}
 
// ✅ Return type reflects possibility of no result
function getTopRatedProductSafe(products: Product[]): Product | null {
  if (products.length === 0) return null;
  // Copy before sorting: Array.prototype.sort reorders the caller's array in place.
  return [...products].sort((a, b) => b.rating - a.rating)[0];
}
 
// ❌ Empty result in database operation
async function getRecentOrders(userId: string): Promise<Order[]> {
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [userId]);
  // What if orders is empty? No error, but caller might assume at least one.
  return orders;
}
 
// ✅ Caller knows empty is a valid state
async function getRecentOrdersSafe(userId: string): Promise<{
  orders: Order[];
  hasOrders: boolean;
}> {
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [userId]);
  return {
    orders,
    hasOrders: orders.length > 0,
  };
}
 
// Edge case: Empty result in aggregation
// ❌ With no matching rows, SUM still returns one row, and its value is NULL
// SELECT SUM(value) FROM orders WHERE user_id = 'new_user';
// Result: NULL, not 0
 
// ✅ Design must account for this
async function getTotalOrderValue(userId: string): Promise<number> {
  const result = await db.query(
    'SELECT COALESCE(SUM(value), 0) as total FROM orders WHERE user_id = ?',
    [userId]
  );
  return result[0].total;
}

API Design Implication

When designing APIs, be explicit about empty states. Should GET /users/{id}/orders return 200 with an empty array, or 404 because there are no orders? The answer affects client implementation and should be documented in the API contract.

Concurrency Edge Cases

Concurrency bugs are among the most insidious edge cases because they depend on timing—they may pass 999 tests and fail on the 1,000th. During design, you must identify where concurrent access occurs and specify the intended behavior.

The TOCTOU Gap (Time-of-Check to Time-of-Use)

The most common concurrency bug pattern: you check a condition, then act on it, but the condition changed between check and action.
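
The gap can be made concrete with a deterministic, single-threaded simulation in which two requests both check before either acts. The `Inventory` class and its method names are hypothetical. In a real system the combined check-and-act step would typically be provided by the datastore (for example a conditional update or a lock), not by in-process code.

```typescript
class Inventory {
  private stock: number;

  constructor(initial: number) {
    this.stock = initial;
  }

  // Check on its own: the answer may be stale as soon as it is returned.
  hasStock(quantity: number): boolean {
    return this.stock >= quantity;
  }

  // Act on its own: trusts that an earlier check is still valid.
  decrementUnchecked(quantity: number): void {
    this.stock -= quantity;
  }

  // Check and act as one indivisible step.
  tryReserve(quantity: number): boolean {
    if (this.stock < quantity) return false;
    this.stock -= quantity;
    return true;
  }

  remaining(): number {
    return this.stock;
  }
}

// Racy interleaving: both requests check before either one acts.
const racy = new Inventory(1);
const aSawStock = racy.hasStock(1); // request A checks: true
const bSawStock = racy.hasStock(1); // request B checks: true
if (aSawStock) racy.decrementUnchecked(1);
if (bSawStock) racy.decrementUnchecked(1);
console.log(racy.remaining()); // -1: the last item was sold twice

// Same two requests using the combined operation.
const safe = new Inventory(1);
const aReserved = safe.tryReserve(1);
const bReserved = safe.tryReserve(1);
console.log(safe.remaining(), aReserved, bReserved); // 0 true false
```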

Common Concurrency Edge Cases

Concurrency Scenarios to Address
Scenario	Example	Symptom	Solution Pattern
Double submit	User clicks 'Submit' twice quickly	Duplicate orders created	Idempotency keys, dedup window
Lost update	Two users edit same document	Second save overwrites first	Optimistic locking (version numbers)
Phantom read	Read returns different results mid-transaction	Inconsistent data processing	Serializable isolation or snapshot reads
Dirty read	Read uncommitted data that gets rolled back	Actions based on data that was never committed	Read-committed isolation minimum
Retry storm	N clients simultaneously retry writes	N× database load	Exponential backoff with jitter
Thundering herd	Cache expires, N clients hit database	Database overload on cache miss	Cache warming, request coalescing
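
As an illustration of the lost-update row, the following sketch shows optimistic locking with version numbers against an in-memory store. `DocStore` and `VersionedDoc` are hypothetical names; a database-backed version would usually express the same check as a conditional update on the version column.

```typescript
interface VersionedDoc {
  id: string;
  body: string;
  version: number;
}

type SaveOutcome = "saved" | "conflict" | "missing";

class DocStore {
  private docs = new Map<string, VersionedDoc>();

  create(id: string, body: string): VersionedDoc {
    const doc: VersionedDoc = { id, body, version: 1 };
    this.docs.set(id, doc);
    return { ...doc };
  }

  read(id: string): VersionedDoc | undefined {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : undefined;
  }

  // The save succeeds only if the caller edited the latest version.
  save(id: string, body: string, expectedVersion: number): SaveOutcome {
    const current = this.docs.get(id);
    if (!current) return "missing";
    if (current.version !== expectedVersion) return "conflict";
    this.docs.set(id, { id, body, version: current.version + 1 });
    return "saved";
  }
}

// Two users open version 1 of the same document.
const store = new DocStore();
const opened = store.create("doc-1", "original");
const firstSave = store.save("doc-1", "edit by user A", opened.version);
const secondSave = store.save("doc-1", "edit by user B", opened.version);
console.log(firstSave, secondSave); // saved conflict
```

The second save is rejected instead of silently overwriting the first. What the client does next (reload, merge, or prompt the user) is a separate design decision.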

Anti-patterns

•Check-then-act without locks
•Assuming sequential execution
•Relying on API ordering
•Unbounded retry loops
•Shared mutable state across instances

Patterns to Apply

•Atomic operations (CAS, increment)
•Pessimistic locking for write-heavy, high-contention data
•Optimistic locking for read-heavy, low-contention data
•Idempotent operations everywhere
•Event sourcing for audit trails
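
To contrast with the unbounded-retry anti-pattern, here is a minimal sketch of a bounded retry schedule using exponential backoff with "full jitter" (a random delay between zero and a capped exponential ceiling). The constants and function names are hypothetical, and the random source is injectable so the schedule can be tested.

```typescript
// Hypothetical defaults; tune per dependency.
const BASE_DELAY_MS = 100;
const MAX_DELAY_MS = 10000;
const MAX_ATTEMPTS = 5;

// Full jitter: a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Bounded schedule: one delay per retry, after which the caller gives up.
function retrySchedule(random: () => number = Math.random): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    delays.push(backoffDelayMs(attempt, random));
  }
  return delays;
}

// With a fixed "random" value of 0.5 the schedule is deterministic.
console.log(retrySchedule(() => 0.5)); // [50, 100, 200, 400, 800]
```

The jitter matters as much as the exponent: without it, clients that failed together retry together.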

Design for Idempotency

If an operation can be called multiple times with the same effect as calling it once, you've eliminated an entire class of concurrency bugs. Every write operation in your design should either be inherently idempotent or protected by an idempotency mechanism (usually a unique key + deduplication window).
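
A unique key plus a deduplication window might look like the following in-memory sketch. `IdempotentExecutor` is a hypothetical name, the clock is injected for testability, and expired keys are never purged here; a production version would need shared storage, eviction, and handling for requests that are still in flight.

```typescript
interface StoredResult<T> {
  result: T;
  storedAt: number;
}

class IdempotentExecutor<T> {
  private seen = new Map<string, StoredResult<T>>();

  constructor(
    private windowMs: number,
    private now: () => number = () => Date.now(),
  ) {}

  // Runs the operation once per key within the window; replays return
  // the stored result instead of running the operation again.
  execute(key: string, operation: () => T): T {
    const currentTime = this.now();
    const previous = this.seen.get(key);
    if (previous && currentTime - previous.storedAt < this.windowMs) {
      return previous.result;
    }
    const result = operation();
    this.seen.set(key, { result, storedAt: currentTime });
    return result;
  }
}

// A double submit with the same client-supplied key creates one order.
let ordersCreated = 0;
const executor = new IdempotentExecutor<string>(60000);
const createOrder = (): string => {
  ordersCreated += 1;
  return `order-${ordersCreated}`;
};
const firstResponse = executor.execute("client-key-123", createOrder);
const secondResponse = executor.execute("client-key-123", createOrder);
console.log(firstResponse === secondResponse, ordersCreated); // true 1
```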

Temporal Edge Cases

Time seems simple until you deal with it in distributed systems. Clocks drift, time zones complicate comparisons, daylight saving causes hours to repeat or disappear, and leap seconds add time that shouldn't exist.

Time in Distributed Systems

Different nodes have different times. Network delays mean event ordering is ambiguous. The design must specify how time is handled.

Temporal Edge Cases
Edge Case	Problem	Impact	Design Consideration
Clock skew	Node A's clock is 5 seconds ahead of Node B	'Later' event appears earlier	Use logical clocks (Lamport), or trust one time source
Clock drift	Node clock slowly diverges from true time	Timeouts and TTLs become unreliable	NTP sync, bound acceptable drift
Daylight saving	2 AM happens twice, or not at all	Scheduled jobs fire twice or not at all	Store times in UTC, convert at display only
Leap seconds	23:59:60 exists	Time comparison: 23:59:60 < 00:00:00?	Use TAI or 'smear' leap seconds
Time zone changes	Government changes offset	Historical data becomes misinterpreted	Store offset with timestamp, or use UTC
Midnight boundary	Date rolls over at different times globally	'Today' means different things	Explicit time zone in all date logic
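
The logical-clock option in the clock-skew row can be sketched in a few lines. This is a minimal Lamport clock; the class and variable names are illustrative. It orders causally related events without consulting wall-clock time, but it does not by itself tell you whether two unrelated events were concurrent.

```typescript
class LamportClock {
  private counter = 0;

  // Local event or message send: advance and return the new timestamp.
  tick(): number {
    this.counter += 1;
    return this.counter;
  }

  // Message receive: move past the sender's timestamp, then advance.
  receive(remoteTimestamp: number): number {
    this.counter = Math.max(this.counter, remoteTimestamp) + 1;
    return this.counter;
  }
}

// Node A does some work and sends a message; node B has done nothing yet.
const nodeA = new LamportClock();
const nodeB = new LamportClock();
nodeA.tick();
nodeA.tick();
const sendStamp = nodeA.tick();                 // 3
const receiveStamp = nodeB.receive(sendStamp);  // 4
console.log(sendStamp < receiveStamp);          // true, whatever the wall clocks say
```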

Event Ordering Challenges

In distributed systems, you cannot rely on wall-clock timestamps alone for ordering. Clocks on different nodes disagree, and messages sent 'later' may arrive 'earlier' due to network delays.

Temporal Design Questions

•What happens if events arrive out of order? — Buffer and reorder? Process anyway? Error?
•How are timestamps generated? — Client time? Server time? Which server?
•What time zone are times stored in? — UTC everywhere is safest
•How are durations calculated? — Cross-DST calculations, cross-timezone spans
•What happens at year/month/day boundaries? — Batch jobs, retention policies
•How is 'now' determined? — In tests? In async processing? In UI display?
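
For the last question, one common approach is to pass the clock in rather than calling `new Date()` deep inside business logic. The sketch below assumes hypothetical `Clock`, `FixedClock`, and `isExpired` names.

```typescript
interface Clock {
  now(): Date;
}

// Production default: the real system time.
const systemClock: Clock = { now: () => new Date() };

// Test double: time moves only when the test says so.
class FixedClock implements Clock {
  constructor(private current: Date) {}

  now(): Date {
    return new Date(this.current.getTime());
  }

  advance(ms: number): void {
    this.current = new Date(this.current.getTime() + ms);
  }
}

function isExpired(expiresAt: Date, clock: Clock = systemClock): boolean {
  return clock.now().getTime() >= expiresAt.getTime();
}

// Exercise the exact expiry boundary without waiting for real time to pass.
const testClock = new FixedClock(new Date(Date.UTC(2024, 0, 1, 0, 0, 0)));
const expiry = new Date(Date.UTC(2024, 0, 1, 0, 0, 1));
console.log(isExpired(expiry, testClock)); // false
testClock.advance(1000);
console.log(isExpired(expiry, testClock)); // true
```

Treating "exactly at expiry" as expired is itself a boundary decision that the design should state.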

temporal-edge-cases.ts
TypeScript
// Temporal edge case examples
 
// ❌ This is a latent bug waiting to happen
function calculateAge(birthDate: Date): number {
  const today = new Date();
  return today.getFullYear() - birthDate.getFullYear();
  // Wrong! Someone born Dec 31, 2000 isn't 24 on Jan 1, 2024
}
 
// ✅ Account for whether birthday has occurred this year
function calculateAgeSafe(birthDate: Date): number {
  const today = new Date();
  let age = today.getFullYear() - birthDate.getFullYear();
  const monthDiff = today.getMonth() - birthDate.getMonth();
  if (monthDiff < 0 || (monthDiff === 0 && today.getDate() < birthDate.getDate())) {
    age--;
  }
  return age;
}
 
// ❌ Timezone nightmare waiting to happen
function isWeekend(date: Date): boolean {
  const day = date.getDay();
  return day === 0 || day === 6;
  // But whose weekend? User's timezone? Server's timezone?
}
 
// ✅ Explicit about timezone
function isWeekendInTimezone(date: Date, timezone: string): boolean {
  const options: Intl.DateTimeFormatOptions = { 
    weekday: 'long', 
    timeZone: timezone 
  };
  const dayName = date.toLocaleDateString('en-US', options);
  return dayName === 'Saturday' || dayName === 'Sunday';
}
 
// ❌ Daylight saving time trap
function addDays(date: Date, days: number): Date {
  return new Date(date.getTime() + days * 24 * 60 * 60 * 1000);
  // Breaks when crossing DST boundary! 
  // Adding "1 day" on DST change day gives 23 or 25 hours
}
 
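// ✅ Use a proper datetime library
// Note: zonedTimeToUtc and utcToZonedTime are the names used before date-fns-tz v3,
// which renamed them to fromZonedTime and toZonedTime.
import { addDays as addDaysFns } from 'date-fns';
import { zonedTimeToUtc, utcToZonedTime } from 'date-fns-tz';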
 
function addDaysInTimezone(date: Date, days: number, tz: string): Date {
  const zonedDate = utcToZonedTime(date, tz);
  const newZonedDate = addDaysFns(zonedDate, days);
  return zonedTimeToUtc(newZonedDate, tz);
}

The Leap Year Bug

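February 29th exists only in leap years: every fourth year, except century years not divisible by 400 (2000 was a leap year; 1900 was not). Naive date logic mishandles users born on Feb 29, for example when working out their birthday in a non-leap year. Subscriptions starting Jan 31 have no equivalent date in February. "One month from now" is ambiguous. Your design must specify how these edge cases are handled.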
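
One possible policy for "one month from now" is to clamp to the last day of the target month. The sketch below works in UTC to keep DST out of the picture; `addMonthsClamped` and its helpers are hypothetical names, and clamping is only one of several defensible policies.

```typescript
function isLeapYear(year: number): boolean {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

function daysInMonth(year: number, monthIndex: number): number {
  // Day 0 of the following month is the last day of this month.
  return new Date(Date.UTC(year, monthIndex + 1, 0)).getUTCDate();
}

// Adds calendar months in UTC, clamping the day to the end of the target month.
function addMonthsClamped(date: Date, months: number): Date {
  const totalMonths = date.getUTCFullYear() * 12 + date.getUTCMonth() + months;
  const year = Math.floor(totalMonths / 12);
  const monthIndex = totalMonths - year * 12;
  const day = Math.min(date.getUTCDate(), daysInMonth(year, monthIndex));
  return new Date(Date.UTC(
    year, monthIndex, day,
    date.getUTCHours(), date.getUTCMinutes(),
    date.getUTCSeconds(), date.getUTCMilliseconds(),
  ));
}

const jan31 = new Date(Date.UTC(2024, 0, 31));
console.log(addMonthsClamped(jan31, 1).toISOString()); // 2024-02-29T00:00:00.000Z

const leapDay = new Date(Date.UTC(2024, 1, 29));
console.log(addMonthsClamped(leapDay, 12).toISOString()); // 2025-02-28T00:00:00.000Z
```

Clamping is not reversible: Jan 31 plus one month minus one month gives Jan 29 in a leap year. A subscription design would typically store the original anchor day and compute each renewal from it rather than chaining additions.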

Data Pattern Edge Cases

Users and systems produce data in patterns that developers don't anticipate. Understanding these patterns helps surface edge cases during design.

Unexpected Data Patterns

Data Pattern Edge Cases
Pattern	Description	Example	Design Consideration
Viral content	Single item accessed millions of times	Trending post, breaking news	Cache strategy, hot key handling
Celebrity users	Users with extreme follower counts	User with 100M followers posts	Fan-out strategy, async processing
Batch imports	Large volume inserted at once	Customer uploads 1M records via CSV	Rate limiting, background processing
Delete cascades	Deletion triggers massive cleanup	Deleting user with 10 years of history	Async deletion, soft deletes
Circular references	Data references itself	User 'manages' themselves, circular org chart	Cycle detection, depth limits
Name collisions	Multiple entities with identical names	Two 'John Smith' in same org	Unique constraints, disambiguation
Historical data	Very old data accessed	Accessing records from 2005	Schema migration, data format changes
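
For the circular-references row, a reporting chain can be walked with both cycle detection and a depth limit. This is a sketch with hypothetical names (`findManagementCycle`, `managerOf`); the depth limit of 1000 is arbitrary.

```typescript
// managerOf maps an employee id to their manager's id.
// The top of the chart has no entry.
function findManagementCycle(
  managerOf: Map<string, string>,
  start: string,
  maxDepth: number = 1000,
): string[] | null {
  const visited: string[] = [];
  const seen = new Set<string>();
  let current: string | undefined = start;

  while (current !== undefined) {
    if (seen.has(current)) {
      // Return only the ids that form the loop, not the path leading into it.
      return visited.slice(visited.indexOf(current));
    }
    if (visited.length >= maxDepth) {
      throw new Error(`management chain deeper than ${maxDepth}`);
    }
    seen.add(current);
    visited.push(current);
    current = managerOf.get(current);
  }
  return null; // reached the top of the chart without repeating anyone
}

const orgChart = new Map<string, string>([
  ["dana", "alice"],
  ["alice", "bob"],
  ["bob", "carol"],
  ["carol", "alice"], // closes a loop: alice -> bob -> carol -> alice
]);
console.log(findManagementCycle(orgChart, "dana")); // ["alice", "bob", "carol"]
```

Running this check on write (before accepting a new manager assignment) keeps cycles out of the data, so readers do not each have to defend against them.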

The 10/10/10 Rule

When reviewing a design, ask what happens when data is:

•10× expected size: Does the system handle unusually large entities?
•10× expected volume: Does the system handle 10× the anticipated load?
•10× expected age: Does the system handle data that's been around for 10 years?

Data Quality Edge Cases

•Duplicate data — Same record imported twice; how is deduplication handled?
•Orphaned data — References pointing to deleted entities
•Inconsistent data — Related records that contradict each other
•Encoded data — HTML entities, URL encoding, Unicode normalization
•Schema violations — Data that doesn't match expected format
•Extreme values — Prices of $0, quantities of billions, future dates
•PII in unexpected places — Credit card numbers in 'notes' field

Test with Production-Like Data

Edge cases often only surface with realistic data. Synthetic test data tends to be 'too clean'—real data has encoding issues, inconsistencies, and patterns that no one anticipated. Where possible, test with anonymized production data or generated data that mimics production characteristics.

Sequence and State Edge Cases

Stateful systems have edge cases related to state transitions—what happens when operations occur in unexpected sequences, when states are ambiguous, or when transitions are interrupted?

State Machine Analysis

Every entity with lifecycle states needs a formal state machine. This surfaces edge cases that informal thinking misses.

State Edge Case Questions

State Machine Questions

•What are all valid states? — Are there any hidden or implied states?
•What transitions are valid? — Can you go from any state to any other state?
•What happens on invalid transitions? — Error? Ignore? Queue for later?
•What if the same transition is requested twice? — Idempotent? Error? Different behavior?
•What if a transition is interrupted midway? — Transaction rollback? Stuck state?
•Who can trigger each transition? — User? System? Admin? External service?
•Is there a timeout for any state? — Order stuck in 'Processing' for 24 hours?
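
Several of these questions can be answered in one place by writing the transitions down as data. The order lifecycle below is hypothetical (the states and allowed transitions are illustrative, not taken from this page), and treating a repeated request as a no-op is one possible policy.

```typescript
type OrderState = "pending" | "paid" | "shipped" | "delivered" | "cancelled";

// Hypothetical lifecycle: every state lists the states it may move to.
const allowedTransitions: Record<OrderState, OrderState[]> = {
  pending: ["paid", "cancelled"],
  paid: ["shipped", "cancelled"],
  shipped: ["delivered"],
  delivered: [],
  cancelled: [],
};

type TransitionResult =
  | { outcome: "applied"; state: OrderState }
  | { outcome: "noop"; state: OrderState }
  | { outcome: "rejected"; state: OrderState; reason: string };

function transition(current: OrderState, target: OrderState): TransitionResult {
  // Same transition requested twice: treat as idempotent.
  if (current === target) {
    return { outcome: "noop", state: current };
  }
  if (allowedTransitions[current].indexOf(target) !== -1) {
    return { outcome: "applied", state: target };
  }
  return {
    outcome: "rejected",
    state: current,
    reason: `cannot go from ${current} to ${target}`,
  };
}

console.log(transition("pending", "paid").outcome);      // applied
console.log(transition("paid", "paid").outcome);         // noop
console.log(transition("delivered", "pending").outcome); // rejected
```

Because the table is typed as a `Record` over every state, adding a new state without declaring its transitions fails to compile.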

Out-of-Order Events

In event-driven systems, events may arrive out of chronological order. Your design must specify behavior:

Out-of-Order Event Handling Strategies
Strategy	Description	When to Use	Trade-off
Reorder buffer	Hold events until sequence is complete	Critical ordering, low volume	Latency, memory, timeout complexity
Process anyway	Handle each event independently	Idempotent operations, eventual consistency OK	May need reconciliation
Last-write-wins	Newest timestamp wins	Low-conflict updates	Can lose intermediate changes
State machine guard	Only accept valid transitions from current state	Strict state integrity	Needs dead-letter handling for rejected events
Event sourcing	Record all events, compute state on read	Audit requirements, complex workflows	Read complexity, storage costs
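
The reorder-buffer row can be sketched as follows, assuming events carry a gap-free sequence number. `ReorderBuffer` and `SequencedEvent` are hypothetical names. This version has no timeout, so a permanently missing event stalls delivery forever; that is exactly the "timeout complexity" trade-off the table mentions.

```typescript
interface SequencedEvent {
  seq: number;
  payload: string;
}

class ReorderBuffer {
  private pending = new Map<number, SequencedEvent>();
  private nextSeq: number;

  constructor(firstSeq: number = 1) {
    this.nextSeq = firstSeq;
  }

  // Returns the events that became deliverable, in sequence order.
  accept(event: SequencedEvent): SequencedEvent[] {
    // Already delivered or already buffered: drop the duplicate.
    if (event.seq < this.nextSeq || this.pending.has(event.seq)) {
      return [];
    }
    this.pending.set(event.seq, event);

    const ready: SequencedEvent[] = [];
    let next = this.pending.get(this.nextSeq);
    while (next !== undefined) {
      ready.push(next);
      this.pending.delete(this.nextSeq);
      this.nextSeq += 1;
      next = this.pending.get(this.nextSeq);
    }
    return ready;
  }

  bufferedCount(): number {
    return this.pending.size;
  }
}

const buffer = new ReorderBuffer();
console.log(buffer.accept({ seq: 2, payload: "paid" }).length);    // 0, waiting for seq 1
console.log(buffer.accept({ seq: 3, payload: "shipped" }).length); // 0
console.log(buffer.accept({ seq: 1, payload: "created" }).map((e) => e.seq)); // [1, 2, 3]
```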

State Recovery

Every state machine should have an answer to: 'If this entity has been stuck in state X for Y hours, what happens?' Stuck states indicate failures that need either automatic recovery, manual intervention, or alerting. Entities stuck in intermediate states indefinitely are a common source of data integrity issues.
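
A periodic sweep for stuck entities might look like the sketch below. The state names, time limits, and the `findStuckEntities` function are hypothetical; what happens to the entities it returns (retry, roll back, or alert) is the design decision this section asks for.

```typescript
interface TrackedEntity {
  id: string;
  state: string;
  enteredStateAt: Date;
}

// Hypothetical per-state limits in milliseconds.
// States without an entry are allowed to last indefinitely.
const maxTimeInStateMs: { [state: string]: number | undefined } = {
  processing: 24 * 60 * 60 * 1000,
  payment_pending: 30 * 60 * 1000,
};

function findStuckEntities(
  entities: TrackedEntity[],
  now: Date,
  limits: { [state: string]: number | undefined } = maxTimeInStateMs,
): TrackedEntity[] {
  return entities.filter((entity) => {
    const limit = limits[entity.state];
    if (limit === undefined) return false; // no timeout defined for this state
    return now.getTime() - entity.enteredStateAt.getTime() > limit;
  });
}

const sweepTime = new Date(Date.UTC(2024, 0, 3, 0, 0, 0));
const stuck = findStuckEntities(
  [
    { id: "order-1", state: "processing", enteredStateAt: new Date(Date.UTC(2024, 0, 1)) },
    { id: "order-2", state: "processing", enteredStateAt: new Date(Date.UTC(2024, 0, 2, 12)) },
    { id: "order-3", state: "delivered", enteredStateAt: new Date(Date.UTC(2020, 0, 1)) },
  ],
  sweepTime,
);
console.log(stuck.map((e) => e.id)); // ["order-1"]
```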

Summary: Edge Case Handling

Edge case handling transforms fragile designs into robust ones. By systematically exploring boundaries, null cases, concurrency scenarios, temporal issues, data patterns, and state transitions, you discover the bugs that would otherwise surface in production.

Key Takeaways

•Edge cases cluster into categories — Boundaries, null/empty, concurrency, time, data patterns, state
•Boundary values reveal validation bugs — Test at minimums, maximums, zeros, and limits
•Null handling must be explicit — Every field, return value, and collection can be absent
•Concurrency bugs depend on timing — Design for idempotency and use proper synchronization
•Time is hard in distributed systems — Use UTC, handle offsets explicitly, trust one time source
•Data patterns in production are messy — Plan for viral content, batch imports, and old data
•State machines surface transition bugs — Formally model states and valid transitions
•At scale, rare events are common — One-in-a-million happens thousands of times daily

What's Next

With requirements verified, bottlenecks analyzed, failure scenarios tested, and edge cases handled, the final step is to synthesize everything into a coherent design summary. The next page covers how to document and present your validated design in a way that communicates its key decisions, trade-offs, and remaining risks.

Page Complete

You now understand how to systematically identify and handle edge cases in system designs. You can analyze boundary conditions, null/empty states, concurrency scenarios, temporal issues, data patterns, and state transitions. Next, we'll examine how to synthesize your validated design into a compelling summary.

Edge Case Handling

Where Normal Ends and Weird Begins

What You Will Master

The Edge Case Taxonomy

Edge cases cluster into recognizable categories. Understanding these categories helps you systematically search for them rather than hoping to stumble upon them.

The Six Categories of Edge Cases

Edge Case Classification
Category	Description	Examples	Design Questions
Boundary values	Inputs at the limits of valid ranges	0, MAX_INT, empty string, exactly-at-limit	What happens at zero? At maximum? Just over the limit?
Empty/null data	Missing or absent values	Null fields, empty lists, missing records	Can every field be null? What does empty mean?
Large-scale data	Unexpectedly large inputs	Million-item lists, GB-sized files, viral content	Is there a size limit? What happens when exceeded?
Concurrent access	Multiple actors affecting same resources	Race conditions, double-submit, simultaneous edits	What happens if two requests arrive at once?
Temporal anomalies	Time-related edge conditions	Clock skew, leap seconds, timezone boundaries	What if clocks are wrong? What about time zones?
Exceptional sequences	Unusual ordering of operations	Out-of-order events, repeated requests, skipped steps	What if steps happen out of order?

Why Edge Cases Matter at Design Time

Many edge cases can't be fixed after the fact—they require architectural changes:

Handling billion-record tables requires partitioning designed upfront
Preventing race conditions requires proper locking strategies in the data model
Processing out-of-order events requires idempotent operations and proper state machines
Supporting leap seconds requires decisions about timestamp representation

These aren't bugs you can patch—they're architectural properties that must be designed in from the start.

The Production Reality

Boundary Value Analysis

The Boundary Points

For any numeric or sized constraint, test at:

Minimum valid: The smallest acceptable value
Below minimum: Just under the minimum (should be rejected)
Maximum valid: The largest acceptable value
Above maximum: Just over the maximum (should be rejected)
Zero: Often a special case
One: Often a special case (single item vs. collection)
Power-of-two boundaries: Where data structures resize

Boundary Analysis Example: Order Quantity
Boundary	Value	Expected Behavior	Design Consideration
Below minimum	0	Reject: cannot order zero items	Validation message, what about cart display?
Minimum valid	1	Accept: minimum order	Is pricing different for single items?
Typical	3	Accept: normal order	Standard processing path
Near maximum	99	Accept: large order	Inventory check, stock availability
Maximum valid	100	Accept: maximum single order	Why this limit? Is it configurable?
Above maximum	101	Reject: over order limit	Clear error message, suggest multiple orders?
Way over	1,000,000	Reject: obvious abuse	Should this trigger fraud detection?

String and Text Boundaries

Strings have their own boundary conditions that often cause more trouble than numeric limits:

String Edge Cases

•Empty string — Is "" valid? Different from null?
•Whitespace only — Is " " valid? Should it be trimmed?
•Maximum length — What's the limit? What happens at limit+1?
•Very long strings — Megabytes of text pasted in a 'name' field
•Unicode edge cases — Multi-byte characters, RTL text, emoji, ZWJ sequences
•Null bytes — Can corrupt C-based systems
•Line breaks — Single-line field with \n pasted
•Special characters — SQL injection, XSS, format string attacks
•Non-printable characters — Control characters, escape sequences

The Zalgo Text Test

Empty and Null Handling

The Null Spectrum

Types of 'Missing' Data
Type	Meaning	Example	Design Decision
Null	Value unknown or not applicable	User phone number not provided	How to display? Filter in queries?
Empty string	Value explicitly set to nothing	User cleared their bio	Different from null? How to distinguish?
Empty collection	Collection with zero items	User has no orders yet	Different from null collection? Display message?
Default value	Value was never set, using default	Account uses default settings	Is there a sentinel value problem?
Tombstone	Value was deleted	User deleted their profile picture	Soft delete vs. hard delete logic
Not yet loaded	Value exists but hasn't been fetched	Lazy-loaded relationship	Loading indicators, error handling

Empty Collection Edge Cases

Empty collections cause subtle bugs that are easy to miss during design review:

empty-collection-edge-cases.ts
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
// Empty collection edge cases in system operations
 
// ❌ Division by zero when calculating average
function calculateAverageOrderValue(orders: Order[]): number {
  const total = orders.reduce((sum, o) => sum + o.value, 0);
  return total / orders.length; // NaN when orders is empty!
}
 
// ✅ Handle empty case explicitly
function calculateAverageOrderValueSafe(orders: Order[]): number | null {
  if (orders.length === 0) return null;
  const total = orders.reduce((sum, o) => sum + o.value, 0);
  return total / orders.length;
}
 
// ❌ Assumes at least one item
function getTopRatedProduct(products: Product[]): Product {
  return products.sort((a, b) => b.rating - a.rating)[0]; // undefined when empty!
}
 
// ✅ Return type reflects possibility of no result
function getTopRatedProductSafe(products: Product[]): Product | null {
  if (products.length === 0) return null;
  return products.sort((a, b) => b.rating - a.rating)[0];
}
 
// ❌ Empty result in database operation
async function getRecentOrders(userId: string): Promise<Order[]> {
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [userId]);
  // What if orders is empty? No error, but caller might assume at least one.
  return orders;
}
 
// ✅ Caller knows empty is a valid state
async function getRecentOrdersSafe(userId: string): Promise<{
  orders: Order[];
  hasOrders: boolean;
}> {
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [userId]);
  return {
    orders,
    hasOrders: orders.length > 0,
  };
}
 
// Edge case: Empty result in aggregation
// ❌ This SQL returns no rows, not a row with 0
// SELECT SUM(value) FROM orders WHERE user_id = 'new_user';
// Result: null, not 0
 
// ✅ Design must account for this
async function getTotalOrderValue(userId: string): Promise<number> {
  const result = await db.query(
    'SELECT COALESCE(SUM(value), 0) as total FROM orders WHERE user_id = ?',
    [userId]
  );
  return result[0].total;
}

API Design Implication

Concurrency Edge Cases

The TOCTOU Gap (Time-of-Check to Time-of-Use)

The most common concurrency bug pattern: you check a condition, then act on it, but the condition changed between check and action.

Converting Mermaid diagram...

Common Concurrency Edge Cases

Concurrency Scenarios to Address
Scenario	Example	Symptom	Solution Pattern
Double submit	User clicks 'Submit' twice quickly	Duplicate orders created	Idempotency keys, dedup window
Lost update	Two users edit same document	Second save overwrites first	Optimistic locking (version numbers)
Phantom read	Read returns different results mid-transaction	Inconsistent data processing	Serializable isolation or snapshot reads
Dirty read	Read uncommitted data that gets rolled back	Actions based on phantom data	Read-committed isolation minimum
Write amplification	N clients simultaneously retry writes	N× database load	Exponential backoff with jitter
Thundering herd	Cache expires, N clients hit database	Database overload on cache miss	Cache warming, request coalescing

Anti-patterns

•Check-then-act without locks
•Assuming sequential execution
•Relying on API ordering
•Unbounded retry loops
•Shared mutable state across instances

Patterns to Apply

•Atomic operations (CAS, increment)
•Pessimistic locking for write-heavy
•Optimistic locking for read-heavy
•Idempotent operations everywhere
•Event sourcing for audit trails

Design for Idempotency

Temporal Edge Cases

Time in Distributed Systems

Different nodes have different times. Network delays mean event ordering is ambiguous. The design must specify how time is handled.

Temporal Edge Cases
Edge Case	Problem	Impact	Design Consideration
Clock skew	Node A's clock is 5 seconds ahead of Node B	'Later' event appears earlier	Use logical clocks (Lamport), or trust one time source
Clock drift	Node clock slowly diverges from true time	Timeouts and TTLs become unreliable	NTP sync, bound acceptable drift
Daylight saving	2 AM happens twice, or not at all	Scheduled jobs fire twice or not at all	Store times in UTC, convert at display only
Leap seconds	23:59:60 exists	Time comparison: 23:59:60 < 00:00:00?	Use TAI or 'smear' leap seconds
Time zone changes	Government changes offset	Historical data becomes misinterpreted	Store offset with timestamp, or use UTC
Midnight boundary	Date rolls over at different times globally	'Today' means different things	Explicit time zone in all date logic

Event Ordering Challenges

In distributed systems, you cannot rely on timestamps for ordering. Messages sent 'later' may arrive 'earlier' due to network delays.

Temporal Design Questions

•What happens if events arrive out of order? — Buffer and reorder? Process anyway? Error?
•How are timestamps generated? — Client time? Server time? Which server?
•What time zone are times stored in? — UTC everywhere is safest
•How are durations calculated? — Cross-DST calculations, cross-timezone spans
•What happens at year/month/day boundaries? — Batch jobs, retention policies
•How is 'now' determined? — In tests? In async processing? In UI display?

temporal-edge-cases.ts
TypeScript
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
// Temporal edge case examples
 
// ❌ This is a latent bug waiting to happen
function calculateAge(birthDate: Date): number {
  const today = new Date();
  return today.getFullYear() - birthDate.getFullYear();
  // Wrong! Someone born Dec 31, 2000 isn't 24 on Jan 1, 2024
}
 
// ✅ Account for whether birthday has occurred this year
function calculateAgeSafe(birthDate: Date): number {
  const today = new Date();
  let age = today.getFullYear() - birthDate.getFullYear();
  const monthDiff = today.getMonth() - birthDate.getMonth();
  if (monthDiff < 0 || (monthDiff === 0 && today.getDate() < birthDate.getDate())) {
    age--;
  }
  return age;
}
 
// ❌ Timezone nightmare waiting to happen
function isWeekend(date: Date): boolean {
  const day = date.getDay();
  return day === 0 || day === 6;
  // But whose weekend? User's timezone? Server's timezone?
}
 
// ✅ Explicit about timezone
function isWeekendInTimezone(date: Date, timezone: string): boolean {
  const options: Intl.DateTimeFormatOptions = { 
    weekday: 'long', 
    timeZone: timezone 
  };
  const dayName = date.toLocaleDateString('en-US', options);
  return dayName === 'Saturday' || dayName === 'Sunday';
}
 
// ❌ Daylight saving time trap
function addDays(date: Date, days: number): Date {
  return new Date(date.getTime() + days * 24 * 60 * 60 * 1000);
  // Breaks when crossing a DST boundary!
  // The calendar day is 23 or 25 hours long on a DST change day,
  // so adding a fixed 24 hours shifts the local wall-clock time by an hour
}
 
// ✅ Use a proper datetime library
// (these are the date-fns-tz v1/v2 names; v3 renamed them to fromZonedTime / toZonedTime)
import { addDays as addDaysFns } from 'date-fns';
import { zonedTimeToUtc, utcToZonedTime } from 'date-fns-tz';
 
function addDaysInTimezone(date: Date, days: number, tz: string): Date {
  const zonedDate = utcToZonedTime(date, tz);
  const newZonedDate = addDaysFns(zonedDate, days);
  return zonedTimeToUtc(newZonedDate, tz);
}

The Leap Year Bug
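
One well-known leap-year trap, offered here as an illustrative sketch rather than an exhaustive treatment, is adding a year to February 29. JavaScript's `Date` silently rolls the result over to March 1. The helper names below are hypothetical, and clamping to February 28 is only one possible policy.

```typescript
// Gregorian leap-year rule: divisible by 4, except centuries not divisible by 400.
function isLeapYear(year: number): boolean {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// Naive approach: Feb 29 + 1 year overflows into March.
const naive = new Date(Date.UTC(2024, 1, 29));
naive.setUTCFullYear(2025);
const naiveIso = naive.toISOString(); // lands on March 1, 2025

// Adds years in UTC and clamps to the last day of the intended month
// when the day does not exist in the target year.
function addYearsClamped(date: Date, years: number): Date {
  const result = new Date(date.getTime());
  const month = result.getUTCMonth();
  result.setUTCFullYear(result.getUTCFullYear() + years);
  if (result.getUTCMonth() !== month) {
    // Day 0 of the current month is the last day of the previous month.
    result.setUTCDate(0);
  }
  return result;
}

const clampedIso = addYearsClamped(new Date(Date.UTC(2024, 1, 29)), 1).toISOString();
```

Whether a February 29 anniversary should fall on February 28 or March 1 is a business rule, so the design should state it explicitly instead of inheriting whatever the runtime does.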

Data Pattern Edge Cases

Users and systems produce data in patterns that developers don't anticipate. Understanding these patterns helps surface edge cases during design.

Unexpected Data Patterns

Data Pattern Edge Cases
Pattern	Description	Example	Design Consideration
Viral content	Single item accessed millions of times	Trending post, breaking news	Cache strategy, hot key handling
Celebrity users	Users with extreme follower counts	User with 100M followers posts	Fan-out strategy, async processing
Batch imports	Large volume inserted at once	Customer uploads 1M records via CSV	Rate limiting, background processing
Delete cascades	Deletion triggers massive cleanup	Deleting user with 10 years of history	Async deletion, soft deletes
Circular references	Data references itself	User 'manages' themselves, circular org chart	Cycle detection, depth limits
Name collisions	Multiple entities with identical names	Two 'John Smith' in same org	Unique constraints, disambiguation
Historical data	Very old data accessed	Accessing records from 2005	Schema migration, data format changes
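
The 'cycle detection, depth limits' consideration for circular references can be sketched as follows. This assumes a hypothetical data shape in which each employee ID maps to a single manager ID. The function name and the default depth limit are illustrative choices.

```typescript
type ChainCheck = 'ok' | 'cycle' | 'too-deep';

// Walks the manager chain upward from `start`.
// Reports a cycle if any ID repeats, and 'too-deep' if the chain
// exceeds maxDepth entries before reaching someone with no manager.
function checkManagerChain(
  managerOf: Map<string, string>,
  start: string,
  maxDepth = 50
): ChainCheck {
  const seen = new Set<string>();
  let current: string | undefined = start;
  while (current !== undefined) {
    if (seen.has(current)) return 'cycle';
    if (seen.size >= maxDepth) return 'too-deep';
    seen.add(current);
    current = managerOf.get(current);
  }
  return 'ok';
}

const healthyOrg = new Map<string, string>([['alice', 'bob'], ['bob', 'carol']]);
const selfManaged = new Map<string, string>([['dave', 'dave']]);
const loopedOrg = new Map<string, string>([['erin', 'frank'], ['frank', 'erin']]);
```

Running the same check at write time (before saving a new manager assignment) keeps cycles out of the data, while the depth limit protects readers against data that predates the check.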

The 10/10/10 Rule

When reviewing a design, ask what happens when data is:

10× expected size: Does the system handle unusually large entities?
10× expected volume: Does the system handle 10× the anticipated load?
10× expected age: Does the system handle data that's been around for 10 years?

Data Quality Edge Cases

•Duplicate data — Same record imported twice; how is deduplication handled?
•Orphaned data — References pointing to deleted entities
•Inconsistent data — Related records that contradict each other
•Encoded data — HTML entities, URL encoding, Unicode normalization
•Schema violations — Data that doesn't match expected format
•Extreme values — Prices of $0, quantities of billions, future dates
•PII in unexpected places — Credit card numbers in 'notes' field
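
Duplicate detection and Unicode normalization interact: two strings can look identical on screen yet differ byte for byte. The sketch below assumes a simple policy (NFC-normalize, trim, lowercase) that will not suit every field. The function names are illustrative.

```typescript
// Canonical comparison key. The exact policy is an assumption;
// case-folding, for example, is wrong for case-sensitive identifiers.
function dedupeKey(raw: string): string {
  return raw.normalize('NFC').trim().toLowerCase();
}

// Keeps the first occurrence of each canonical key.
function dedupe(values: string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const value of values) {
    const key = dedupeKey(value);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(value);
    }
  }
  return unique;
}

const composed = 'Jos\u00e9';    // 'é' as a single code point
const decomposed = 'Jose\u0301'; // 'e' followed by a combining acute accent
const padded = ' JOS\u00c9 ';    // different case plus stray whitespace

const uniqueValues = dedupe([composed, decomposed, padded]);
```

Without normalization, `composed` and `decomposed` compare as different strings and both rows survive an import.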

Test with Production-Like Data

Sequence and State Edge Cases

Stateful systems have edge cases related to state transitions—what happens when operations occur in unexpected sequences, when states are ambiguous, or when transitions are interrupted?

State Machine Analysis

Every entity with lifecycle states needs a formal state machine. This surfaces edge cases that informal thinking misses.

State Edge Case Questions

State Machine Questions

•What are all valid states? — Are there any hidden or implied states?
•What transitions are valid? — Can you go from any state to any other state?
•What happens on invalid transitions? — Error? Ignore? Queue for later?
•What if the same transition is requested twice? — Idempotent? Error? Different behavior?
•What if a transition is interrupted midway? — Transaction rollback? Stuck state?
•Who can trigger each transition? — User? System? Admin? External service?
•Is there a timeout for any state? — Order stuck in 'Processing' for 24 hours?
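
Several of these questions can be answered directly in code by writing the transition table down. The following is a minimal sketch with a hypothetical set of order states. It treats a repeated request as a no-op and an invalid transition as an explicit rejection, which is one possible policy among those listed above.

```typescript
// Hypothetical lifecycle for illustration only.
type OrderState = 'Pending' | 'Processing' | 'Shipped' | 'Cancelled';

// Every state appears as a key, so a missing state is a compile error.
const allowedTransitions: Record<OrderState, readonly OrderState[]> = {
  Pending: ['Processing', 'Cancelled'],
  Processing: ['Shipped', 'Cancelled'],
  Shipped: [],
  Cancelled: [],
};

type TransitionResult =
  | { kind: 'applied'; state: OrderState }
  | { kind: 'noop'; state: OrderState }
  | { kind: 'rejected'; state: OrderState; reason: string };

function transition(current: OrderState, target: OrderState): TransitionResult {
  // Same transition requested twice: idempotent, nothing changes.
  if (current === target) {
    return { kind: 'noop', state: current };
  }
  if (allowedTransitions[current].indexOf(target) !== -1) {
    return { kind: 'applied', state: target };
  }
  return {
    kind: 'rejected',
    state: current,
    reason: current + ' -> ' + target + ' is not allowed',
  };
}

const applied = transition('Pending', 'Processing');
const repeated = transition('Processing', 'Processing');
const invalid = transition('Shipped', 'Pending');
```

The sketch does not cover interrupted transitions, permissions, or state timeouts. Those need persistence and scheduling that a pure function cannot show.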

Out-of-Order Events

In event-driven systems, events may arrive out of chronological order. Your design must specify behavior:

Out-of-Order Event Handling Strategies
Strategy	Description	When to Use	Trade-off
Reorder buffer	Hold events until sequence is complete	Critical ordering, low volume	Latency, memory, timeout complexity
Process anyway	Handle each event independently	Idempotent operations, eventual consistency OK	May need reconciliation
Last-write-wins	Newest timestamp wins	Low-conflict updates	Can lose intermediate changes
State machine guard	Only accept valid transitions from current state	Strict state integrity	Needs dead-letter handling for rejected events
Event sourcing	Record all events, compute state on read	Audit requirements, complex workflows	Read complexity, storage costs
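
As an illustration of the reorder buffer strategy, the sketch below assumes every event carries a gap-free sequence number assigned by the producer. The class and field names are hypothetical, and the timeout handling that the table lists as a trade-off is deliberately left out.

```typescript
interface SequencedEvent<T> {
  seq: number;
  payload: T;
}

class ReorderBuffer<T> {
  private pending = new Map<number, SequencedEvent<T>>();
  private nextSeq: number;

  constructor(firstSeq = 1) {
    this.nextSeq = firstSeq;
  }

  // Accepts one event and returns every event that is now deliverable, in order.
  // Stale events (already delivered) and duplicates are dropped.
  push(event: SequencedEvent<T>): SequencedEvent<T>[] {
    if (event.seq < this.nextSeq || this.pending.has(event.seq)) {
      return [];
    }
    this.pending.set(event.seq, event);

    const ready: SequencedEvent<T>[] = [];
    let next = this.pending.get(this.nextSeq);
    while (next !== undefined) {
      ready.push(next);
      this.pending.delete(this.nextSeq);
      this.nextSeq += 1;
      next = this.pending.get(this.nextSeq);
    }
    return ready;
  }

  get pendingCount(): number {
    return this.pending.size;
  }
}

const buffer = new ReorderBuffer<string>();
const afterSecond = buffer.push({ seq: 2, payload: 'paid' });       // held: seq 1 is missing
const afterFirst = buffer.push({ seq: 1, payload: 'created' });     // releases seq 1, then seq 2
const afterDuplicate = buffer.push({ seq: 1, payload: 'created' }); // dropped as stale
```

If seq 1 never arrives, everything behind it waits indefinitely. A production design needs a policy for that gap (time out and skip, request a resend, or raise an alert).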

State Recovery

Summary: Edge Case Handling

Key Takeaways

•Edge cases cluster into categories — Boundaries, null/empty, concurrency, time, data patterns, state
•Boundary values reveal validation bugs — Test at minimums, maximums, zeros, and limits
•Null handling must be explicit — Every field, return value, and collection can be absent
•Concurrency bugs depend on timing — Design for idempotency and use proper synchronization
•Time is hard in distributed systems — Use UTC, handle offsets explicitly, trust one time source
•Data patterns in production are messy — Plan for viral content, batch imports, and old data
•State machines surface transition bugs — Formally model states and valid transitions
•At scale, rare events are common — One-in-a-million happens thousands of times daily

What's Next
