In event-driven systems, backward compatibility is the principle that ensures new versions of events can be read by consumers built for old versions. It's the producer's promise to consumers: "I will evolve, but I won't break you."
This might seem like a constraint, but it's actually a liberating force. When backward compatibility is guaranteed, producers can deploy independently. There's no need to coordinate rollouts across dozens of consuming services. No more 2 AM deployment windows to update everything simultaneously. No more "flag day" migrations that risk the entire system.
Backward compatibility is not just a technical property—it's an organizational enabler that makes microservices practical at scale.
By the end of this page, you will master the specific techniques for maintaining backward compatibility: additive-only changes, optional fields with defaults, field aliasing, and more. You'll also learn to recognize and avoid changes that break backward compatibility, even when they seem innocuous.
Backward compatibility means that consumers using an older schema version can successfully process events produced with a newer schema version. The "reader" code is older than the "writer" code.
The mental model:
Imagine a time-traveling message. An event produced today (with the latest schema) is consumed by a service deployed six months ago (with an old schema). For backward compatibility, that service must still deserialize the event without errors, find every field it depends on, and safely ignore any fields it doesn't recognize.
The asymmetry of compatibility:
Backward compatibility is asymmetric—it protects consumers but doesn't constrain them. A consumer can choose to upgrade immediately, upgrade on its own schedule, or never upgrade at all.
The key insight is that consumers control their upgrade timeline. They're never forced to change by producer evolution.
Backward compatibility is your safety net during deployments. If a producer deploys a schema update and something goes wrong, consumers continue operating normally on the new events. You have time to fix issues without firefighting across multiple services.
The golden rule of backward compatibility is additive-only changes. You can add new things; you cannot remove or modify existing things.
Safe additive changes:
```typescript
// ORIGINAL SCHEMA (v1)
interface OrderCreatedV1 {
  orderId: string;
  customerId: string;
  items: OrderItem[];
  totalAmount: number;
}

// EVOLVED SCHEMA (v2) - Backward Compatible
interface OrderCreatedV2 {
  orderId: string;
  customerId: string;
  items: OrderItem[];
  totalAmount: number;
  // NEW: Optional fields - old consumers ignore
  currency?: string;             // Added in v2.0
  shippingAddress?: Address;     // Added in v2.0
  estimatedDelivery?: string;    // Added in v2.1
  loyaltyPointsEarned?: number;  // Added in v2.2
}

// Old consumer (built for v1) processing v2 event:
function processOrderV1(event: OrderCreatedV1) {
  // This code works perfectly with v2 events!
  // Extra fields (currency, shippingAddress, etc.) are simply ignored
  // by the type system and runtime JSON parsing.
  console.log(`Order ${event.orderId}: $${event.totalAmount}`);
  sendConfirmation(event.customerId, event.orderId);
}
```

Design consumers as "tolerant readers" that extract only the fields they need and ignore everything else. Libraries like Jackson (Java) and Pydantic (Python) support this with settings like "ignore unknown properties". This future-proofs consumers against additive changes they don't yet know about.
When adding new fields, making them optional with sensible defaults is the key to backward compatibility. But "optional" has nuances across different serialization formats and programming languages.
The default value contract:
A default value answers the question: "What should an old consumer assume when this field is absent?" The answer must be semantically meaningful—not just technically valid.
| Field Type | Good Default | Why It Works | Bad Default |
|---|---|---|---|
| Currency code | "USD" | Explicit default; existing data was USD-based | "" (empty string) |
| Feature flag | false | Absence means feature not enabled | true (would change behavior) |
| Timestamp | null | Absence means 'not recorded' | epoch time (misleading) |
| Count/quantity | 0 | Absence means 'none counted' | -1 (sentinel; error-prone) |
| Priority level | "normal" | Sensible middle ground | "high" (changes behavior) |
| Array/list | [] | Empty collection; no items | null (NPE risk) |
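Applied in consumer code, the table above becomes a small defaulting layer at parse time. A minimal TypeScript sketch—the `parseOrder` helper and its field names are illustrative, not from a specific library:

```typescript
// Sketch: apply sensible defaults when optional fields are absent.
// Field names and default values mirror the table above and are illustrative.
interface ParsedOrder {
  orderId: string;
  currency: string;
  expressShipping: boolean;
  itemCount: number;
  priority: string;
  tags: string[];
}

function parseOrder(raw: Record<string, unknown>): ParsedOrder {
  return {
    orderId: String(raw.orderId), // truly required field
    currency: typeof raw.currency === "string" ? raw.currency : "USD",
    expressShipping:
      typeof raw.expressShipping === "boolean" ? raw.expressShipping : false,
    itemCount: typeof raw.itemCount === "number" ? raw.itemCount : 0,
    priority: typeof raw.priority === "string" ? raw.priority : "normal",
    tags: Array.isArray(raw.tags) ? raw.tags : [],
  };
}

// An old-style event missing every optional field still parses cleanly.
const order = parseOrder({ orderId: "ord-1" });
console.log(order.currency, order.priority); // USD normal
```

Centralizing defaults in one parse function keeps business logic free of "is this field present?" checks.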
```
// JSON Schema with defaults
{
  "type": "object",
  "properties": {
    "orderId": { "type": "string" },
    "priority": {
      "type": "string",
      "enum": ["low", "normal", "high", "urgent"],
      "default": "normal"  // Explicit default
    },
    "expressShipping": {
      "type": "boolean",
      "default": false  // Safe default: feature off
    },
    "metadata": {
      "type": "object",
      "default": {}  // Empty object, not null
    }
  },
  "required": ["orderId"]  // Only truly required fields
}

// Avro schema with defaults
{
  "type": "record",
  "name": "OrderCreated",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "priority", "type": "string", "default": "normal"},
    {"name": "expressShipping", "type": "boolean", "default": false},
    // Union with null as first type = optional with null default
    {"name": "specialInstructions", "type": ["null", "string"], "default": null}
  ]
}

// Protocol Buffers (proto3 has implicit defaults)
message OrderCreated {
  string order_id = 1;

  // proto3: Enums default to first value (0)
  Priority priority = 2;  // Will be PRIORITY_UNSPECIFIED (0) if not set

  // proto3: booleans default to false
  bool express_shipping = 3;

  // Optional wrapper for explicit null handling
  google.protobuf.StringValue special_instructions = 4;
}

enum Priority {
  PRIORITY_UNSPECIFIED = 0;  // Default/unknown
  PRIORITY_LOW = 1;
  PRIORITY_NORMAL = 2;
  PRIORITY_HIGH = 3;
  PRIORITY_URGENT = 4;
}
```

Some formats distinguish between "field is null" and "field is absent". JSON conflates these; Avro separates them; Protobuf uses wrapper types. Understand your format's semantics. A field defaulting to null might still require null-handling code in consumers.
Renaming fields seems simple but is one of the most dangerous changes for backward compatibility. A renamed field appears as a removal (old name) plus an addition (new name)—breaking consumers expecting the old name.
Safe renaming via aliasing:
The solution is field aliasing: the new schema supports both old and new names, allowing gradual migration.
(Diagram: a direct rename from `customerEmail` to `email` breaks consumers that read `customerEmail`; with an alias, reads via both `customerEmail` and `email` succeed.)
```json
// Avro schema with aliases
{
  "type": "record",
  "name": "CustomerUpdated",
  "fields": [
    {"name": "customerId", "type": "string"},
    {
      "name": "email",
      "type": "string",
      "aliases": ["customerEmail", "emailAddress"]  // Supports old names
    },
    {
      "name": "fullName",
      "type": "string",
      "aliases": ["name", "customerName"]
    }
  ]
}
```

```typescript
// Implementation pattern: Producer writes both during transition
class CustomerEventProducer {
  async publishCustomerUpdated(customer: Customer) {
    await this.publish({
      customerId: customer.id,
      // Write BOTH old and new field names during migration
      email: customer.email,
      customerEmail: customer.email,  // Deprecated; kept for compatibility
      fullName: customer.name,
      name: customer.name,            // Deprecated; kept for compatibility
    });
  }
}

// Consumer using tolerant reader pattern
class CustomerConsumer {
  processEvent(event: any) {
    // Try new name first, fall back to old name
    const email = event.email ?? event.customerEmail ?? event.emailAddress;
    const name = event.fullName ?? event.name ?? event.customerName;

    if (!email || !name) {
      throw new Error('Required field missing after alias resolution');
    }

    return { email, name };
  }
}
```

Migration timeline for field renames: (1) add the alias and have the producer write both names, (2) migrate consumers to read the new name, (3) stop writing the old name once all consumers have migrated, typically at the next MAJOR version.
Aliases accumulate over time. A field might have 3-4 historical names. This is fine—schema registries track them, and serialization frameworks resolve them automatically. The bloat is in the schema, not the wire format.
Certain schema changes cannot be made backward compatible. Understanding these is crucial—they require major version bumps and coordinated consumer migration.
Inherently breaking changes:
| Change | Why It Breaks | Workaround |
|---|---|---|
| Remove field 'discount' | Consumer code: order.discount.percentage crashes | Deprecate → sunset → remove in MAJOR |
| int → string for 'quantity' | Consumer: total = quantity * price fails | Add new field; deprecate old |
| Remove enum value 'PENDING' | Historical events with PENDING become invalid | Never remove; mark deprecated |
| 'amount' means USD → EUR | Consumer calculations produce wrong results | New field 'amountEur'; keep 'amount' as USD |
```typescript
// Schema compatibility checker (pseudo-code)
interface CompatibilityResult {
  compatible: boolean;
  breakingChanges: BreakingChange[];
}

function checkBackwardCompatibility(
  oldSchema: Schema,
  newSchema: Schema
): CompatibilityResult {
  const breaking: BreakingChange[] = [];

  // Check for removed fields
  for (const field of oldSchema.requiredFields) {
    if (!newSchema.hasField(field.name)) {
      breaking.push({
        type: 'FIELD_REMOVED',
        field: field.name,
        severity: 'CRITICAL',
        message: `Required field '${field.name}' removed`,
      });
    }
  }

  // Check for type changes
  for (const [name, oldType] of oldSchema.fields) {
    const newType = newSchema.getField(name)?.type;
    if (newType && !isTypeCompatible(oldType, newType)) {
      breaking.push({
        type: 'TYPE_CHANGED',
        field: name,
        oldType: oldType.toString(),
        newType: newType.toString(),
        severity: 'CRITICAL',
      });
    }
  }

  // Check for enum value removal
  for (const [name, oldEnum] of oldSchema.enums) {
    const newEnum = newSchema.getEnum(name);
    const removedValues = oldEnum.values.filter(
      v => !newEnum?.values.includes(v)
    );
    if (removedValues.length > 0) {
      breaking.push({
        type: 'ENUM_VALUES_REMOVED',
        enum: name,
        removed: removedValues,
        severity: 'CRITICAL',
      });
    }
  }

  return {
    compatible: breaking.length === 0,
    breakingChanges: breaking,
  };
}
```

The most insidious breaking change is semantic change without structural change. If `amount` used to be in cents and now is in dollars, no schema checker will catch this. The schema looks identical; the meaning is completely different. Document semantics rigorously and treat semantic changes as breaking.
Enums are particularly tricky for backward compatibility. Adding enum values is safe from the producer side, but consumers may not handle unknown values correctly.
The enum evolution problem:
When a producer adds a new enum value and an old consumer receives it, the outcome varies: the consumer might crash during deserialization, reject the event as invalid, or silently mishandle it.
The consumer's behavior depends on its implementation—which is outside the producer's control.
```typescript
// PRODUCER: Adding new order status
enum OrderStatusV1 {
  PENDING = 'PENDING',
  PROCESSING = 'PROCESSING',
  SHIPPED = 'SHIPPED',
  DELIVERED = 'DELIVERED',
}

enum OrderStatusV2 {
  PENDING = 'PENDING',
  PROCESSING = 'PROCESSING',
  SHIPPED = 'SHIPPED',
  DELIVERED = 'DELIVERED',
  RETURNED = 'RETURNED',  // NEW in v2
  REFUNDED = 'REFUNDED',  // NEW in v2
}

// CONSUMER: How different implementations handle unknown enum

// Pattern 1: BREAKS - Strict enum validation
function handleStatusStrict(status: OrderStatusV1) {
  // TypeScript/Java strict enums throw on unknown value
  // Consumer crashes when receiving 'RETURNED'
}

// Pattern 2: WORKS - String with validation
function handleStatusFlexible(status: string) {
  switch (status) {
    case 'PENDING': return processNewOrder();
    case 'PROCESSING': return processPackaging();
    case 'SHIPPED': return trackShipment();
    case 'DELIVERED': return completeOrder();
    default:
      // Unknown status - log and handle gracefully
      console.warn(`Unknown order status: ${status}`);
      return handleUnknownStatus(status);
  }
}

// Pattern 3: RECOMMENDED - Explicit unknown handling
interface OrderStatus {
  known: KnownOrderStatus | null;
  raw: string;
}

function parseOrderStatus(value: string): OrderStatus {
  const knownStatuses = ['PENDING', 'PROCESSING', 'SHIPPED', 'DELIVERED'];
  return {
    known: knownStatuses.includes(value) ? value as KnownOrderStatus : null,
    raw: value,  // Always preserve original
  };
}

// Consumer can handle known statuses specifically,
// while still preserving unknown statuses for logging/forwarding
```

Design enums as "open" (extensible) rather than "closed" (fixed set). The consumer should assume new values might appear and handle them gracefully—typically by logging and applying default behavior. This makes enum additions truly backward compatible.
Backward compatibility must be tested, not assumed. Manual review misses edge cases. Automated testing catches regressions before they reach production.
Testing strategies:
Schema registry compatibility checks validate that new schemas are compatible with previous versions.
```bash
# Confluent Schema Registry compatibility check
curl -X POST -H "Content-Type: application/json" \
  --data '{"schema": "<new-schema-json>"}' \
  http://schema-registry:8081/compatibility/subjects/order-created/versions/latest

# Response indicates compatibility
{
  "is_compatible": true
}
```

```yaml
# CI/CD integration
- name: Check Schema Compatibility
  run: |
    RESULT=$(curl -s -X POST ...)
    if [ "$(echo $RESULT | jq .is_compatible)" != "true" ]; then
      echo "Schema is NOT backward compatible!"
      exit 1
    fi
```

Synthetic test events often miss edge cases that exist in production. Consider periodic testing with anonymized samples of real historical events. This catches issues with field combinations, edge values, and legacy quirks that don't appear in generated test data.
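Registry checks validate structural compatibility; a plain unit test can additionally confirm that consumer code built for v1 still processes events shaped like v2. A minimal sketch—the types and field names mirror the earlier examples, and `processOrderV1` is a simplified stand-in:

```typescript
// Sketch: verify an old (v1) consumer handles a new (v2) event.
// Types and field names are illustrative, mirroring the examples above.
interface OrderCreatedV1 {
  orderId: string;
  customerId: string;
  totalAmount: number;
}

function processOrderV1(event: OrderCreatedV1): string {
  return `Order ${event.orderId}: $${event.totalAmount} for ${event.customerId}`;
}

// A v2 event contains extra fields the v1 consumer has never seen.
const v2Event = {
  orderId: "ord-123",
  customerId: "cust-456",
  totalAmount: 99.5,
  currency: "USD",          // added in v2
  loyaltyPointsEarned: 10,  // added in v2
};

// The v1 consumer must process it without error, ignoring the extras.
const result = processOrderV1(v2Event);
console.log(result); // Order ord-123: $99.5 for cust-456
```

Running this kind of test in CI against every historical consumer version catches accidental breaking changes before they ship.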
Backward compatibility is a two-way street. While producers must make compatible changes, consumers must be implemented to handle evolution gracefully.
Defensive consumer patterns:
```typescript
// Pattern 1: Tolerant Reader with Fallbacks
class OrderConsumer {
  process(event: unknown): ProcessedOrder {
    // Parse without strict typing
    const raw = event as Record<string, unknown>;

    return {
      // Required fields - fail if missing
      orderId: this.requireString(raw, 'orderId'),
      customerId: this.requireString(raw, 'customerId'),

      // Optional fields - use defaults
      currency: this.optionalString(raw, 'currency', 'USD'),
      priority: this.optionalEnum(raw, 'priority', ['low', 'normal', 'high'], 'normal'),

      // New fields - extract if present for future use
      metadata: this.extractUnknownFields(raw, KNOWN_FIELDS),
    };
  }

  private requireString(obj: Record<string, unknown>, field: string): string {
    const value = obj[field];
    if (typeof value !== 'string' || !value) {
      throw new MissingFieldError(field);
    }
    return value;
  }

  private optionalString(
    obj: Record<string, unknown>,
    field: string,
    defaultValue: string
  ): string {
    const value = obj[field];
    return typeof value === 'string' ? value : defaultValue;
  }
}

// Pattern 2: Version-Aware Router
class VersionAwareConsumer {
  private handlers = new Map<string, EventHandler>([
    ['1.x', new V1Handler()],
    ['2.x', new V2Handler()],
  ]);

  async process(event: VersionedEvent): Promise<void> {
    const version = event.schemaVersion ?? '1.0.0';
    const majorVersion = version.split('.')[0] + '.x';

    const handler = this.handlers.get(majorVersion);
    if (!handler) {
      // Unknown version - log and use latest handler
      console.warn(`Unknown version ${version}, using latest handler`);
      return this.handlers.get('2.x')!.handle(event);
    }

    return handler.handle(event);
  }
}

// Pattern 3: Canonical Internal Model
interface InternalOrder {
  // Internal model is version-independent
  id: string;
  customer: CustomerRef;
  total: Money;
  shipping: ShippingInfo | null;
}

class OrderAdapter {
  // Each version has its own adapter to canonical model
  fromV1(event: OrderCreatedV1): InternalOrder {
    return {
      id: event.orderId,
      customer: { id: event.customerId },
      total: { amount: event.totalAmount, currency: 'USD' },
      shipping: null,  // Not available in v1
    };
  }

  fromV2(event: OrderCreatedV2): InternalOrder {
    return {
      id: event.orderId,
      customer: { id: event.customerId },
      total: { amount: event.totalAmount, currency: event.currency ?? 'USD' },
      shipping: event.shippingAddress ?? null,
    };
  }
}
```

Convert incoming events to an internal canonical model as early as possible. Business logic operates on the canonical model, not raw events. Version-specific parsing is isolated to adapters. This separates evolution concerns from business logic.
Technical solutions alone don't guarantee backward compatibility. Organizational practices embed compatibility thinking into the development workflow.
Key practices:
```
# Schema changes require architecture team review
/schemas/            @architecture-team
/events/             @architecture-team @platform-team

# Specific high-impact schemas require broader review
/schemas/order-*     @architecture-team @commerce-team @analytics-team
/schemas/payment-*   @architecture-team @finance-team @compliance-team
```

The schema change checklist:
## Schema Change Checklist
- [ ] Change is backward compatible (or MAJOR version bump justified)
- [ ] New fields have sensible defaults
- [ ] Enum additions documented for consumer impact
- [ ] Field renames include aliases
- [ ] Compatibility tests updated
- [ ] Consumer teams notified
- [ ] Deprecation timeline set (if removing/changing)
- [ ] Schema registry validation passed
Track and display compatibility metrics: number of schema versions in production, consumer version distribution, deprecation countdown timers. Visibility creates accountability and helps teams prioritize migration work.
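One lightweight way to surface the consumer version distribution is to tally the schema version carried on each consumed event. A small sketch, assuming events carry an optional `schemaVersion` field as in the earlier router example:

```typescript
// Sketch: tally which schema versions are actually flowing in production.
// The schemaVersion field on the event envelope is an assumption.
class VersionMetrics {
  private counts: Record<string, number> = {};

  record(event: { schemaVersion?: string }): void {
    // Missing version field is treated as the oldest version, 1.0.0
    const major = (event.schemaVersion ?? "1.0.0").split(".")[0];
    this.counts[major] = (this.counts[major] ?? 0) + 1;
  }

  // Distribution as rounded percentages, for a dashboard or periodic log line
  distribution(): Record<string, number> {
    const total = Object.values(this.counts).reduce((a, b) => a + b, 0);
    const out: Record<string, number> = {};
    for (const major of Object.keys(this.counts)) {
      out[`v${major}`] = Math.round((this.counts[major] / total) * 100);
    }
    return out;
  }
}

const metrics = new VersionMetrics();
metrics.record({ schemaVersion: "2.1.0" });
metrics.record({ schemaVersion: "2.3.0" });
metrics.record({ schemaVersion: "1.4.0" });
metrics.record({}); // no version field
console.log(metrics.distribution()); // { v1: 50, v2: 50 }
```

Exported to a metrics system, these counts show when the last v1 consumer has migrated and a deprecated field can finally be retired.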
Backward compatibility enables producer evolution without breaking consumers. The key takeaways:

- Make additive-only changes; never remove or modify existing fields.
- Give new fields semantically meaningful defaults, not just technically valid ones.
- Rename fields via aliases, writing both names during the migration window.
- Treat enums as open sets; consumers must handle unknown values gracefully.
- Test compatibility automatically in CI, backed by schema registry checks.
What's next:
Backward compatibility protects old consumers. The next page explores forward compatibility—ensuring old producers work with new consumers, which is essential for rolling deployments and consumer-ahead updates.
You now have a comprehensive understanding of backward compatibility in event schemas. You can implement additive changes, handle field renames safely, avoid breaking changes, and set up testing and organizational practices. Next, we'll explore the complementary concept of forward compatibility.