Imagine a distributed system with 100 microservices, 500 event types, and thousands of schema versions accumulated over five years. How does a consumer know what fields to expect? How does a producer validate that its schema won't break consumers? How do you discover all the events flowing through your system?
The answer is a schema registry—a centralized service that stores, versions, and validates event schemas. It's the single source of truth for event contracts across your entire organization.
A schema registry transforms schema management from tribal knowledge ("Ask the Order team what fields they send") to discoverable infrastructure ("Query the registry for OrderCreated v2.3").
By the end of this page, you will understand what schema registries are, why they're essential for event-driven systems, how to use them effectively, and how to integrate them into your CI/CD pipelines for automated compatibility enforcement.
A schema registry is a centralized service that manages the schemas for your event-driven system. It provides:
Core capabilities:
The schema registry in the event flow:
Schema registries typically assign both a globally unique ID (e.g., 12345) and a version number (e.g., 3) to each schema. The ID is immutable and used for runtime lookup. The version is human-readable and used for compatibility comparison.
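To make the ID/version distinction concrete, here is a hypothetical in-memory sketch (the `MiniRegistry` class and its method names are illustrative, not a real client API): IDs are global and immutable, versions are sequential per subject, and re-registering an identical schema returns the existing ID rather than minting a new one.

```typescript
// Hypothetical in-memory registry illustrating ID vs. version semantics.
class MiniRegistry {
  private nextId = 1;
  private byId = new Map<number, string>();
  private bySubject = new Map<string, { id: number; version: number; schema: string }[]>();

  register(subject: string, schema: string): { id: number; version: number } {
    const versions = this.bySubject.get(subject) ?? [];
    // Identical schema re-registered under the same subject keeps its ID/version
    const existing = versions.find((v) => v.schema === schema);
    if (existing) return { id: existing.id, version: existing.version };

    const id = this.nextId++;            // globally unique, immutable
    const version = versions.length + 1; // sequential within the subject
    versions.push({ id, version, schema });
    this.bySubject.set(subject, versions);
    this.byId.set(id, schema);
    return { id, version };
  }

  // Runtime lookup path: consumers resolve the ID embedded in each message
  getById(id: number): string | undefined {
    return this.byId.get(id);
  }
}

const reg = new MiniRegistry();
const a = reg.register('order-created-value', '{"v":1}'); // { id: 1, version: 1 }
const b = reg.register('order-created-value', '{"v":2}'); // { id: 2, version: 2 }
```

Note that the version only has meaning relative to its subject, while the ID is meaningful system-wide, which is why it is the ID that gets embedded in message payloads.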
You could manage schemas without a registry—store them in Git, share them via documentation, embed them in events. But as systems grow, this approach collapses. Here's why registries become essential:
Problems without a registry:
| Challenge | Without Registry | With Registry |
|---|---|---|
| Discovery | Search Git repos; ask on Slack | Query API; browse catalog |
| Version history | Git blame; no semantic versioning | First-class version tracking |
| Compatibility | Manual review; hope for the best | Automated validation on commit |
| Runtime lookup | Embed full schema in each event | Embed schema ID; fetch on demand |
| Governance | Documentation (often outdated) | Metadata, ownership, lifecycle |
| Serialization | JSON everywhere (size, no types) | Binary formats with schema |
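The "runtime lookup" and "serialization" rows above come down to a few bytes of framing. A rough sketch (the values and schema below are made up for illustration): without a registry, every event must carry its full schema text to be self-describing; with one, each event carries only a 5-byte prefix referencing the registered schema.

```typescript
// Illustrative comparison of per-event overhead (values are hypothetical).
const schemaText = JSON.stringify({
  type: 'record',
  name: 'OrderCreated',
  fields: [{ name: 'orderId', type: 'string' }],
});
const payload = Buffer.from('binary-avro-data'); // stand-in for encoded fields

// Without a registry: each event carries the full schema text
const selfDescribing = Buffer.concat([Buffer.from(schemaText), payload]);

// With a registry: each event carries magic byte (0x00) + 4-byte schema ID
const header = Buffer.alloc(5);
header.writeInt32BE(1234, 1); // bytes 1-4 hold the registry-assigned ID
const registryFramed = Buffer.concat([header, payload]);

const overheadWith = registryFramed.length - payload.length;    // constant 5 bytes
const overheadWithout = selfDescribing.length - payload.length; // grows with the schema
```

The registry overhead stays constant as schemas grow, which is what makes compact binary formats like Avro practical at high message volumes.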
Concrete benefits:
Consider a schema registry when you have more than 10 event types OR more than 10 services. Below this threshold, informal processes might work. Above it, the coordination cost of informal management exceeds the operational cost of a registry.
Schema registries follow a common architectural pattern, regardless of implementation:
Core components:
```typescript
// Conceptual schema registry architecture
interface SchemaRegistry {
  // Subject: Logical grouping of schema versions (e.g., "order-created")
  // Schema: Definition in specific format (Avro, Protobuf, JSON Schema)
  // SchemaId: Globally unique identifier
  // Version: Sequential version within subject

  // Core CRUD operations
  registerSchema(subject: string, schema: Schema): Promise<RegisterResult>;
  getSchema(schemaId: number): Promise<Schema>;
  getSchemaByVersion(subject: string, version: number): Promise<Schema>;
  getLatestSchema(subject: string): Promise<Schema>;
  listVersions(subject: string): Promise<VersionInfo[]>;
  listSubjects(): Promise<string[]>;

  // Compatibility operations
  checkCompatibility(subject: string, newSchema: Schema): Promise<CompatResult>;
  getCompatibilityConfig(subject: string): Promise<CompatibilityMode>;
  setCompatibilityConfig(subject: string, mode: CompatibilityMode): Promise<void>;

  // Lifecycle operations
  deleteSchema(subject: string, version: number): Promise<void>;
  deleteSubject(subject: string): Promise<void>;
}

interface RegisterResult {
  schemaId: number; // Globally unique ID
  version: number;  // Version within subject
  isNew: boolean;   // false if identical schema already exists
}

interface VersionInfo {
  version: number;
  schemaId: number;
  createdAt: Date;
  isDeprecated: boolean;
}

enum CompatibilityMode {
  NONE = 'NONE',         // No compatibility check
  BACKWARD = 'BACKWARD', // New can read old
  BACKWARD_TRANSITIVE = 'BACKWARD_TRANSITIVE',
  FORWARD = 'FORWARD',   // Old can read new
  FORWARD_TRANSITIVE = 'FORWARD_TRANSITIVE',
  FULL = 'FULL',         // Both directions
  FULL_TRANSITIVE = 'FULL_TRANSITIVE', // Both directions, all versions
}
```

Storage patterns:
Subject naming conventions:
| Convention | Example | When to Use |
|---|---|---|
| Topic name | orders-created | Single event type per topic |
| Topic + type | orders-created-value | Kafka-style; key and value schemas |
| Domain.event | commerce.OrderCreated | Enterprise namespacing |
| Service.event | order-service/OrderCreated | Service ownership clarity |
Choose a subject naming strategy early and enforce it. The subject name determines what schemas are compared for compatibility. 'TopicNameStrategy' (default) means all schemas on a topic must be compatible. 'RecordNameStrategy' allows different event types on the same topic with independent compatibility.
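The difference between the two strategies can be sketched as a simple mapping (this mirrors how Confluent's serializers derive subjects for value schemas, though the `subjectFor` helper itself is illustrative):

```typescript
// How subject naming strategies derive the subject used for compatibility checks.
type Strategy = 'TopicNameStrategy' | 'RecordNameStrategy';

function subjectFor(strategy: Strategy, topic: string, recordFullName: string): string {
  switch (strategy) {
    case 'TopicNameStrategy':
      // All value schemas on the topic share one subject,
      // so they form a single compatibility lineage
      return `${topic}-value`;
    case 'RecordNameStrategy':
      // One subject per record type: each event type on the topic
      // evolves under its own compatibility rules
      return recordFullName;
  }
}

const s1 = subjectFor('TopicNameStrategy', 'orders', 'commerce.OrderCreated');  // "orders-value"
const s2 = subjectFor('RecordNameStrategy', 'orders', 'commerce.OrderCreated'); // "commerce.OrderCreated"
```

Because compatibility is checked per subject, this choice directly determines which schema changes the registry will accept on a shared topic.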
Several schema registries are available, each with different strengths:
Major options:
Confluent Schema Registry is the most widely used, especially in Kafka ecosystems.
Strengths:
Limitations:
```bash
# Register a schema with Confluent Schema Registry
# (inner quotes of the schema string must be escaped inside the JSON wrapper)
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\":\"record\",\"name\":\"OrderCreated\",\"fields\":[...]}"}' \
  http://localhost:8081/subjects/order-created-value/versions
# Response: {"id": 1}

# Fetch schema by ID
curl http://localhost:8081/schemas/ids/1

# Check compatibility before registering
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "<new-schema>"}' \
  http://localhost:8081/compatibility/subjects/order-created-value/versions/latest
```

If you're using Kafka heavily, Confluent is the safe choice. For broker-agnostic or multi-broker environments, Apicurio offers flexibility. For cloud-native architectures, managed registries reduce operational burden but may limit portability.
Producers interact with the schema registry to register schemas and serialize events. The integration can be design-time (schema registered during build) or runtime (schema registered on first use).
Design-time registration (recommended):
```
// Design-time: Register schema during CI/CD pipeline
// (gradle/maven task or npm script)

// build.gradle.kts
plugins {
  id("com.github.imflog.kafka-schema-registry-gradle-plugin")
}

schemaRegistry {
  url.set("http://schema-registry:8081")
  register {
    subject("order-created-value", "src/main/avro/OrderCreated.avsc")
    subject("order-updated-value", "src/main/avro/OrderUpdated.avsc")
  }
  compatibility {
    subject("order-created-value", "src/main/avro/OrderCreated.avsc")
  }
}

// CI/CD pipeline
// 1. ./gradlew schemaRegistryCompatibilityCheck (fail if incompatible)
// 2. ./gradlew schemaRegistryRegister (register new version)
// 3. Deploy application

// Application uses known schema ID
const SCHEMA_ID = process.env.ORDER_CREATED_SCHEMA_ID; // 42

async function publishOrder(order: Order) {
  const encoded = avroEncode(order, await getSchemaById(SCHEMA_ID));
  await kafka.send({
    topic: 'orders.created',
    messages: [{
      key: order.id,
      value: encoded,
      headers: { 'schema-id': SCHEMA_ID.toString() },
    }],
  });
}
```

Runtime registration (alternative):
```typescript
// Runtime: Register schema on first message
// Simpler setup but less control

import { SchemaRegistry, AvroSerializer } from '@kafkajs/confluent-schema-registry';

const registry = new SchemaRegistry({
  host: 'http://schema-registry:8081',
});

// Serializer auto-registers schema if not exists
const serializer = new AvroSerializer(registry, {
  autoRegisterSchemas: true, // Registers on first use
  subject: 'order-created-value',
});

async function publishOrder(order: Order) {
  // Serializer embeds schema ID in payload magic byte
  const encoded = await serializer.serialize(order, schema);
  await kafka.send({
    topic: 'orders.created',
    messages: [{
      key: order.id,
      value: encoded, // Magic byte + schema ID + Avro data
    }],
  });
}

// Warning: Runtime registration risks
// - No compatibility check before production
// - Schema must be identical across producer instances
// - Network failures during registration block publishing
```

Design-time registration catches compatibility issues in CI before deployment. Runtime registration might successfully register an incompatible schema in production. Prefer design-time for production systems.
Consumers fetch schemas from the registry to deserialize events. This can happen on-demand (per message) or cached (fetch once, reuse).
On-demand with caching (typical pattern):
```typescript
// Consumer: Deserialize with schema from registry
import { SchemaRegistry, AvroDeserializer } from '@kafkajs/confluent-schema-registry';

const registry = new SchemaRegistry({
  host: 'http://schema-registry:8081',
  // Cache schemas to avoid repeated fetches
  schemaCache: new Map(),
  idCache: new Map(),
});

const deserializer = new AvroDeserializer(registry);

await consumer.run({
  eachMessage: async ({ message }) => {
    // Deserializer reads schema ID from message prefix,
    // fetches schema from registry (cached after first fetch),
    // then deserializes the Avro data using that schema
    const order = await deserializer.deserialize(message.value);
    await processOrder(order);
  },
});

// Under the hood:
// 1. Read magic byte (0x00) and schema ID (4 bytes) from message prefix
// 2. Check cache for schema ID
// 3. If not cached, fetch GET /schemas/ids/{id}
// 4. Cache schema for future messages
// 5. Deserialize remaining bytes with schema
```

Consumer schema evolution patterns:
```typescript
// Pattern 1: Evolve with registry (specific reader schema)
const registry = new SchemaRegistry({ host: '...' });

// Consumer specifies reader schema (its view of the data)
const readerSchema = await registry.getLatestSchemaForSubject('order-created-value');
const deserializer = new AvroDeserializer(registry, {
  readerSchema: readerSchema, // Use latest schema we understand
});

// Avro resolves differences between writer and reader schemas
// - Extra fields in writer: ignored
// - Missing fields in writer: use defaults from reader
// - Type promotions: automatic (int -> long)

// Pattern 2: Multiple schema versions support
class OrderConsumer {
  private schemaHandlers = new Map<number, (data: any) => Order>();

  async initialize() {
    // Fetch all known schema versions
    const versions = await registry.getAllVersions('order-created-value');
    for (const version of versions) {
      const schema = await registry.getSchemaByVersion('order-created-value', version);
      this.schemaHandlers.set(schema.id, this.createHandler(schema));
    }
  }

  async process(message: Buffer): Promise<Order> {
    const schemaId = readSchemaId(message); // Read from magic bytes
    let handler = this.schemaHandlers.get(schemaId);
    if (!handler) {
      // Unknown schema - fetch and add dynamically
      const schema = await registry.getSchema(schemaId);
      handler = this.createHandler(schema);
      this.schemaHandlers.set(schemaId, handler);
    }
    const writerSchema = await registry.getSchema(schemaId); // Served from cache
    const rawData = deserialize(message, writerSchema);
    return handler(rawData);
  }
}
```

Fetch and cache relevant schemas when the consumer starts, rather than on first message. This prevents latency spikes on the first message of each schema version and fails fast if the registry is unreachable.
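The `readSchemaId` step above can be made concrete. A minimal sketch of parsing the Confluent wire format (magic byte `0x00`, then a 4-byte big-endian schema ID, then the serialized payload); the framing constants match the documented format, while the function shape is illustrative:

```typescript
// Parse the Confluent wire format: 0x00 | 4-byte big-endian schema ID | payload
function readSchemaId(message: Buffer): { schemaId: number; payload: Buffer } {
  if (message.length < 5 || message[0] !== 0x00) {
    throw new Error('Not a schema-registry framed message');
  }
  return {
    schemaId: message.readInt32BE(1), // bytes 1-4: registry-assigned schema ID
    payload: message.subarray(5),     // remaining bytes: serialized event data
  };
}

// Usage: frame a payload the way a producer would, then parse it back
const idBytes = Buffer.alloc(4);
idBytes.writeInt32BE(42, 0);
const framed = Buffer.concat([Buffer.from([0x00]), idBytes, Buffer.from('avro-bytes')]);

const { schemaId, payload } = readSchemaId(framed);
```

This is the same 5-byte prefix the serializers in the earlier examples write, which is why producer and consumer must agree on using (or not using) registry framing for a given topic.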
The killer feature of schema registries is automated compatibility enforcement. The registry rejects incompatible schemas before they reach production.
How compatibility checking works:
```typescript
// Compatibility check flow

// 1. Developer proposes new schema version
const newSchema = `{
  "type": "record",
  "name": "OrderCreated",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "customerId", "type": "string"},
    {"name": "totalAmount", "type": "double"},
    {"name": "currency", "type": "string"}  // NEW: Added without default!
  ]
}`;

// 2. Schema registry checks against existing versions
const checkResult = await registry.testCompatibility(
  'order-created-value',
  newSchema
);

// 3. Registry returns compatibility result
console.log(checkResult);
// {
//   isCompatible: false,
//   errors: [
//     {
//       errorType: 'READER_FIELD_MISSING_DEFAULT_VALUE',
//       message: 'Field "currency" has no default value'
//     }
//   ]
// }

// 4. Registration rejected
try {
  await registry.register('order-created-value', newSchema);
} catch (error) {
  // SchemaRegistryError: Schema is not compatible
}

// 5. Developer fixes schema (add default)
const fixedSchema = `{
  "type": "record",
  "name": "OrderCreated",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "customerId", "type": "string"},
    {"name": "totalAmount", "type": "double"},
    {"name": "currency", "type": "string", "default": "USD"}  // FIXED!
  ]
}`;

// 6. New check passes
const fixedResult = await registry.testCompatibility(
  'order-created-value',
  fixedSchema
);
// { isCompatible: true, errors: [] }

// 7. Registration succeeds
await registry.register('order-created-value', fixedSchema);
```

CI/CD integration for schema validation:
```yaml
name: Schema Compatibility Check

on:
  pull_request:
    paths:
      - 'schemas/**'
      - 'src/main/avro/**'
  push:
    branches: [main]

jobs:
  check-compatibility:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0 # needed so origin/main is available for git diff

      - name: Check Schema Compatibility
        run: |
          # Check each modified schema against registry.
          # The registry expects a JSON wrapper: {"schema": "<escaped schema text>"}
          for schema_file in $(git diff --name-only origin/main -- 'schemas/*.avsc'); do
            subject=$(basename "$schema_file" .avsc)-value
            echo "Checking $subject..."
            response=$(curl -s -X POST \
              -H "Content-Type: application/vnd.schemaregistry.v1+json" \
              --data "$(jq -n --rawfile s "$schema_file" '{schema: $s}')" \
              "$SCHEMA_REGISTRY_URL/compatibility/subjects/$subject/versions/latest")
            if [ "$(echo "$response" | jq .is_compatible)" != "true" ]; then
              echo "❌ Schema $subject is NOT compatible!"
              echo "$response" | jq .
              exit 1
            fi
            echo "✅ Schema $subject is compatible"
          done

      - name: Register Schemas (on merge)
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        run: |
          for schema_file in schemas/*.avsc; do
            subject=$(basename "$schema_file" .avsc)-value
            curl -X POST \
              -H "Content-Type: application/vnd.schemaregistry.v1+json" \
              --data "$(jq -n --rawfile s "$schema_file" '{schema: $s}')" \
              "$SCHEMA_REGISTRY_URL/subjects/$subject/versions"
          done
```

Configure CI to block pull request merges when schema compatibility checks fail. This is your last line of defense before an incompatible schema reaches production. Never allow "force push" to bypass this check.
Beyond storage and validation, mature organizations implement schema governance—policies and practices for managing schemas as assets.
Governance dimensions:
```typescript
// Schema metadata for governance
interface SchemaMetadata {
  // Ownership
  owner: {
    team: string;         // "commerce-team"
    contact: string;      // "commerce@company.com"
    slackChannel: string; // "#commerce-platform"
  };

  // Lifecycle
  lifecycle: {
    status: 'DRAFT' | 'ACTIVE' | 'DEPRECATED' | 'SUNSET' | 'RETIRED';
    createdAt: Date;
    deprecatedAt?: Date;
    sunsetDate?: Date;
    successor?: string; // New schema replacing this one
  };

  // Documentation
  documentation: {
    description: string;
    usageNotes: string;
    samplePayload: string;
    changeLog: ChangeLogEntry[];
  };

  // Compliance
  compliance: {
    containsPII: boolean;
    dataClassification: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL' | 'RESTRICTED';
    retentionPolicy: string;
    gdprRelevant: boolean;
  };

  // Dependencies
  consumers: ConsumerInfo[]; // Known consumers of this schema
  dependencies: string[];    // Other schemas this schema references
}

// Apicurio-style metadata rules
const schemaRules = [
  // Require description
  {
    type: 'VALIDITY',
    config: {
      requireDescription: true,
      minDescriptionLength: 50,
    },
  },
  // Require owner metadata
  {
    type: 'METADATA',
    config: {
      requiredLabels: ['owner-team', 'data-classification'],
    },
  },
  // Naming convention
  {
    type: 'NAMING',
    config: {
      pattern: '^[a-z]+-[a-z]+(-[a-z]+)*$', // kebab-case
      message: 'Subject names must be kebab-case',
    },
  },
];
```

| State | Description | Actions Allowed | Consumer Guidance |
|---|---|---|---|
| DRAFT | Under development; not for consumption | Create, update, delete | Do not consume |
| ACTIVE | Production-ready; fully supported | Minor updates (compatible) | Safe to consume |
| DEPRECATED | Supported but discouraged | Bug fixes only | Migrate to successor |
| SUNSET | End of support announced | Bug fixes only (emergency) | Must migrate by date |
| RETIRED | No longer supported | None (read-only) | Must have migrated |
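The lifecycle table above implies a state machine. A minimal sketch of enforcing it, assuming a strictly linear progression (DRAFT through RETIRED, with no skipping or reactivation; your organization's policy may allow more transitions):

```typescript
// Hypothetical enforcement of the schema lifecycle state machine.
type LifecycleState = 'DRAFT' | 'ACTIVE' | 'DEPRECATED' | 'SUNSET' | 'RETIRED';

const allowedTransitions: Record<LifecycleState, LifecycleState[]> = {
  DRAFT: ['ACTIVE'],      // promote once production-ready
  ACTIVE: ['DEPRECATED'], // discourage before announcing end of support
  DEPRECATED: ['SUNSET'], // announce the migration deadline
  SUNSET: ['RETIRED'],    // retire after the deadline passes
  RETIRED: [],            // terminal; registry entry becomes read-only
};

function canTransition(from: LifecycleState, to: LifecycleState): boolean {
  return allowedTransitions[from].includes(to);
}
```

Encoding the transitions as data (rather than scattered `if` checks) makes the policy easy to audit and to surface in a schema catalog UI.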
Invest in a schema catalog UI that makes schemas browsable and searchable. Developers should be able to discover schemas, see their documentation, view sample payloads, and understand who owns them—all without reading code or asking on Slack.
The schema registry is on the critical path for every event published and consumed. Its reliability and performance directly impact your event-driven system.
Performance considerations:
```typescript
// Best practices for registry performance
import { LRUCache } from 'lru-cache';

// 1. Client-side caching with TTL
const schemaCache = new LRUCache<number, Schema>({
  max: 1000,           // Max schemas to cache
  ttl: 1000 * 60 * 60, // 1 hour TTL (schemas rarely change)
});

async function getSchema(schemaId: number): Promise<Schema> {
  const cached = schemaCache.get(schemaId);
  if (cached) return cached;

  const schema = await registry.getSchema(schemaId);
  schemaCache.set(schemaId, schema);
  return schema;
}

// 2. Warm cache on startup
async function warmSchemaCache(subjects: string[]) {
  console.log('Warming schema cache...');
  for (const subject of subjects) {
    const versions = await registry.getAllVersions(subject);
    for (const version of versions) {
      const schema = await registry.getSchemaByVersion(subject, version);
      schemaCache.set(schema.id, schema);
    }
  }
  console.log(`Cached ${schemaCache.size} schemas`);
}

// 3. Fallback for registry unavailability
class ResilientSchemaClient {
  private localFallback: Map<number, Schema>;

  constructor() {
    // Load critical schemas from bundled backup
    this.localFallback = loadBundledSchemas();
  }

  async getSchema(schemaId: number): Promise<Schema> {
    try {
      return await this.fetchWithRetry(schemaId);
    } catch (error) {
      // Registry unreachable - use bundled fallback
      const fallback = this.localFallback.get(schemaId);
      if (fallback) {
        logger.warn(`Using bundled schema for ID ${schemaId}`);
        return fallback;
      }
      throw new Error(`Schema ${schemaId} unavailable; no fallback`);
    }
  }
}

// 4. Health check and alerting
async function checkRegistryHealth(): Promise<HealthStatus> {
  const start = Date.now();
  try {
    await registry.listSubjects();
    const latency = Date.now() - start;
    return {
      status: latency < 100 ? 'HEALTHY' : 'DEGRADED',
      latency,
    };
  } catch (error) {
    return { status: 'UNHEALTHY', error: error.message };
  }
}
```

If the schema registry becomes unavailable and caches expire, producers and consumers may fail. Ensure high availability through replication, implement generous client-side caching, and consider bundling critical schemas as a last-resort fallback.
A schema registry centralizes schema management, enabling discovery, validation, and governance at scale. Let's consolidate the key takeaways:
What's next:
With a schema registry providing centralized management, the final page explores migration strategies: how to evolve schemas when changes can't be backward compatible, including event upcasting, dual writes, and coordinated migrations.
You now understand schema registries: their purpose, architecture, integration patterns, and operational considerations. You can select, deploy, and integrate a schema registry that enforces compatibility and enables schema discovery across your organization.