High-Level Design - Learning Module

Data Flow Diagrams: Tracing Information Through Systems

Following the Breadcrumbs

Architecture diagrams show what components exist and how they're connected. But they don't tell the complete story. They're like a city map showing buildings and roads—useful, but they don't show the actual traffic: where it originates, how it flows through intersections, and where it ultimately arrives.

Data flow diagrams (DFDs) answer the question: 'What happens to information as it moves through the system?' They trace the journey of data from user input through transformations, validations, enrichments, and ultimately to storage or response.

For system designers, understanding data flow is crucial because:

It reveals processing bottlenecks before they manifest in production
It identifies where data transformations create inconsistency risks
It exposes hidden dependencies between seemingly unrelated components
It forms the basis for understanding latency, throughput, and error propagation

This page teaches you to model data flows effectively, communicate them clearly, and use them to identify design issues before implementation.

What You Will Learn

By the end of this page, you will understand the difference between architecture diagrams and data flow diagrams, master DFD notation and modeling techniques, learn to trace synchronous and asynchronous data paths, and recognize flow patterns that indicate design problems.

What Are Data Flow Diagrams?

A data flow diagram visualizes how data moves through a system. While architecture diagrams focus on components and their relationships, DFDs focus on information—where it comes from, how it's transformed, and where it ends up.

Key Distinctions

Architecture Diagram: Shows components, services, databases, and their connections

'The Order Service connects to the Order Database'
'Messages flow through the Event Bus'

Data Flow Diagram: Shows information movement and transformation

'User input becomes validated order data, which is enriched with pricing, then persisted'
'Raw events are aggregated, filtered, and written to time-series storage'

Both are essential. Architecture diagrams are the structural blueprint; data flow diagrams are the plumbing schematics showing how information moves through those structures.

Architecture Diagrams vs. Data Flow Diagrams
Aspect	Architecture Diagram	Data Flow Diagram
Focus	Components and connections	Information movement and transformation
Nodes represent	Services, databases, infrastructure	Processes, data stores, external entities
Edges represent	API calls, protocols, dependencies	Data in motion, information transfer
Answers	'What exists and how is it connected?'	'What happens to data as it moves?'
Best for	Deployment, scaling, technology choices	Processing logic, latency analysis, data integrity

DFD Applications in System Design

Data flow diagrams are particularly valuable for:

1. End-to-end request tracing: Showing how a user request transforms through the system

2. Data pipeline visualization: Modeling ETL, stream processing, and analytics flows

3. Latency analysis: Identifying which paths add the most processing time

4. Consistency analysis: Understanding where data might become stale or inconsistent

5. Error propagation: Tracing how failures in one area affect downstream processing

6. Compliance mapping: Showing how sensitive data flows for GDPR, HIPAA, PCI compliance

When to Use Each Diagram Type

Use architecture diagrams to explain 'what we're building.' Use data flow diagrams to explain 'how it processes information.' In interviews, you'll typically draw architecture first, then trace specific data flows when the interviewer asks about particular scenarios.

DFD Notation and Elements

Traditional DFD notation (Gane-Sarson or Yourdon-DeMarco) uses four fundamental elements. Modern system design often adapts these while preserving the core concepts.

Element 1: External Entities

What they are: Sources or destinations of data outside the system boundary

Representation: Rectangles or squares (sometimes with shadows)

Examples: Users, external APIs, partner systems, IoT devices

In modern systems: Mobile apps, web browsers, third-party services

Element 2: Processes

What they are: Transformations that change, validate, or route data

Representation: Circles (Yourdon-DeMarco) or rounded rectangles (Gane-Sarson)

Examples: Validate input, calculate price, enrich with metadata, format response

In modern systems: Microservices, functions, workers, processing stages

Element 3: Data Stores

What they are: Repositories where data is persisted or cached

Representation: Open-ended rectangles (Gane-Sarson) or two parallel lines (Yourdon-DeMarco)

Examples: Databases, caches, file systems, message logs

In modern systems: PostgreSQL, Redis, S3, Kafka (when used for storage)

Element 4: Data Flows

What they are: Movement of data between other elements

Representation: Arrows, labeled with the data being transferred

Examples: Order data, User credentials, Payment confirmation, Event payload

Critical: Flows must always be labeled—unlabeled arrows are meaningless

DFD Element Quick Reference
Element	Symbol	Naming Convention	Examples
External Entity	Rectangle	Noun (who/what)	Customer, Partner API, Mobile App
Process	Circle/Rounded Rect	Verb phrase (what it does)	Validate Order, Calculate Tax, Send Notification
Data Store	Open rectangle (||)	Noun (what it stores)	Orders, User Profiles, Event Log
Data Flow	Arrow	Noun (what moves)	Order request, Validated order, Confirmation

┌──────────┐                                              ┌──────────┐
│ Customer │                                              │  Email   │
│          │                                              │ Service  │
└────┬─────┘                                              └────▲─────┘
     │                                                         │
     │ Order request                                           │ Order confirmation
     │ (items, address, payment)                               │ (orderId, summary)
     ▼                                                         │
┌─────────────────┐   Validated order    ┌─────────────────┐  │
│  1.0 Validate   │─────────────────────►│ 2.0 Process     │──┘
│     Order       │                      │     Payment     │
└────────┬────────┘                      └────────┬────────┘
         │                                        │
         │ Validation result                      │ Payment result
         │                                        │
         ▼                                        ▼
    ┌─────────────┐                         ┌─────────────┐
    ║ Validation  ║                         ║  Payment    ║
    ║    Log      ║                         ║  Records    ║
    └─────────────┘                         └─────────────┘
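
The four elements map naturally onto a small data model, which makes rules such as "flows must always be labeled" checkable by a script. The sketch below is one possible encoding of the diagram above; the `Flow` class and `find_problems` helper are illustrative names, not part of any DFD tool.

```python
from dataclasses import dataclass

# Element kinds follow the four DFD elements (flows are modeled separately).
ENTITY, PROCESS, STORE = "external_entity", "process", "data_store"


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    label: str  # the data being transferred


# Hypothetical encoding of the order diagram above.
nodes = {
    "Customer": ENTITY,
    "Email Service": ENTITY,
    "1.0 Validate Order": PROCESS,
    "2.0 Process Payment": PROCESS,
    "Validation Log": STORE,
    "Payment Records": STORE,
}
flows = [
    Flow("Customer", "1.0 Validate Order", "Order request"),
    Flow("1.0 Validate Order", "2.0 Process Payment", "Validated order"),
    Flow("1.0 Validate Order", "Validation Log", "Validation result"),
    Flow("2.0 Process Payment", "Payment Records", "Payment result"),
    Flow("2.0 Process Payment", "Email Service", "Order confirmation"),
]


def find_problems(nodes, flows):
    """Return human-readable problems; an empty list means the checks pass."""
    problems = []
    for f in flows:
        if not f.label.strip():
            problems.append(f"unlabeled flow: {f.source} -> {f.target}")
        for end in (f.source, f.target):
            if end not in nodes:
                problems.append(f"unknown element: {end}")
    return problems


print(find_problems(nodes, flows))
```

Keeping the diagram as data also makes it easy to add further checks later, for example listing every flow that touches an external entity.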

Modern Adaptations

In practice, system designers often blend DFD notation with architecture diagram elements. The key is showing data transformation and movement clearly—the exact notation matters less than consistency and clarity.

Levels of Data Flow Diagrams

Like architecture diagrams with C4 levels, data flow diagrams can be decomposed into levels of increasing detail.

Level 0: Context Diagram

The highest level shows the entire system as a single process, with all external entities and the data flows between them.

Purpose: Establish system scope and external interfaces

Content:

One central circle representing the entire system
All external entities that interact with the system
All data flows crossing the system boundary

Questions answered:

Who sends data to the system?
Who receives data from the system?
What data crosses the boundary?

Level 1: System Diagram

Decomposes the Level 0 system into major processes, showing how data flows between them.

Purpose: Show major processing stages and internal data stores

Content:

Multiple processes representing major functions
Data stores used by those processes
Data flows between processes and to/from external entities

Guideline: 5-9 processes typically (cognitive limit for comprehension)

Level 2+: Process Decomposition

Each Level 1 process can be further decomposed to show detailed processing steps.

Purpose: Detailed understanding of specific processes

Content:

Sub-processes within a parent process
Detailed transformations and validations
Internal data flows and temporary storage

When to go deeper: When a process is too complex to understand as a single box

DFD Level Comparison
Level	Focus	Audience	Typical Element Count
0 (Context)	System boundary	Everyone	1 process, 3-5 external entities
1 (System)	Major functions	Architects, leads	5-9 processes, 2-4 data stores
2+ (Detail)	Specific processes	Implementers	3-7 sub-processes per parent

Balancing Rule

When decomposing a process, all data flows into and out of the parent process must appear in the child diagram. This 'balancing' ensures that decomposition is consistent—no data appears or disappears when zooming in.
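
The balancing rule can be checked mechanically. The sketch below compares the labeled flows crossing the parent's boundary with those crossing the boundary of its child diagram. The sub-processes `1.1 Check Format` and `1.2 Check Business Rules` are hypothetical names invented for this illustration.

```python
def boundary_flows(flows, inside):
    """Flows that cross the boundary of `inside`, as (direction, label) pairs."""
    crossing = set()
    for source, target, label in flows:
        if source in inside and target not in inside:
            crossing.add(("out", label))
        elif target in inside and source not in inside:
            crossing.add(("in", label))
    return crossing


def is_balanced(parent_flows, parent, child_flows, children):
    """True when the child diagram has the same boundary flows as the parent."""
    return boundary_flows(parent_flows, {parent}) == boundary_flows(
        child_flows, set(children)
    )


parent_flows = [
    ("Customer", "1.0 Validate Order", "Order request"),
    ("1.0 Validate Order", "2.0 Process Payment", "Validated order"),
    ("1.0 Validate Order", "Validation Log", "Validation result"),
]
# Hypothetical decomposition of process 1.0 into two sub-processes.
children = ["1.1 Check Format", "1.2 Check Business Rules"]
child_flows = [
    ("Customer", "1.1 Check Format", "Order request"),
    ("1.1 Check Format", "1.2 Check Business Rules", "Well-formed order"),
    ("1.2 Check Business Rules", "2.0 Process Payment", "Validated order"),
    ("1.2 Check Business Rules", "Validation Log", "Validation result"),
]

balanced = is_balanced(parent_flows, "1.0 Validate Order", child_flows, children)
print(balanced)
```

The internal flow `Well-formed order` does not affect the result because it never crosses the boundary; removing the `Validation result` flow from the child diagram would make the check fail.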

Modeling Synchronous Data Flows

Synchronous flows represent request-response patterns where the caller waits for completion. These are the most common patterns in user-facing operations.

Characteristics of Synchronous Flows

Request waits for response: Data flows forward, response flows back
Latency accumulates: Total latency = sum of all serial steps
Failures propagate: Error at any step fails the entire flow
Atomicity often expected: Caller expects all-or-nothing semantics

Representing Synchronous Flows

Show both request and response paths explicitly:

Request arrow: Data flowing toward processing
Response arrow: Results flowing back to caller
Label both: Different data travels each direction

┌────────────┐
│  Customer  │
└─────┬──────┘
      │ ① Order Request
      │    (items, address)
      ▼
┌─────────────────┐    ② Inventory check    ┌─────────────────┐
│   Order API     │───────────────────────►│  Inventory Svc  │
│                 │◄───────────────────────│                 │
└─────────┬───────┘    ③ Available items   └────────┬────────┘
          │                                          │ read
          │ ④ Validated order                        ▼
          │    (priced items)               ┌─────────────────┐
          ▼                                 ║   Inventory DB  ║
┌─────────────────┐                         └─────────────────┘
│  Payment Svc    │
└─────────┬───────┘
          │ ⑤ Payment request
          │    (amount, card token)
          ▼
┌─────────────────┐    ⑥ Charge     ┌─────────────────┐
│  Payment        │───────────────►│  Stripe API     │
│  Processor      │◄───────────────│  (external)     │
└─────────┬───────┘   ⑦ Success    └─────────────────┘
          │
          │ ⑧ Payment confirmed
          ▼
┌─────────────────┐
│  Order API      │──────┐ ⑨ Create order
└─────────────────┘      │
                         ▼
                  ┌─────────────────┐
                  ║   Orders DB     ║
                  └─────────────────┘
                         │
     ⑩ Order confirmation│
        (orderId, ETA)   ▼
                  ┌─────────────┐
                  │  Customer   │
                  └─────────────┘

Key Observations in Synchronous Flows

Latency analysis: With numbers on each step, calculate total latency:

If each internal step = 20ms
External Stripe call = 200ms
Database writes = 30ms each
Total ≈ 300-400ms end-to-end
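
The estimate above is simple addition over the serial steps. As a sketch, the budget below assigns the stated figures to the flow in the diagram; the split into seven internal steps and two database operations (one read and one write, both assumed at 30 ms) is an interpretation of the diagram, not a measurement.

```python
# (step, latency in ms). Step names are shorthand for the numbered flow above.
steps = [
    ("receive order request", 20),
    ("inventory check call", 20),
    ("inventory DB read", 30),
    ("inventory response", 20),
    ("hand validated order to payment", 20),
    ("payment request", 20),
    ("Stripe charge (external)", 200),
    ("payment confirmed", 20),
    ("orders DB write", 30),
    ("order confirmation", 20),
]

# Every step is serial, so the end-to-end latency is the plain sum.
total_ms = sum(ms for _, ms in steps)
slowest = max(steps, key=lambda step: step[1])
share = slowest[1] / total_ms

print(f"total: {total_ms} ms")
print(f"slowest: {slowest[0]} ({share:.0%} of total)")
```

Under these assumptions the external call accounts for half the budget, which is why it is usually the first candidate for moving off the synchronous path.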

Failure points: Each synchronous call is a potential failure. In this flow:

Inventory unavailable → Order fails
Payment declined → Order fails
Database write fails → Inconsistent state (payment taken, no order)

Opportunity for optimization:

Can the inventory check and price calculation run in parallel?
Should payment confirmation be async (accept order, confirm payment later)?

Synchronous Chain Risk

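Long synchronous chains are fragile. If you're showing more than 4-5 synchronous hops in a user-facing flow, consider whether some steps could be asynchronous. Each hop compounds the probability of timeout or failure: the chain succeeds only if every hop succeeds, so end-to-end availability is the product of the per-hop availabilities.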
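
To put rough numbers on this risk: if hops fail independently (an assumption; real failures are often correlated), the chain succeeds only when every hop succeeds. The 99.9% per-hop figure below is an illustrative value, not taken from any real system.

```python
def chain_availability(per_hop: float, hops: int) -> float:
    """Probability that all hops succeed, assuming independent failures."""
    return per_hop ** hops


for hops in (1, 3, 5, 10):
    failure = 1 - chain_availability(0.999, hops)
    print(f"{hops:>2} hops -> {failure:.2%} of requests fail")
```

With these inputs, five hops fail roughly five times as often as one, and the gap widens quickly as per-hop reliability drops.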

Modeling Asynchronous Data Flows

Asynchronous flows decouple producers from consumers, allowing independent processing and improved resilience.

Characteristics of Asynchronous Flows

Fire and forget: Producer continues without waiting
Queue/bus intermediary: Messages persist in transit
Eventually consistent: Consumers process when able
Retry-able: Failed processing can be retried
Scalable: Producers and consumers scale independently

Representing Asynchronous Flows

Asynchronous flows should clearly show the decoupling:

Dashed arrows: Distinguish from synchronous solid arrows
Intermediate storage: Show the queue/topic explicitly
Separate concerns: Producer publishes → queue stores → consumer processes

SYNCHRONOUS PORTION (customer-facing):
═══════════════════════════════════════
 
┌────────────┐  Order Request   ┌─────────────────┐
│  Customer  │─────────────────►│   Order API     │
└────────────┘                  └────────┬────────┘
                                         │ Create order (PENDING)
       ┌─────────────────────────────────┤
       │                                 ▼
       │                         ┌─────────────┐
       │                         ║  Orders DB  ║
       │                         └─────────────┘
       │
       │ Ack (orderId)
       ▼
┌────────────┐
│  Customer  │  ← Response: "Order received, processing..."
└────────────┘
 
ASYNCHRONOUS PORTION (background):
═══════════════════════════════════════
 
┌─────────────────┐                ┌─────────────────┐
│   Order API     │                │  Fulfillment    │
└────────┬────────┘                │     Worker      │
         │                         └────────▲────────┘
         │ OrderCreated event              │ consume
         │ (orderId, items)                │
         ▼                                 │
┌═══════════════════════════════════════════════════════┐
║                   ORDER EVENTS TOPIC                  ║
║  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐                     ║
║  │msg 1│ │msg 2│ │msg 3│ │msg 4│ ...                 ║
║  └─────┘ └─────┘ └─────┘ └─────┘                     ║
└═══════════════════════════════════════════════════════┘
                                       │
                                       │ consume
                                       ▼
              ┌────────────────────────┬────────────────────────┐
              │                        │                        │
              ▼                        ▼                        ▼
    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
    │  Inventory Svc  │    │  Payment Svc    │    │ Notification    │
    │  (reserve)      │    │  (charge)       │    │ Service         │
    └─────────────────┘    └─────────────────┘    └─────────────────┘

Patterns in Asynchronous Flows

Fan-out: One event triggers multiple consumers

OrderCreated → [Inventory, Payment, Notification, Analytics]
Each consumer processes independently

Saga/Choreography: Sequence of events forming a workflow

OrderCreated → InventoryReserved → PaymentCharged → ShipmentCreated
Each step publishes event triggering next step

CQRS (Command Query Responsibility Segregation):

Write path: Commands → Event store
Read path: Events → Projections → Query APIs
Separate flows for writes (commands) and reads (queries)

Event Sourcing:

All state changes stored as events
Current state = replay of all events
Shows data flowing into event store, then to projections
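
The event sourcing idea that "current state = replay of all events" can be written as a fold over the event list. This is a minimal sketch; the event names follow the saga example above, while the payload fields and the `apply_event` function are assumptions made for illustration.

```python
from functools import reduce


def apply_event(state, event):
    """Return the new state after one event; the input state is not mutated."""
    kind, data = event
    if kind == "OrderCreated":
        return {"status": "PENDING", "items": data["items"]}
    if kind == "PaymentCharged":
        return {**state, "status": "PAID", "paymentId": data["paymentId"]}
    if kind == "ShipmentCreated":
        return {**state, "status": "SHIPPED"}
    return state  # unknown event types are ignored


# Hypothetical event store contents for one order.
events = [
    ("OrderCreated", {"items": ["book"]}),
    ("PaymentCharged", {"paymentId": "pay_1"}),
    ("ShipmentCreated", {}),
]

current = reduce(apply_event, events, {})
as_of_payment = reduce(apply_event, events[:2], {})

print(current)
print(as_of_payment)
```

Replaying a prefix of the log reconstructs the state at an earlier point, which is the property that makes projections rebuildable.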

Show the Queue Explicitly

Don't draw a dashed arrow directly between producer and consumer. Always show the intermediate queue or topic—it's where messages live during transit and it's a critical component for reliability, ordering, and replay.
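
A toy model shows why the topic deserves its own box. In the sketch below, the `Topic` class is an in-memory stand-in invented for this example (not a real broker API): it stores messages and tracks a read position per consumer, which is what allows fan-out and late consumers.

```python
from collections import defaultdict


class Topic:
    """In-memory stand-in for a topic with per-consumer read positions."""

    def __init__(self):
        self.log = []                    # messages persist here while in transit
        self.offsets = defaultdict(int)  # consumer name -> next index to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, consumer):
        """Return messages this consumer has not seen yet and advance its offset."""
        start = self.offsets[consumer]
        self.offsets[consumer] = len(self.log)
        return self.log[start:]


topic = Topic()
topic.publish({"type": "OrderCreated", "orderId": 1})
inventory_batch = topic.poll("inventory")   # inventory keeps up
topic.publish({"type": "OrderCreated", "orderId": 2})
payment_batch = topic.poll("payment")       # payment starts late, misses nothing

print([m["orderId"] for m in inventory_batch])
print([m["orderId"] for m in payment_batch])
```

The producer never references a consumer, and each consumer progresses at its own pace; both properties live in the intermediate store rather than in either endpoint.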

Hybrid Flow Patterns

Real systems combine synchronous and asynchronous patterns. Understanding common hybrid patterns helps you design and diagram effectively.

Pattern 1: Sync Request, Async Processing

Use case: Return acknowledgment quickly, process in background

Flow:

✅ Client → API: Synchronous request
✅ API → Client: Immediate acknowledgment (ID, status: 'processing')
⏳ API → Queue: Publish task for processing
⏳ Queue → Worker: Consume and process
📫 Worker → Client: Notification when complete (push, email)

Examples: File upload processing, report generation, bulk operations
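
Pattern 1 can be sketched with the standard library alone. Here `queue.Queue` stands in for a real message queue, and the `handle_request` and `worker` functions and the in-memory `status` table are hypothetical names for this illustration. The final notification step is reduced to a status update.

```python
import queue
import threading
import uuid

tasks = queue.Queue()  # stand-in for a durable queue
status = {}            # stand-in for a task status table


def handle_request(payload: bytes) -> dict:
    """Synchronous part: record the task, enqueue it, acknowledge immediately."""
    task_id = str(uuid.uuid4())
    status[task_id] = "processing"
    tasks.put((task_id, payload))
    return {"id": task_id, "status": "processing"}


def worker():
    """Asynchronous part: consume tasks until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        task_id, payload = item
        status[task_id] = f"done: {len(payload)} bytes"
        tasks.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

ack = handle_request(b"report data")  # returns before processing finishes
tasks.join()                          # only for the demo: wait for the worker
final_status = status[ack["id"]]

tasks.put(None)                       # stop the worker
thread.join()
print(ack["status"], "->", final_status)
```

The acknowledgment carries the task ID so the client can poll for status or correlate a later notification.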

Pattern 2: Backend for Frontend (BFF)

Use case: Aggregate multiple backend calls for one client request

Flow:

Client → BFF: One request
BFF → [Service A, Service B, Service C]: Parallel calls
BFF: Aggregate responses
BFF → Client: Unified response

Diagram: Show fan-out from BFF to services, then aggregation
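
The BFF flow translates directly into concurrent calls followed by aggregation. The sketch below uses `asyncio.gather`; the three service functions and their return shapes are invented placeholders, with `asyncio.sleep` simulating network latency.

```python
import asyncio


# Hypothetical backend calls; the sleeps stand in for network round trips.
async def fetch_profile(user_id):
    await asyncio.sleep(0.01)
    return {"name": "Ada"}


async def fetch_orders(user_id):
    await asyncio.sleep(0.01)
    return [{"orderId": 1}, {"orderId": 2}]


async def fetch_recommendations(user_id):
    await asyncio.sleep(0.01)
    return ["keyboard", "monitor", "desk"]


async def bff_dashboard(user_id):
    """One client request fans out to three services, then aggregates."""
    profile, orders, recs = await asyncio.gather(
        fetch_profile(user_id),
        fetch_orders(user_id),
        fetch_recommendations(user_id),
    )
    return {
        "user": profile["name"],
        "orderCount": len(orders),
        "recommended": recs[:2],
    }


result = asyncio.run(bff_dashboard("u1"))
print(result)
```

Because the calls run concurrently, the BFF's latency tracks the slowest backend rather than the sum of all three.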

Pattern 3: Event-Driven with Sync Fallback

Use case: Prefer events but need sync for consistency

Flow:

Command arrives at Service A
Service A persists change locally
Service A publishes event asynchronously
If immediate consistency needed: Service A calls Service B synchronously
Otherwise: Service B eventually receives the event

Diagram: Show both paths—sync line and async event flow

Pattern 4: Saga with Compensations

Use case: Distributed transaction alternative

Flow:

Start → Step 1 (reserve inventory)
Step 1 success → Step 2 (charge payment)
Step 2 failure → Compensate Step 1 (release inventory)
Step 2 success → Step 3 (create shipment)

Diagram: Show happy path forward, compensation path backward (often in different color/style)

HAPPY PATH (solid arrows):
──────────────────────────
 
     ┌─────────┐     ┌───────────────┐     ┌───────────────┐     ┌────────────┐
     │ Client  │────►│ 1. Reserve    │────►│ 2. Charge     │────►│ 3. Ship    │
     └─────────┘     │    Inventory  │     │    Payment    │     └────────────┘
                     └───────────────┘     └───────────────┘
 
COMPENSATION PATH (dashed arrows):
──────────────────────────────────
 
                     ┌───────────────┐     ┌───────────────┐
                     │ 1c. Release   │◄────│ 2c. Refund    │
                     │    Inventory  │     │    Payment    │
                     └───────────────┘     └───────────────┘
                           ▲                     ▲
                           │                     │
                     failure at step 2    failure at step 3
 
 
COMBINED VIEW:
─────────────
 
                              ┌─────────────┐
                              │   Client    │
                              └──────┬──────┘
                                     │ Start order
                                     ▼
                            ┌─────────────────┐
                            │  1. Reserve     │ ←──────────────────┐
                            │    Inventory    │                    │
                            └────────┬────────┘                    │
                          success    │                             │
                                     ▼               ┌─────────────────────┐
                            ┌─────────────────┐      │ 1c. Release         │
                            │  2. Charge      │ ────►│ 1c. Release         │
                            │    Payment      │ fail └─────────────────────┘
                            └────────┬────────┘
                          success    │
                                     ▼               ┌─────────────────────┐
                            ┌─────────────────┐      │ 2c. Refund Payment  │
                            │  3. Create      │ ────►│ 2c. Refund Payment  │
                            │    Shipment     │ fail └─────────────────────┘
                            └────────┬────────┘
                          success    │
                                     ▼
                            ┌─────────────────┐
                            │  Order Complete │
                            └─────────────────┘
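
The forward and compensation paths in the diagram can be sketched as a small orchestrator that runs steps in order and, on failure, undoes the completed steps in reverse. The `run_saga` function and the in-memory `state` dictionary are illustrative; a real saga would also persist its progress so it can resume after a crash.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; on failure undo in reverse order."""
    log, completed = [], []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
        log.append(f"{name}: ok")
        completed.append((name, compensate))
    return True, log


# Hypothetical in-memory stand-ins for the three services.
state = {"reserved": 0, "charged": 0}

def reserve(): state["reserved"] += 1
def release(): state["reserved"] -= 1
def charge(): state["charged"] += 1
def refund(): state["charged"] -= 1
def ship(): raise RuntimeError("carrier unavailable")

ok, log = run_saga([
    ("reserve inventory", reserve, release),
    ("charge payment", charge, refund),
    ("create shipment", ship, lambda: None),
])

for line in log:
    print(line)
```

In this run the shipment step fails, so the payment is refunded first and the inventory released second, matching the "2c" compensation box in the diagram.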

Identifying Data Transformation Points

A key insight from data flow analysis is understanding where and how data transforms. Each transformation is a potential source of bugs, latency, and complexity.

Types of Transformations

Validation: Checking data correctness

Input validation (format, range, required fields)
Business rule validation (inventory available, credit sufficient)
Transforms: raw input → validated input OR error

Enrichment: Adding information

Adding computed fields (subtotal from quantity × price)
Adding referenced data (user name from user ID)
Adding metadata (timestamps, request IDs)
Transforms: core data → enriched data

Format conversion: Changing representation

Protocol translation (REST → gRPC)
Serialization (object → JSON → bytes)
Schema mapping (v1 → v2)
Transforms: format A → format B

Aggregation: Combining multiple inputs

Joining data from multiple sources
Summarizing (count, sum, average)
Window operations (last 5 minutes of data)
Transforms: [input1, input2, ...] → aggregated output

Filtering: Selecting subsets

Removing irrelevant records
Applying access control (hide based on permissions)
Sampling for analytics
Transforms: all data → subset matching criteria

Normalization/Denormalization:

Normalization: Splitting into referenced entities
Denormalization: Embedding related data
Transforms: relational ↔ embedded representation
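
Several of these transformation types can be chained as plain functions, which makes each transformation point explicit and individually testable. The sketch below covers validation, enrichment, and format conversion; the order shape, the `prices` lookup, and the use of integer cents are assumptions for illustration.

```python
import json


def validate(order):
    """Validation: raw input -> validated input, or an error."""
    if not order.get("items"):
        raise ValueError("order has no items")
    for item in order["items"]:
        if item["quantity"] <= 0:
            raise ValueError(f"invalid quantity for {item['sku']}")
    return order


def enrich(order, prices):
    """Enrichment: add computed subtotals (quantity x price) and a total."""
    items = [
        {**item, "subtotal": item["quantity"] * prices[item["sku"]]}
        for item in order["items"]
    ]
    return {**order, "items": items, "total": sum(i["subtotal"] for i in items)}


def to_wire(order):
    """Format conversion: object -> JSON -> bytes."""
    return json.dumps(order, sort_keys=True).encode("utf-8")


prices = {"A": 250, "B": 100}  # hypothetical prices in cents
raw = {"items": [{"sku": "A", "quantity": 2}, {"sku": "B", "quantity": 3}]}

priced = enrich(validate(raw), prices)
payload = to_wire(priced)

print(priced["total"])
print(len(payload), "bytes on the wire")
```

Each function boundary here corresponds to a labeled arrow in a DFD: raw order, validated order, priced order, serialized payload.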

Transformation Point Analysis
Transformation	Latency Impact	Failure Risk	Consistency Risk
Validation	Low	Medium (rejection)	Low
Enrichment (local)	Low	Low	Medium (stale refs)
Enrichment (external call)	High	High	High
Format conversion	Low	Medium (schema issues)	Low
Aggregation (real-time)	Medium-High	Medium	Medium
Aggregation (batch)	Low (async)	Low	High (lag)

External Enrichment is Expensive

Every time your flow shows 'enrich by calling another service,' you're adding latency and a failure point. Consider: Can this data be cached? Can it be published via events instead of fetched? Can it be embedded at write time rather than joined at read time?

Data Flow Anti-Patterns

Certain data flow patterns indicate design problems. Recognizing these in your diagrams helps catch issues before implementation.

Anti-Pattern 1: The Omniscient Service

One service consumes data from many sources and becomes a bottleneck:

[A] ─┐
[B] ─┼─► [Central Service] ─► [Output]
[C] ─┤
[D] ─┘

Problems: Single point of failure, scaling bottleneck, unrelated changes affect all flows

Solution: Decompose by bounded context, let consumers pull what they need

Anti-Pattern 2: The Data Bouncer

Data flows through a service that adds no value—just passes it along:

[A] ─► [Proxy/Router] ─► [B]

Problems: Added latency with no benefit, unnecessary coupling, operational overhead

Solution: Direct communication where appropriate, or ensure the intermediate service adds genuine value (auth, transformation, rate limiting)

Anti-Pattern 3: Circular Data Flow

Data flows in cycles through services:

[A] ─► [B] ─► [C]
 ▲            │
 └────────────┘

Problems: Infinite loops possible, unclear source of truth, debugging nightmare

Solution: Identify the authoritative source, break cycle with events or clear hierarchy
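
Cycles are easy to miss in a large diagram but straightforward to detect once the flows are written down as a graph. The sketch below uses a depth-first search; the adjacency-dictionary input format and the `find_cycle` name are choices made for this example.

```python
def find_cycle(graph):
    """Return one cycle as a node list (start node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:  # back edge: nxt is already on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


circular = {"A": ["B"], "B": ["C"], "C": ["A"]}
linear = {"A": ["B"], "B": ["C"], "C": []}

print(find_cycle(circular))
print(find_cycle(linear))
```

Running a check like this over the service-to-service flows extracted from a diagram flags the anti-pattern before it reaches implementation.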

Anti-Pattern 4: Synchronous Fan-Out to Many

One request triggers synchronous calls to many downstream services:

           ┌─► [Svc1]
           ├─► [Svc2]
[Request] ─┼─► [Svc3]  ← waiting for all
           ├─► [Svc4]
           └─► [Svc5]

Problems: Latency = slowest service, any failure fails all, all-or-nothing semantics

Solution: Parallelize where possible, timeout aggressively, consider async for non-critical paths

Red Flags in Data Flows

• Any service appearing on more than 5 synchronous paths within a single request
• Data flowing through a service that doesn't transform it
• The same data being fetched multiple times in one request
• Cycles in the flow graph
• A single database being read or written directly by more than 3 services
• No queue or topic between async producers and consumers
• Critical user-facing flows with more than 4 serial hops

Case Study: E-commerce Checkout Data Flow

Let's model the complete data flow for an e-commerce checkout, combining synchronous and asynchronous patterns.

Scenario: Customer clicks 'Place Order' with cart items and payment info.

Level 0: Context View

External entities:

Customer (input: cart, payment, address; output: confirmation)
Payment Provider (input: charge request; output: result)
Shipping Provider (input: shipment request; output: tracking)
Email Service (input: notification request; output: delivery)

┌──────────────────────────────────────────────────────────────────────────────┐
│                           SYNCHRONOUS PHASE                                   │
│                           (Customer Waiting)                                  │
│                                                                              │
│  ┌──────────┐   Cart + Payment   ┌───────────────┐                          │
│  │ Customer │──────────────────►│ Checkout API  │                          │
│  └──────────┘                    └───────┬───────┘                          │
│                                          │                                   │
│              ┌───────────────────────────┼───────────────────────┐          │
│              │                           │                       │          │
│              ▼                           ▼                       ▼          │
│  ┌───────────────────┐    ┌───────────────────┐    ┌───────────────────┐   │
│  │ Validate Cart     │    │ Validate Address  │    │ Apply Promotions  │   │
│  │ (items, prices)   │    │ (delivery zone)   │    │ (discounts, tax)  │   │
│  └─────────┬─────────┘    └─────────┬─────────┘    └─────────┬─────────┘   │
│            │                        │                        │              │
│            └────────────────────────┴────────────────────────┘              │
│                                     │                                        │
│                                     ▼                                        │
│                          ┌───────────────────┐                              │
│                          │ Create Order      │──────┐                       │
│                          │ (status: PENDING) │      │ persist               │
│                          └─────────┬─────────┘      ▼                       │
│                                    │         ┌─────────────┐                │
│                                    │         ║ Orders DB   ║                │
│                          charge    │         └─────────────┘                │
│                          request   ▼                                         │
│                          ┌───────────────────┐                              │
│                          │ Payment Service   │                              │
│                          └─────────┬─────────┘                              │
│                                    │                                         │
│                                    ▼                                         │
│                          ┌───────────────────┐                              │
│                          │ Stripe API        │ ← external                   │
│                          └─────────┬─────────┘                              │
│                            success │                                         │
│                                    ▼                                         │
│                          ┌───────────────────┐                              │
│                          │ Update Order      │                              │
│                          │ (status: PAID)    │                              │
│                          └─────────┬─────────┘                              │
│                                    │                                         │
│                                    ▼                                         │
│                          ┌───────────────────┐                              │
│                          │ Return confirm    │──────────►┌──────────┐      │
│                          │ (orderId, ETA)    │           │ Customer │      │
│                          └─────────┬─────────┘           └──────────┘      │
│                                    │                                         │
└────────────────────────────────────┼─────────────────────────────────────────┘
                                     │ publish event
                                     ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                          ASYNCHRONOUS PHASE                                   │
│                          (Post-Checkout)                                      │
│                                                                              │
│  ╔═══════════════════════════════════════════════════════════════════════╗  │
│  ║                        ORDER_PAID TOPIC                                ║  │
│  ║  { orderId, items, customer, address, paymentId }                      ║  │
│  ╚═══════════════════════════════════════════════════╤════════════════════╝  │
│                                                      │                        │
│              ┌──────────────────┬────────────────────┼─────────────────┐     │
│              │                  │                    │                 │     │
│              ▼                  ▼                    ▼                 ▼     │
│  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌────────────┐ │
│  │ Inventory Svc   │ │ Fulfillment Svc │ │ Email Service   │ │ Analytics  │ │
│  │ (decrement)     │ │ (create pick)   │ │ (confirmation)  │ │ (event)    │ │
│  └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ └────────────┘ │
│           │                   │                   │                          │
│           ▼                   ▼                   ▼                          │
│  ╔══════════════╗   ╔══════════════╗     ┌───────────────┐                  │
│  ║ Inventory DB ║   ║ Warehouse DB ║     │ Mailgun API   │                  │
│  ╚══════════════╝   ╚══════════════╝     └───────────────┘                  │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

Flow Analysis

Synchronous phase (customer waiting, ~400-800ms):

Parallel validation saves time
Payment is the slowest step (external API)
Order created before payment confirms → rollback needed on payment failure

Asynchronous phase (background, seconds to minutes):

All consumers process independently
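Failures don't block the customer's response (the order is already confirmed)
Failed messages can be retried, provided consumers are idempotent; messages that keep failing need a dead-letter path rather than indefinite retries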

Transformation points:

Cart → Validated cart (validation)
Cart + promotions → Priced order (enrichment)
Order → Order with payment (state change)
OrderPaid event → Inventory decrement (projection)

Summary: Data Flow Diagrams

Data flow diagrams reveal how information moves and transforms through your system. Let's consolidate the key principles:

Key Takeaways

• DFDs complement architecture diagrams: Architecture shows structure; DFDs show information in motion.
• Four core elements: External entities, processes, data stores, and data flows form the vocabulary.
• Levels enable focus: Context (Level 0), System (Level 1), and Detail (Level 2+) serve different audiences.
• Distinguish sync and async: Synchronous flows block callers; asynchronous flows decouple them. Show queues explicitly.
• Hybrid patterns are common: Sync request with async processing, fan-out aggregation, and sagas combine both patterns.
• Transformation points are risk points: Each transformation adds potential for latency, failure, and inconsistency.
• Recognize anti-patterns: Omniscient services, data bouncers, circular flows, and heavy fan-out indicate design issues.

What's next:

With components identified, architecture diagrammed, and data flows traced, we turn to the interfaces between components: API design. The next page covers how to design APIs that are intuitive, consistent, and evolvable.

Page Complete

You now understand how to model and analyze data flows through distributed systems. This skill enables you to identify bottlenecks, failure points, and consistency risks before implementation—essential for designing reliable systems at scale.
