An ER diagram shows what exists in a data model—entities, attributes, relationships, constraints. But it cannot show why design decisions were made, what business rules govern data values, when elements were added or modified, or who is responsible for what.
This contextual knowledge lives in documentation.
Every organization has experienced the pain of undocumented systems.
Without documentation, every engineer who touches the system must rediscover this knowledge through archaeology—examining code, querying data, interviewing colleagues (if they're still around), or simply guessing. This process wastes time, introduces errors, and frustrates everyone involved.
Documentation is the difference between a data model that remains maintainable for decades and one that becomes a legacy burden within years.
By the end of this page, you will understand how to document ER diagrams comprehensively—including data dictionaries, attribute definitions, business rules, annotations, version history, and metadata standards. You will learn what to document, where to document it, and how to keep documentation accurate over time.
Documentation is often treated as an afterthought—something done (if at all) after the 'real work' of modeling and implementation. This mindset leads to sparse, outdated, or nonexistent documentation.
The True Cost of Undocumented Models
Consider the lifecycle of a data model:
Initial Development (Months 1-6): The design team knows everything implicitly. Documentation seems redundant.
Early Production (Months 6-18): Original team members leave or transition. New developers ask questions that 'everyone knew' the answers to.
Maturity (Years 2-5): The model is modified repeatedly. Nobody remembers original design rationale. Modifications introduce inconsistencies because engineers don't understand the implications.
Legacy Phase (Years 5+): The model is critical but feared. Nobody wants to touch it because understanding it requires too much effort. Technical debt accumulates.
Proper documentation prevents this decay. It preserves institutional knowledge across personnel changes and time.
| Activity | Without Documentation | With Documentation |
|---|---|---|
| New developer onboarding | Weeks of questions, shadowing, trial and error | Self-service learning; targeted questions only |
| Impact analysis for changes | Review all code; test extensively; discover surprises | Review documented dependencies; focused testing |
| Debugging data issues | Explore data to infer rules; guess at semantics | Reference business rules; verify against specification |
| Integration with new system | Extensive meetings to explain the model | Share documentation; answer specific questions |
| Audit and compliance | Scramble to explain data governance | Demonstrate documented lineage and rules |
The best time to document a design decision is when you make it. The knowledge is fresh, the rationale is clear, and the cost is minimal. Attempting to document retroactively is more expensive and less accurate.
The data dictionary is the central repository of metadata about a data model. It defines every entity, attribute, relationship, and constraint with sufficient detail for unambiguous interpretation.
Components of a Comprehensive Data Dictionary
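Entity Definitions: For each entity, document its definition, aliases, owner, source systems, and representative examples.
Attribute Definitions: For each attribute, document its definition, data type, nullability, domain of valid values, default value, and governing business rules.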
# Data Dictionary: Sales Domain

## Entity: Customer

| Property | Value |
|----------|-------|
| **Definition** | A person or organization that has purchased from our company or is actively being pursued as a sales prospect. |
| **Aliases** | Client, Account (in legacy systems) |
| **Owner** | Sales Operations Team |
| **Source** | CRM System, E-commerce Platform |
| **Examples** | "Acme Corporation", "John Smith" |

### Attributes

#### customer_id

| Property | Value |
|----------|-------|
| **Definition** | System-generated unique identifier for each customer |
| **Data Type** | INTEGER (Logical); BIGINT (Physical) |
| **Nullability** | NOT NULL |
| **Domain** | Positive integers, auto-incremented |
| **Business Rules** | Immutable once assigned; never reused |

#### customer_type

| Property | Value |
|----------|-------|
| **Definition** | Classification of customer as individual or business |
| **Data Type** | CHAR(1) |
| **Nullability** | NOT NULL |
| **Domain** | 'I' = Individual, 'B' = Business |
| **Default** | 'I' |
| **Business Rules** | Cannot change from 'B' to 'I' once orders exist |

#### credit_limit

| Property | Value |
|----------|-------|
| **Definition** | Maximum amount of credit extended to this customer |
| **Data Type** | DECIMAL(10,2) |
| **Nullability** | NULL (indicates no credit extended; cash only) |
| **Domain** | 0.00 to 999999.99 |
| **Default** | NULL for new customers |
| **Business Rules** | Requires credit approval process for values > 10000 |

Relationship Definitions
For each relationship, document:
Storage and Access
Data dictionaries can be stored in various formats:
| Format | Advantages | Disadvantages |
|---|---|---|
| Modeling tool metadata | Integrated with diagram; single source | Tool-specific; may be locked |
| Spreadsheet/CSV | Easy to edit; portable | Hard to enforce consistency |
| Wiki/Confluence | Searchable; linkable; accessible | Can drift from actual schema |
| Database catalog | Always synchronized with schema | Limited to what metadata is stored |
| Code comments | Close to implementation | Scattered; hard to aggregate |
Wherever you store your data dictionary, designate it as the authoritative source. If the diagram and dictionary disagree, one is wrong. Establish which is primary and keep the other synchronized.
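One way to keep a dictionary and a schema from drifting apart is to compare them mechanically. The following is a minimal sketch, assuming the dictionary is available as a Python mapping and the schema lives in SQLite; the `DICTIONARY` structure, the `find_drift` helper, and the `loyalty_tier` column are hypothetical names introduced only for illustration.

```python
import sqlite3

# Hypothetical data dictionary excerpt: column name -> (declared type, nullable).
DICTIONARY = {
    "customer_id": ("INTEGER", False),
    "customer_type": ("CHAR(1)", False),
    "credit_limit": ("DECIMAL(10,2)", True),
}


def find_drift(conn, table, dictionary):
    """Return sorted, human-readable differences between dictionary and table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so it must come from trusted input.
    actual = {
        row[1]: (row[2].upper(), not row[3])
        for row in conn.execute(f"PRAGMA table_info({table})")
    }
    problems = []
    for name, (dtype, nullable) in dictionary.items():
        if name not in actual:
            problems.append(f"{name}: documented but missing from schema")
        elif actual[name] != (dtype.upper(), nullable):
            problems.append(
                f"{name}: documented as {dtype.upper()} nullable={nullable}, "
                f"schema has {actual[name][0]} nullable={actual[name][1]}"
            )
    for name in actual.keys() - dictionary.keys():
        problems.append(f"{name}: in schema but undocumented")
    return sorted(problems)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER NOT NULL, "
    "customer_type CHAR(1) NOT NULL, "
    "credit_limit DECIMAL(10,2) NOT NULL, "  # deliberately disagrees with the dictionary
    "loyalty_tier TEXT)"                     # deliberately undocumented
)
problems = find_drift(conn, "customer", DICTIONARY)
for problem in problems:
    print(problem)
```

A check like this does not decide which side is correct; it only surfaces the disagreement so that whichever source you designated as primary can be used to fix the other.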
Business rules govern how data is created, modified, and related. While some rules are expressed in the schema (constraints, relationships), many cannot be captured declaratively and must be documented.
Categories of Business Rules
Structural Rules: Expressed in the schema
Derivation Rules: How values are calculated
Action Rules: What happens on events
Process Rules: Sequencing and workflow
Validation Rules: Complex conditionals
| Element | Description | Example |
|---|---|---|
| Rule ID | Unique identifier for reference | BR-CUS-003 |
| Rule Name | Short descriptive name | Credit Limit Increase Approval |
| Description | Full explanation of the rule | Any credit limit increase above 50% of current limit or above $50,000 requires manager approval |
| Category | Type of rule | Action Rule |
| Affected Entities | Entities governed by this rule | Customer, CreditApproval |
| Affected Attributes | Specific attributes involved | credit_limit |
| Implementation | How/where the rule is enforced | Application layer; workflow system |
| Exceptions | Any exceptions to the rule | System administrators can override with audit log |
| Owner | Who is responsible | Credit Department |
| Source | Where the rule originates | Credit Policy Document v2.3 |
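The template above can also be kept in machine-readable form next to the code that enforces it. This is a rough sketch, not a prescribed implementation: the `BusinessRule` class and `needs_manager_approval` function are hypothetical, and the function encodes one assumed reading of the example rule (the increase exceeds 50% of the current limit, or the new limit exceeds 50,000).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BusinessRule:
    """One documented rule, mirroring the template fields."""
    rule_id: str
    name: str
    category: str
    affected_entities: tuple
    implementation: str
    owner: str
    source: str


CREDIT_LIMIT_APPROVAL = BusinessRule(
    rule_id="BR-CUS-003",
    name="Credit Limit Increase Approval",
    category="Action Rule",
    affected_entities=("Customer", "CreditApproval"),
    implementation="Application layer; workflow system",
    owner="Credit Department",
    source="Credit Policy Document v2.3",
)


def needs_manager_approval(current: Decimal, proposed: Decimal) -> bool:
    """Assumed reading of BR-CUS-003; confirm the exact wording with the rule owner."""
    increase = proposed - current
    if increase <= 0:
        return False  # decreases and unchanged limits fall outside the rule
    return increase > current * Decimal("0.5") or proposed > Decimal("50000")


print(CREDIT_LIMIT_APPROVAL.rule_id,
      needs_manager_approval(Decimal("10000"), Decimal("16000")))
```

Keeping the rule ID in the code makes it possible to search from an enforcement point back to the documented rule and its owner.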
Business rules evolve with the business. Treat rule documentation as a living artifact. When rules change, update documentation immediately. Stale rule documentation is worse than no documentation—it misleads.
Rule-Entity Matrix
For complex models, maintain a matrix showing which rules affect which entities:
| Rule ID | Customer | Order | OrderLine | Product | Payment |
|---|---|---|---|---|---|
| BR-001 | ✓ | | | | |
| BR-002 | ✓ | ✓ | | | |
| BR-003 | ✓ | ✓ | | | |
| BR-004 | ✓ | | | | |
This matrix helps impact analysis: when modifying an entity, you can quickly identify all rules that might be affected.
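If the matrix is stored as data rather than only as a table, the impact lookup can be automated. The sketch below assumes a simple mapping from rule ID to entities; the assignments in `RULE_ENTITIES` are hypothetical and are not taken from the matrix above.

```python
from collections import defaultdict

# Hypothetical rule-to-entity assignments, for illustration only.
RULE_ENTITIES = {
    "BR-001": {"Customer"},
    "BR-002": {"Customer", "Order"},
    "BR-003": {"Order", "OrderLine"},
    "BR-004": {"Payment"},
}


def rules_by_entity(rule_entities):
    """Invert the matrix: entity name -> set of rule IDs that touch it."""
    index = defaultdict(set)
    for rule_id, entities in rule_entities.items():
        for entity in entities:
            index[entity].add(rule_id)
    return index


index = rules_by_entity(RULE_ENTITIES)
print(sorted(index["Order"]))  # rules to review before changing Order
```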
While data dictionaries provide comprehensive documentation, some information is best placed on the diagram itself as annotations. Annotations provide context without requiring separate document lookup.
Types of Diagram Annotations
Entity Notes: Inline explanations attached to entities
Relationship Notes: Clarifications on relationships
Attribute Notes: For attributes that need clarification
Annotation Callouts and Symbols
Use visual conventions to distinguish annotation types:
| Symbol/Style | Meaning | Example |
|---|---|---|
| Yellow sticky note | General comment/clarification | 'Historical data before 2015 migrated from legacy' |
| Red border | Warning/caution | 'Critical table: changes require DBA approval' |
| Blue info icon | See detailed documentation | 'See DD-CUS-003 for valid values' |
| Strikethrough | Deprecated/removed | '̶l̶e̶g̶a̶c̶y̶_̶c̶o̶d̶e̶ - Deprecated: use status_code' |
| Link icon | External reference | 'Integration spec: /docs/integrations/erp.md' |
Legend for Annotations
If your diagram uses annotation styles for semantic meaning, include a legend explaining them. Don't assume viewers know your conventions.
An annotation should be understandable within 15 seconds of reading. If it requires more, it belongs in a linked document. Annotations are signposts pointing to detailed information, not the information itself.
Data models evolve over time. Capturing what changed, when, why, and who made changes is essential for understanding the current model and managing future evolution.
Version Identification
Every diagram and documentation artifact should carry version information:
Versioning Scheme
| Change Type | Version Bump | Example |
|---|---|---|
| Major structural change (new subsystem, entity removal) | Major | 2.0.0 → 3.0.0 |
| Significant additions (new entities, relationships) | Minor | 2.3.0 → 2.4.0 |
| Minor changes (attribute additions, documentation fixes) | Patch | 2.3.1 → 2.3.2 |
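The scheme in the table can be expressed as a small helper. This is a sketch assuming three-part numeric versions; `bump_version` is a hypothetical name, not part of any modeling tool.

```python
def bump_version(version: str, change_type: str) -> str:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change_type == "major":
        return f"{major + 1}.0.0"
    if change_type == "minor":
        return f"{major}.{minor + 1}.0"
    if change_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change_type!r}")


print(bump_version("2.3.1", "patch"))  # 2.3.2
```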
Change Log
Maintain a structured change log documenting every modification:
# Change Log: Sales Domain Model
## Version 3.1.0 (2024-01-15)
- Added: Customer.preferred_language attribute (REQ-2024-023)
- Added: Relationship Customer-ShippingPreference (REQ-2024-023)
- Modified: Order.status_code expanded to support 'ON_HOLD' (BUG-2023-445)
- Author: J. Smith
- Reviewed by: A. Johnson
## Version 3.0.0 (2023-09-01)
- Added: PaymentMethod entity (REQ-2023-089)
- Added: Subscription entity (REQ-2023-102)
- Removed: LegacyCustomer entity (deprecated in v2.5)
- Breaking: Customer.payment_type moved to PaymentMethod relationship
- Author: M. Garcia
- Reviewed by: K. Williams
Every change should trace to a requirement, bug fix, or documented decision. This traceability answers the question future engineers will ask: 'Why was this added?' A reference such as REQ-2024-023 points them to the answer.
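Traceability can be checked automatically when the change log follows a regular layout. The sketch below assumes the layout of the sample log and assumes references look like `(REQ-2024-023)` or `(BUG-2023-445)`; the `untraced_changes` helper and both patterns are illustrative, so adjust them to your own ID scheme.

```python
import re

CHANGE_LOG = """\
## Version 3.1.0 (2024-01-15)
- Added: Customer.preferred_language attribute (REQ-2024-023)
- Modified: Order.status_code expanded to support 'ON_HOLD' (BUG-2023-445)
- Author: J. Smith
## Version 3.0.0 (2023-09-01)
- Added: PaymentMethod entity (REQ-2023-089)
- Breaking: Customer.payment_type moved to PaymentMethod relationship
"""

# Lines that describe a model change (Author/Reviewed-by lines are ignored).
CHANGE_LINE = re.compile(r"^- (Added|Modified|Removed|Breaking): (.+)$")
# Assumed reference format: (REQ-YYYY-NNN) or (BUG-YYYY-NNN).
REFERENCE = re.compile(r"\((REQ|BUG)-\d{4}-\d+\)")


def untraced_changes(log_text):
    """Return (version, description) pairs for changes that cite no reference."""
    version = None
    missing = []
    for line in log_text.splitlines():
        if line.startswith("## Version "):
            version = line.split()[2]
            continue
        match = CHANGE_LINE.match(line)
        if match and not REFERENCE.search(line):
            missing.append((version, match.group(2)))
    return missing


for version, description in untraced_changes(CHANGE_LOG):
    print(f"{version}: no reference for '{description}'")
```

In the sample, the 'Breaking' entry in version 3.0.0 is the one flagged, which is the kind of gap a reviewer would want closed before the release is tagged.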
Preserving Historical Versions
Don't overwrite previous versions—preserve them:
Versioned file names: sales-model-v3.1.0.erdm, sales-model-v3.0.0.erdm
Separate directories: /models/current/, /models/archive/
Why This Matters:
Deprecated Element Handling
When elements are scheduled for removal:
@deprecated Since version 2.5.0
Scheduled for removal in version 3.0.0
Replacement: Use Customer.email_primary instead
Migration: Run script migrate_email_fields.sql
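Deprecation notes are easier to honor when something flags elements that have outlived their removal version. This is a minimal sketch assuming three-part numeric versions; the `Deprecation` record, the `overdue` helper, and the element name `Customer.email` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deprecation:
    element: str      # hypothetical name of the deprecated element
    since: str        # version in which it was deprecated
    remove_in: str    # version in which removal is scheduled
    replacement: str  # what to use instead


def parse_version(text):
    """Turn '3.0.0' into (3, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def overdue(deprecations, current_version):
    """Return deprecations whose scheduled removal version has been reached."""
    current = parse_version(current_version)
    return [d for d in deprecations if parse_version(d.remove_in) <= current]


notes = [Deprecation("Customer.email", "2.5.0", "3.0.0", "Customer.email_primary")]
for item in overdue(notes, "3.1.0"):
    print(f"{item.element} should be gone; use {item.replacement}")
```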
Metadata is data about data—the descriptions, definitions, and contextual information that give meaning to your data model. Effective metadata management integrates documentation with tools and processes.
Metadata Categories
Metadata Standards
Organizations benefit from adopting or creating metadata standards:
| Standard | Scope | Use Case |
|---|---|---|
| ISO 11179 | Metadata registry | Enterprise-wide definitions |
| Dublin Core | General metadata | Document and resource description |
| DCMI Terms | Extended Dublin Core | Richer description vocabulary |
| Schema.org | Web-focused | Publishing structured data |
| Custom Enterprise | Organization-specific | When no standard fits |
Metadata Integration Points
Metadata should flow between systems, not exist in silos:
Data Catalog Integration
Modern data catalogs (Alation, Collibra, Apache Atlas, AWS Glue Data Catalog) provide:
If your organization uses a data catalog, integrate your ER model documentation:
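# Example: push ER model metadata to a data catalog API.
# `er_model` and `catalog_api` are placeholders for your modeling tool's export
# and your catalog's client; the method names are illustrative, not a real SDK.
for entity in er_model.entities:
    # Table-level metadata: definition, owner, and domain tags
    catalog_api.update_table(
        name=entity.name,
        description=entity.definition,
        owner=entity.owner,
        tags=entity.domain_tags,
    )
    # Column-level metadata for each attribute of this entity
    for attr in entity.attributes:
        catalog_api.update_column(
            table=entity.name,
            column=attr.name,
            description=attr.definition,
            data_type=attr.data_type,
        )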
Manual metadata synchronization between systems will eventually fail. Invest in automation—scripts, APIs, or integration tools—that keep metadata consistent. The ideal is a single authoritative source with automatic propagation to consuming systems.
Documentation is only valuable if it's accurate and current. Documentation governance establishes processes to ensure quality.
Roles and Responsibilities
| Role | Responsibilities | Typical Holder |
|---|---|---|
| Data Steward | Defines business meaning; validates accuracy | Business analyst or domain expert |
| Data Custodian | Maintains technical accuracy; updates schema docs | DBA or data engineer |
| Data Owner | Accountable for data quality and documentation | Department manager or executive |
| Documentation Lead | Enforces standards; coordinates updates | Data management team member |
Review Processes
On Creation: All new entities, attributes, and relationships require documentation as part of the design review. No undocumented element is approved.
On Modification: Changes require updated documentation. Each pull request or change request includes the documentation diff.
Periodic Review: Schedule regular audits (quarterly, semi-annually) to:
Usage-Triggered Review: When questions arise that documentation should answer but doesn't, flag for documentation improvement.
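The "on creation" and "on modification" gates can be backed by an automated completeness check in the review pipeline. The sketch below assumes the model is exported as nested dictionaries; the structure and the `undocumented` helper are hypothetical.

```python
def undocumented(model):
    """List entities and attributes whose definition is missing or blank.

    `model` maps entity name -> {"definition": str, "attributes": {name: definition}}.
    """
    gaps = []
    for entity, info in model.items():
        if not (info.get("definition") or "").strip():
            gaps.append(entity)
        for attr, definition in info.get("attributes", {}).items():
            if not (definition or "").strip():
                gaps.append(f"{entity}.{attr}")
    return gaps


model = {
    "Customer": {
        "definition": "A person or organization that has purchased from us.",
        "attributes": {"customer_id": "System-generated unique identifier", "credit_limit": ""},
    },
    "Order": {"definition": "", "attributes": {}},
}
print(undocumented(model))  # a non-empty list would fail the review gate
```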
Just like technical debt, documentation debt accumulates when shortcuts are taken. 'We'll document it later' becomes never. Treat documentation as a first-class deliverable, not an afterthought.
Documentation transforms an ER diagram from a snapshot of structure into a living resource that supports understanding, maintenance, and evolution over time. The investment in comprehensive documentation pays dividends across the entire system lifecycle.
What's Next
Even experienced modelers make mistakes. The next page examines common mistakes in ER diagram creation—patterns of error that lead to flawed models, miscommunication, and costly rework. Recognizing these patterns helps you avoid them.
You now understand the principles and practices of documenting ER diagrams comprehensively. Documentation is the memory that enables your data model to remain useful, accurate, and maintainable for years to come.