Er Diagram Best Practices - Learning Module

Loading content...

0/241

Common Mistakes

Learning from the Mistakes of Others

The most efficient way to become an expert is to learn from mistakes—preferably other people's mistakes. In ER modeling, certain errors appear repeatedly across projects, organizations, and decades. These are not random accidents; they are predictable patterns arising from common misunderstandings, shortcuts, and blind spots.

Recognizing these patterns before you make them transforms your modeling practice. Instead of discovering through painful experience that 'this approach leads to problems,' you can identify and avoid the anti-pattern from the start.

The Cost of Modeling Mistakes

ER diagram mistakes don't stay localized. They propagate:

Conceptual mistake in the ER diagram leads to...
Structural flaw in the physical schema, which causes...
Application complexity as developers work around the flaw, resulting in...
Performance issues as workarounds prove inefficient, leading to...
Expensive refactoring when the problem can no longer be ignored

Catching a mistake at the diagram stage costs minutes. Fixing it after years of production data costs months.

What You Will Learn

By the end of this page, you will recognize common ER modeling mistakes across multiple categories: entity design errors, relationship mistakes, attribute problems, notation misuse, and conceptual misunderstandings. For each, you will understand why it's wrong and how to fix it.

Entity Design Mistakes

Entities are the foundation of the ER model. Mistakes here cascade through the entire design.

Mistake 1: Creating Entities for Properties

Problem: Modeling what should be an attribute as a separate entity.

WRONG:                          CORRECT:
[Customer]                      [Customer]
    |                               - name
    |---[has]---[CustomerName]      - email
    |                               - phone
    |---[has]---[CustomerEmail]

Why it's wrong: Creates unnecessary complexity. Every simple property lookup requires a join. The 'entities' have no independent existence or attributes of their own.

Exception: When a property has its own attributes (PhoneNumber with type, extension, country_code) or participates in multiple relationships, it may warrant entity status.

Mistake 2: God Entities (Too Many Attributes)

Problem: Creating entities with 50, 100, or more attributes that encompass multiple concepts.

Why it's wrong: Often indicates missed entity decomposition. Different attribute groups may have different lifecycles, optionality, or relationship patterns.

Fix: Analyze attribute groupings. If distinct clusters emerge (e.g., customer contact info vs. customer preferences vs. customer billing), consider entity decomposition or at least separate entity subtypes.

Entity Design Mistakes Summary
Mistake	Symptom	Fix
Properties as entities	Single-attribute entity; always 1:1 with parent	Merge as attribute of parent entity
God entity	50+ attributes; diverse concepts mixed	Decompose into cohesive entities
Missing entity	Repeated column groups across tables	Factor out as new entity
Duplicate entities	Same concept with different names	Consolidate; create synonym reference
Entity without identity	No natural or surrogate key; unclear uniqueness	Define clear identifying attributes

Mistake 3: Missing Entities (Hidden in Attributes)

Problem: Embedding what should be an entity as repeated attribute groups.

WRONG:                              CORRECT:
[Order]                             [Order]
    - ship_street                       |---[ships_to]---[Address]
    - ship_city                         |---[bills_to]---[Address]
    - ship_zip
    - bill_street
    - bill_city
    - bill_zip

Why it's wrong: Duplicates structure; no reuse. 'Address' is a thing with its own attributes and potential relationships.

Mistake 4: Confusion Between Entity Type and Instance

Problem: Creating an entity for a specific instance rather than a type.

WRONG:                              CORRECT:
[JohnSmithsAddress]                 [Address]

Why it's wrong: The model should represent types (Address), not specific instances (John Smith's address). Instances are data, not schema.

The Entity Litmus Test

Ask: 'Will there be multiple instances of this thing that we need to track separately?' If yes, it's likely an entity. Also ask: 'Does this thing have its own attributes beyond just a value?' If yes, entity. If it's just a value with no substructure, it's probably an attribute.

Relationship Mistakes

Relationships capture how entities interact. Mistakes here lead to incorrect cardinality enforcement, missing data associations, or unnecessary complexity.

Mistake 1: Wrong Cardinality

Problem: Specifying 1:1 when the relationship is actually 1:N, or vice versa.

Example: Modeling Customer-Order as 1:1 when customers can place multiple orders.

Consequences: Either the schema prevents valid data (can't store second order) or the model misleads readers about the actual structure.

Fix: Validate cardinality with domain experts: 'Can a customer have multiple orders?' 'Must every order have a customer?'

Mistake 2: Missing Relationships

Problem: Entities exist 'in isolation' when business reality connects them.

Symptom: To answer 'Which products did this customer order?', you need to join through Order and OrderLine, but no explicit relationship path is shown.

Fix: Walk through business queries during design. 'How do I find X related to Y?' If there's no path, a relationship is missing.

Mistake 3: Redundant Relationships

Problem: Multiple relationship paths express the same association.

WRONG:
[Customer]---[places]---[Order]
[Customer]-------[ordered_product]-------[Product]  (redundant)
[Order]---[contains]---[OrderLine]---[for]---[Product]

If Customer-Product association is always derivable from Customer→Order→OrderLine→Product, the direct Customer-Product relationship is redundant unless there's independent meaning.

Relationship Anti-Patterns

•1:1 when should be 1:N: Restricts valid data
•M:N when should be 1:N: Allows invalid combinations
•Missing relationship: Forces query complexity
•Redundant relationship: Update anomalies; inconsistency risk
•Unnamed relationship: 'Has' or 'Relates' provides no meaning
•Missing participation: Doesn't capture mandatory associations

Relationship Best Practices

•Validate cardinality with real scenarios
•Test with edge cases: zero, one, many instances
•Name relationships with active verbs
•Specify participation constraints explicitly
•Document relationship meaning when not obvious
•Check for redundancy by examining all paths

Mistake 4: Using Relationships Instead of Role-Based Entities

Problem: Modeling complex role patterns with multiple relationships instead of explicit role entities.

Example: Person plays multiple roles (Employee, Customer, Vendor). Instead of three relationships from Person, consider an explicit Role entity or subtyping.

Mistake 5: Ignoring Temporal Aspects

Problem: Modeling only current state when relationships change over time.

Example: Employee-Department as simple 1:N, when you need to track department history. After a transfer, you lose historical data.

Fix: If historical tracking matters, model time explicitly—either through history tables or by making the relationship an associative entity with effective dates.

The Fan Trap

A 'fan trap' occurs when two 1:N relationships fan out from an entity, creating incorrect aggregation paths. For example: Department→Employee and Department→Budget. If you try to join all three, you get cross-products. Recognize when independent relationships should not be combined in queries.

Attribute Mistakes

Attributes capture entity properties. Mistakes here lead to data integrity issues, query complexity, and maintenance headaches.

Mistake 1: Calculated Attributes Stored Without Clear Policy

Problem: Storing derived values (total_amount = sum of line items) without defining when they're updated.

Consequences: Stale data when underlying values change but derived value isn't recalculated.

Fix: Either don't store (calculate on demand), or document the update policy (trigger, batch job, application logic) and make it robust.

Mistake 2: Attributes with Embedded Structure

Problem: Storing structured data in a single attribute.

WRONG:
[Customer]
    - full_address ("123 Main St, Apt 4, Springfield, IL 62701")

CORRECT:
[Customer]
    - street_address
    - apt_number
    - city
    - state
    - postal_code

Why it's wrong: Can't query by city. Can't validate postal_code format. Parsing is error-prone.

Mistake 3: Repeating Groups (First Normal Form Violation)

Problem: Multiple values or repeating groups in a single attribute.

WRONG:
[Customer]
    - phone_numbers ("555-1234, 555-5678, 555-9012")

CORRECT:
[Customer]-1:N-[CustomerPhone]
                    - phone_number
                    - phone_type

Common Attribute Mistakes
Mistake	Example	Proper Approach
Derived without policy	`order_total` stored but not maintained	Document recalc trigger; or compute on demand
Embedded structure	`address` as single string	Decompose into structured attributes
Repeating groups	`phone1`, `phone2`, `phone3` columns	Create separate PhoneNumber entity
Polymorphic attribute	`value` that sometimes is date, sometimes is amount	Separate typed attributes or entities
Missing units	`weight` without indicating kg or lbs	Name includes unit: `weight_kg`
Encoded meaning	`status = 7` with no documentation	Use enum or lookup table; document codes

Mistake 4: Overloaded Attributes

Problem: A single attribute carries multiple meanings depending on context.

Example: person_id that sometimes refers to an employee, sometimes a customer, sometimes a vendor—with no indication of which.

Fix: Either create separate attributes (employee_id, customer_id) or add a discriminator attribute (person_type).

Mistake 5: Nullable When Should Be Required

Problem: Making attributes nullable 'to be safe' when business rules require values.

Consequences: NULLs pollute data; application code must handle them; reports produce incorrect results.

Fix: Rigorously determine: 'Can this attribute legitimately be unknown or inapplicable?' If not, enforce NOT NULL.

Mistake 6: Required When Should Be Nullable

Problem: Making attributes required when legitimate null cases exist.

Consequences: Forces fake data ('N/A', '0000-00-00', 'Unknown') which is worse than NULL.

Fix: If a value can legitimately be unknown or inapplicable, allow NULL and document what NULL means.

NULL Semantics

NULL is not a value—it's an indicator of missing or inapplicable data. Document what NULL means for each nullable attribute. 'NULL in birth_date means date is unknown' vs. 'NULL in termination_date means employee is still active.'

Notation and Diagram Mistakes

ER diagrams communicate through visual notation. Misusing notation causes misinterpretation.

Mistake 1: Mixing Notation Styles

Problem: Using Chen notation for some relationships and Crow's Foot for others in the same diagram.

Consequences: Readers must mentally switch between interpretation rules. Errors increase.

Fix: Choose one notation and use it consistently. Document your choice.

Mistake 2: Inconsistent Symbol Usage

Problem: Using rectangles for entities in one part of the diagram and ovals in another.

Fix: Establish and follow symbol conventions. Use a legend if non-standard symbols are needed.

Mistake 3: Missing or Ambiguous Cardinality Notation

Problem: Relationship lines without cardinality markers.

Consequences: Readers must guess: Is this 1:1? 1:N? M:N? The model is incomplete.

Fix: Always mark cardinality at both ends of every relationship. Use your notation's standard markers.

Notation Anti-Patterns

•Mixed notation styles in one diagram
•Missing cardinality notation
•Ambiguous participation constraints
•Overcrowded diagrams with crossing lines
•Tiny text unreadable when printed
•Missing legend for custom symbols
•No version or date on diagram

Notation Best Practices

•One consistent notation throughout
•Explicit cardinality at both ends
•Clear participation indicators
•Logical layout with minimal crossings
•Readable text at intended scale
•Legend explaining all symbols
•Title, version, and date included

Mistake 4: Confusing Conceptual and Physical

Problem: Physical implementation details (data types, indexes, table spaces) appearing on conceptual ER diagrams.

Consequences: Clutters the conceptual view; mixes concerns; confuses non-technical viewers.

Fix: Maintain separate conceptual and physical diagrams, or use tool features to hide/show levels of detail.

Mistake 5: Outdated Diagrams

Problem: The diagram no longer reflects the actual schema. Changes were made to the database but not the diagram.

Consequences: Misleads anyone using the diagram for understanding or design. Creates dangerous disconnect between documentation and reality.

Fix: Include diagrams in change management process. Consider generating diagrams from schema (reverse engineering) to ensure accuracy.

Mistake 6: Diagram Without Context

Problem: A diagram with no title, no version, no indication of scope, and no legend.

Fix: Every diagram should include:

Title (what domain/system)
Version and date
Scope indication (is this complete? partial?)
Legend for any non-standard notation
Author/owner contact

Conceptual Misunderstandings

Some mistakes stem from fundamental misunderstandings of ER modeling concepts.

Misunderstanding 1: Entity vs. Relationship Confusion

Problem: Not recognizing when a relationship should be promoted to an entity.

Example: A many-to-many relationship between Student and Course needs enrollment_date and grade. These are relationship attributes—but when the 'relationship' becomes complex enough, it's actually an entity (Enrollment).

Guideline: If a relationship has its own identity, lifecycle, or rich attribute set, model it as an entity.

Misunderstanding 2: Data vs. Metadata Confusion

Problem: Modeling data values as schema elements.

Example: Creating separate entities for each product category instead of a Category entity with category instances as data.

WRONG:                              CORRECT:
[Electronics]                       [Category]
[Clothing]                              - category_name
[Furniture]                             - [instances: Electronics, Clothing, ...]

Misunderstanding 3: Current-State Only Thinking

Problem: Modeling only the current state when business processes involve temporal data.

Example: Employee-Department shows current assignment. What about transfers? Promotions? Historical reporting?

Fix: Identify whether historical tracking is needed. If so, model time explicitly through effective dates or temporal patterns.

Conceptual Understanding Checklist
Question to Ask	If Answer is Yes...	Consider...
Does this relationship have its own attributes?	It might be an entity	Model as associative entity
Are there multiple specific values being modeled as entities?	These might be data, not schema	Creating a single entity with value instances
Do we need to track history?	Current-state model is insufficient	Adding temporal attributes or versioning pattern
Is this entity type or instance?	Specific instances don't belong in schema	Generalizing to the type level
Could this attribute have multiple values?	1NF violation possible	Normalizing to separate entity

Misunderstanding 4: Premature Physical Thinking

Problem: Making conceptual modeling decisions based on physical implementation concerns.

Example: 'We'll put address in the Customer table for performance' before even establishing the conceptual model.

Fix: Separate conceptual modeling (what?) from physical modeling (how?). Denormalization and optimization are physical decisions that should come after establishing correct conceptual structure.

Misunderstanding 5: Ignoring Business Rules

Problem: Drawing entities and relationships without encoding business constraints.

Example: The model shows Employee-Project but doesn't capture that 'a project must have at least one employee' or 'managers cannot be assigned to more than 3 projects.'

Fix: Business rules are part of the model, even if they can't all be drawn graphically. Capture them in annotations or accompanying documentation.

The Tool Trap

Don't let the diagramming tool limit your thinking. The tool shows what's easily drawable. Many important constraints (complex check conditions, temporal rules, multi-entity validations) don't have standard graphical representations but must still be captured in the model—through annotations or documentation.

Design Anti-Patterns

Beyond individual mistakes, some problematic patterns emerge at a larger scale.

Anti-Pattern 1: The God Table

A single entity that tries to do everything—Customer has 200 attributes covering contact info, preferences, billing, shipping history, communication history, demographics, segmentation, marketing consent...

Problems:

Wide rows with many NULLs
Change to one concern affects the entire structure
Unclear ownership
Performance issues

Fix: Decompose by cohesive concern. CustomerContact, CustomerPreferences, CustomerBilling as separate entities or 1:1 extensions.

Anti-Pattern 2: The EAV Escape Hatch

Entity-Attribute-Value pattern where extensibility is needed, but applied too broadly:

[CustomerAttribute]
    - customer_id
    - attribute_name  ("email", "phone", "shoe_size")
    - attribute_value ("john@email.com", "555-1234", "10")

Problems:

No type safety
No referential integrity on attribute_name
Query complexity (pivot required)
No schema enforcement

Fix: Use EAV sparingly—for truly dynamic attributes only. Core attributes belong in typed columns.

Design Anti-Patterns
Anti-Pattern	Symptom	Consequence	Fix
God Table	200+ columns; diverse purposes	Maintenance nightmare; NULL pollution	Decompose into cohesive entities
EAV Overuse	Generic attribute-value structure for core data	No schema enforcement; query complexity	Limit EAV to truly dynamic extensions
Circular Dependencies	A→B→C→A referential paths	Insert/delete complexity; deadlock risk	Re-examine relationships; break cycle
Premature Denormalization	Repeated data 'for performance'	Update anomalies; inconsistency	Normalize first; denormalize with metrics to justify
Star Schema in OLTP	Dimensional model for transactional system	Insert performance issues; update complexity	Match model to use case (OLTP vs OLAP)

Anti-Pattern 3: Over-Normalization

Yes, there is such a thing. Breaking entities into too many pieces creates:

Excessive joins for simple queries
Complexity that obscures the model
Performance overhead

Example: Normalizing 'FullName' into FirstName, MiddleName, LastName, Prefix, Suffix as separate entities with relationships—when they're always accessed together.

Fix: Normalization serves data integrity. If the complexity doesn't improve integrity (no independent update scenarios), it may be over-engineering.

Anti-Pattern 4: The Inheritance Abyss

Deep specialization hierarchies: Person → Employee → SalariedEmployee → Manager → SeniorManager → ExecutiveManager → ...

Problems:

Query complexity (which type is this instance?)
Mapping complexity (table-per-class, table-per-hierarchy, table-per-concrete-class?)
Brittle to real-world changes (person becomes contractor, then employee again)

Fix: Limit hierarchy depth. Consider composition over inheritance. Question whether deep specialization reflects stable reality.

The Smell Test

When a model feels wrong—too complex for its purpose, too rigid for business change, too hard to explain—trust that instinct. The 'smell' often indicates an anti-pattern. Stepping back and questioning assumptions often reveals a simpler, more correct design.

Process Mistakes

Beyond modeling decisions, mistakes in the modeling process lead to poor outcomes.

Process Mistake 1: Modeling in Isolation

Problem: The data modeler works alone, presenting a 'finished' model to stakeholders.

Consequences: Misses domain knowledge. Stakeholders don't understand or buy into the model. Errors discovered late.

Fix: Collaborative modeling. Work with domain experts, review incrementally, validate continuously.

Process Mistake 2: Skipping Requirements Validation

Problem: Building the model before confirming what data is actually needed.

Consequences: Missing entities for overlooked use cases; extra entities for abandoned features; mismatch with application needs.

Fix: Start from use cases and queries. 'What questions will the system answer?' drives entity identification.

Process Mistake 3: No Iteration

Problem: Treating the first draft as final.

Consequences: Early mistakes hardened before they're recognized.

Fix: Plan for multiple design iterations. First draft → Review → Revise → Prototype → Revise → Finalize.

Process Best Practices

•Involve domain experts from the start—they know edges cases and exceptions
•Start from use cases—what queries, reports, and operations must the model support?
•Iterate explicitly—schedule review cycles; expect changes
•Prototype early—validate with sample data before finalizing
•Document decisions—record why choices were made, not just what
•Review with diverse audiences—developers, DBAs, analysts, business users
•Plan for evolution—the model will change; design for modification

The 'Just This Once' Danger

Be wary of design shortcuts justified by 'just this once' or 'we'll fix it later.' Technical debt accumulates. If you recognize you're taking a shortcut, at minimum document it explicitly with a planned resolution. 'Temporary' workarounds have a way of becoming permanent.

Summary: Learning to See Mistakes Before They Happen

Recognizing common mistakes is a form of pattern recognition. Once you've seen these anti-patterns described, you'll spot them in your own work and in models you inherit. This recognition is the first step toward prevention.

Key Takeaways

•Entity mistakes include properties-as-entities, god entities, missing entities, and instance-vs-type confusion.
•Relationship mistakes include wrong cardinality, missing relationships, redundant relationships, and ignoring temporal aspects.
•Attribute mistakes include undocumented derivations, embedded structure, repeating groups, and mishandled nullability.
•Notation mistakes include mixed styles, missing cardinality, and outdated diagrams.
•Conceptual mistakes include entity-relationship confusion, data-metadata confusion, and premature physical thinking.
•Design anti-patterns include god tables, EAV overuse, over-normalization, and deep inheritance.
•Process mistakes include isolation, skipped validation, and no iteration.

What's Next

The final page of this module provides a Review Checklist—a systematic approach to validating ER diagrams before they progress to implementation. The checklist synthesizes everything we've learned into a practical review tool.

Page Complete

You now have a catalog of common ER modeling mistakes across multiple dimensions. This knowledge serves as a defensive tool—a set of warning signs to watch for in your own modeling work and when reviewing others' designs.

Common Mistakes

Learning from the Mistakes of Others

The Cost of Modeling Mistakes

ER diagram mistakes don't stay localized. They propagate:

Conceptual mistake in the ER diagram leads to...
Structural flaw in the physical schema, which causes...
Application complexity as developers work around the flaw, resulting in...
Performance issues as workarounds prove inefficient, leading to...
Expensive refactoring when the problem can no longer be ignored

Catching a mistake at the diagram stage costs minutes. Fixing it after years of production data costs months.

What You Will Learn

Entity Design Mistakes

Entities are the foundation of the ER model. Mistakes here cascade through the entire design.

Mistake 1: Creating Entities for Properties

Problem: Modeling what should be an attribute as a separate entity.

WRONG:                          CORRECT:
[Customer]                      [Customer]
    |                               - name
    |---[has]---[CustomerName]      - email
    |                               - phone
    |---[has]---[CustomerEmail]

Why it's wrong: Creates unnecessary complexity. Every simple property lookup requires a join. The 'entities' have no independent existence or attributes of their own.

Exception: When a property has its own attributes (PhoneNumber with type, extension, country_code) or participates in multiple relationships, it may warrant entity status.

Mistake 2: God Entities (Too Many Attributes)

Problem: Creating entities with 50, 100, or more attributes that encompass multiple concepts.

Why it's wrong: Often indicates missed entity decomposition. Different attribute groups may have different lifecycles, optionality, or relationship patterns.

Entity Design Mistakes Summary
Mistake	Symptom	Fix
Properties as entities	Single-attribute entity; always 1:1 with parent	Merge as attribute of parent entity
God entity	50+ attributes; diverse concepts mixed	Decompose into cohesive entities
Missing entity	Repeated column groups across tables	Factor out as new entity
Duplicate entities	Same concept with different names	Consolidate; create synonym reference
Entity without identity	No natural or surrogate key; unclear uniqueness	Define clear identifying attributes

Mistake 3: Missing Entities (Hidden in Attributes)

Problem: Embedding what should be an entity as repeated attribute groups.

WRONG:                              CORRECT:
[Order]                             [Order]
    - ship_street                       |---[ships_to]---[Address]
    - ship_city                         |---[bills_to]---[Address]
    - ship_zip
    - bill_street
    - bill_city
    - bill_zip

Why it's wrong: Duplicates structure; no reuse. 'Address' is a thing with its own attributes and potential relationships.

Mistake 4: Confusion Between Entity Type and Instance

Problem: Creating an entity for a specific instance rather than a type.

WRONG:                              CORRECT:
[JohnSmithsAddress]                 [Address]

Why it's wrong: The model should represent types (Address), not specific instances (John Smith's address). Instances are data, not schema.

The Entity Litmus Test

Relationship Mistakes

Relationships capture how entities interact. Mistakes here lead to incorrect cardinality enforcement, missing data associations, or unnecessary complexity.

Mistake 1: Wrong Cardinality

Problem: Specifying 1:1 when the relationship is actually 1:N, or vice versa.

Example: Modeling Customer-Order as 1:1 when customers can place multiple orders.

Consequences: Either the schema prevents valid data (can't store second order) or the model misleads readers about the actual structure.

Fix: Validate cardinality with domain experts: 'Can a customer have multiple orders?' 'Must every order have a customer?'

Mistake 2: Missing Relationships

Problem: Entities exist 'in isolation' when business reality connects them.

Symptom: To answer 'Which products did this customer order?', you need to join through Order and OrderLine, but no explicit relationship path is shown.

Fix: Walk through business queries during design. 'How do I find X related to Y?' If there's no path, a relationship is missing.

Mistake 3: Redundant Relationships

Problem: Multiple relationship paths express the same association.

WRONG:
[Customer]---[places]---[Order]
[Customer]-------[ordered_product]-------[Product]  (redundant)
[Order]---[contains]---[OrderLine]---[for]---[Product]

If Customer-Product association is always derivable from Customer→Order→OrderLine→Product, the direct Customer-Product relationship is redundant unless there's independent meaning.

Relationship Anti-Patterns

•1:1 when should be 1:N: Restricts valid data
•M:N when should be 1:N: Allows invalid combinations
•Missing relationship: Forces query complexity
•Redundant relationship: Update anomalies; inconsistency risk
•Unnamed relationship: 'Has' or 'Relates' provides no meaning
•Missing participation: Doesn't capture mandatory associations

Relationship Best Practices

•Validate cardinality with real scenarios
•Test with edge cases: zero, one, many instances
•Name relationships with active verbs
•Specify participation constraints explicitly
•Document relationship meaning when not obvious
•Check for redundancy by examining all paths

Mistake 4: Using Relationships Instead of Role-Based Entities

Problem: Modeling complex role patterns with multiple relationships instead of explicit role entities.

Example: Person plays multiple roles (Employee, Customer, Vendor). Instead of three relationships from Person, consider an explicit Role entity or subtyping.

Mistake 5: Ignoring Temporal Aspects

Problem: Modeling only current state when relationships change over time.

Example: Employee-Department as simple 1:N, when you need to track department history. After a transfer, you lose historical data.

Fix: If historical tracking matters, model time explicitly—either through history tables or by making the relationship an associative entity with effective dates.

The Fan Trap

Attribute Mistakes

Attributes capture entity properties. Mistakes here lead to data integrity issues, query complexity, and maintenance headaches.

Mistake 1: Calculated Attributes Stored Without Clear Policy

Problem: Storing derived values (total_amount = sum of line items) without defining when they're updated.

Consequences: Stale data when underlying values change but derived value isn't recalculated.

Fix: Either don't store (calculate on demand), or document the update policy (trigger, batch job, application logic) and make it robust.

Mistake 2: Attributes with Embedded Structure

Problem: Storing structured data in a single attribute.

WRONG:
[Customer]
    - full_address ("123 Main St, Apt 4, Springfield, IL 62701")

CORRECT:
[Customer]
    - street_address
    - apt_number
    - city
    - state
    - postal_code

Why it's wrong: Can't query by city. Can't validate postal_code format. Parsing is error-prone.

Mistake 3: Repeating Groups (First Normal Form Violation)

Problem: Multiple values or repeating groups in a single attribute.

WRONG:
[Customer]
    - phone_numbers ("555-1234, 555-5678, 555-9012")

CORRECT:
[Customer]-1:N-[CustomerPhone]
                    - phone_number
                    - phone_type

Common Attribute Mistakes
Mistake	Example	Proper Approach
Derived without policy	`order_total` stored but not maintained	Document recalc trigger; or compute on demand
Embedded structure	`address` as single string	Decompose into structured attributes
Repeating groups	`phone1`, `phone2`, `phone3` columns	Create separate PhoneNumber entity
Polymorphic attribute	`value` that sometimes is date, sometimes is amount	Separate typed attributes or entities
Missing units	`weight` without indicating kg or lbs	Name includes unit: `weight_kg`
Encoded meaning	`status = 7` with no documentation	Use enum or lookup table; document codes

Mistake 4: Overloaded Attributes

Problem: A single attribute carries multiple meanings depending on context.

Example: person_id that sometimes refers to an employee, sometimes a customer, sometimes a vendor—with no indication of which.

Fix: Either create separate attributes (employee_id, customer_id) or add a discriminator attribute (person_type).

Mistake 5: Nullable When Should Be Required

Problem: Making attributes nullable 'to be safe' when business rules require values.

Consequences: NULLs pollute data; application code must handle them; reports produce incorrect results.

Fix: Rigorously determine: 'Can this attribute legitimately be unknown or inapplicable?' If not, enforce NOT NULL.

Mistake 6: Required When Should Be Nullable

Problem: Making attributes required when legitimate null cases exist.

Consequences: Forces fake data ('N/A', '0000-00-00', 'Unknown') which is worse than NULL.

Fix: If a value can legitimately be unknown or inapplicable, allow NULL and document what NULL means.

NULL Semantics

Notation and Diagram Mistakes

ER diagrams communicate through visual notation. Misusing notation causes misinterpretation.

Mistake 1: Mixing Notation Styles

Problem: Using Chen notation for some relationships and Crow's Foot for others in the same diagram.

Consequences: Readers must mentally switch between interpretation rules. Errors increase.

Fix: Choose one notation and use it consistently. Document your choice.

Mistake 2: Inconsistent Symbol Usage

Problem: Using rectangles for entities in one part of the diagram and ovals in another.

Fix: Establish and follow symbol conventions. Use a legend if non-standard symbols are needed.

Mistake 3: Missing or Ambiguous Cardinality Notation

Problem: Relationship lines without cardinality markers.

Consequences: Readers must guess: Is this 1:1? 1:N? M:N? The model is incomplete.

Fix: Always mark cardinality at both ends of every relationship. Use your notation's standard markers.

Notation Anti-Patterns

•Mixed notation styles in one diagram
•Missing cardinality notation
•Ambiguous participation constraints
•Overcrowded diagrams with crossing lines
•Tiny text unreadable when printed
•Missing legend for custom symbols
•No version or date on diagram

Notation Best Practices

•One consistent notation throughout
•Explicit cardinality at both ends
•Clear participation indicators
•Logical layout with minimal crossings
•Readable text at intended scale
•Legend explaining all symbols
•Title, version, and date included

Mistake 4: Confusing Conceptual and Physical

Problem: Physical implementation details (data types, indexes, table spaces) appearing on conceptual ER diagrams.

Consequences: Clutters the conceptual view; mixes concerns; confuses non-technical viewers.

Fix: Maintain separate conceptual and physical diagrams, or use tool features to hide/show levels of detail.

Mistake 5: Outdated Diagrams

Problem: The diagram no longer reflects the actual schema. Changes were made to the database but not the diagram.

Consequences: Misleads anyone using the diagram for understanding or design. Creates dangerous disconnect between documentation and reality.

Fix: Include diagrams in change management process. Consider generating diagrams from schema (reverse engineering) to ensure accuracy.

Mistake 6: Diagram Without Context

Problem: A diagram with no title, no version, no indication of scope, and no legend.

Fix: Every diagram should include:

Title (what domain/system)
Version and date
Scope indication (is this complete? partial?)
Legend for any non-standard notation
Author/owner contact

Conceptual Misunderstandings

Some mistakes stem from fundamental misunderstandings of ER modeling concepts.

Misunderstanding 1: Entity vs. Relationship Confusion

Problem: Not recognizing when a relationship should be promoted to an entity.

Guideline: If a relationship has its own identity, lifecycle, or rich attribute set, model it as an entity.

Misunderstanding 2: Data vs. Metadata Confusion

Problem: Modeling data values as schema elements.

Example: Creating separate entities for each product category instead of a Category entity with category instances as data.

WRONG:                              CORRECT:
[Electronics]                       [Category]
[Clothing]                              - category_name
[Furniture]                             - [instances: Electronics, Clothing, ...]

Misunderstanding 3: Current-State Only Thinking

Problem: Modeling only the current state when business processes involve temporal data.

Example: Employee-Department shows current assignment. What about transfers? Promotions? Historical reporting?

Fix: Identify whether historical tracking is needed. If so, model time explicitly through effective dates or temporal patterns.

Conceptual Understanding Checklist
Question to Ask	If Answer is Yes...	Consider...
Does this relationship have its own attributes?	It might be an entity	Model as associative entity
Are there multiple specific values being modeled as entities?	These might be data, not schema	Creating a single entity with value instances
Do we need to track history?	Current-state model is insufficient	Adding temporal attributes or versioning pattern
Is this entity type or instance?	Specific instances don't belong in schema	Generalizing to the type level
Could this attribute have multiple values?	1NF violation possible	Normalizing to separate entity

Misunderstanding 4: Premature Physical Thinking

Problem: Making conceptual modeling decisions based on physical implementation concerns.

Example: 'We'll put address in the Customer table for performance' before even establishing the conceptual model.

Fix: Separate conceptual modeling (what?) from physical modeling (how?). Denormalization and optimization are physical decisions that should come after establishing correct conceptual structure.

Misunderstanding 5: Ignoring Business Rules

Problem: Drawing entities and relationships without encoding business constraints.

Example: The model shows Employee-Project but doesn't capture that 'a project must have at least one employee' or 'managers cannot be assigned to more than 3 projects.'

Fix: Business rules are part of the model, even if they can't all be drawn graphically. Capture them in annotations or accompanying documentation.

The Tool Trap

Design Anti-Patterns

Beyond individual mistakes, some problematic patterns emerge at a larger scale.

Anti-Pattern 1: The God Table

Problems:

Wide rows with many NULLs
Change to one concern affects the entire structure
Unclear ownership
Performance issues

Fix: Decompose by cohesive concern. CustomerContact, CustomerPreferences, CustomerBilling as separate entities or 1:1 extensions.

Anti-Pattern 2: The EAV Escape Hatch

Entity-Attribute-Value pattern where extensibility is needed, but applied too broadly:

[CustomerAttribute]
    - customer_id
    - attribute_name  ("email", "phone", "shoe_size")
    - attribute_value ("john@email.com", "555-1234", "10")

Problems:

No type safety
No referential integrity on attribute_name
Query complexity (pivot required)
No schema enforcement

Fix: Use EAV sparingly—for truly dynamic attributes only. Core attributes belong in typed columns.

Design Anti-Patterns
Anti-Pattern	Symptom	Consequence	Fix
God Table	200+ columns; diverse purposes	Maintenance nightmare; NULL pollution	Decompose into cohesive entities
EAV Overuse	Generic attribute-value structure for core data	No schema enforcement; query complexity	Limit EAV to truly dynamic extensions
Circular Dependencies	A→B→C→A referential paths	Insert/delete complexity; deadlock risk	Re-examine relationships; break cycle
Premature Denormalization	Repeated data 'for performance'	Update anomalies; inconsistency	Normalize first; denormalize with metrics to justify
Star Schema in OLTP	Dimensional model for transactional system	Insert performance issues; update complexity	Match model to use case (OLTP vs OLAP)

Anti-Pattern 3: Over-Normalization

Yes, there is such a thing. Breaking entities into too many pieces creates:

Excessive joins for simple queries
Complexity that obscures the model
Performance overhead

Example: Normalizing 'FullName' into FirstName, MiddleName, LastName, Prefix, Suffix as separate entities with relationships—when they're always accessed together.

Fix: Normalization serves data integrity. If the complexity doesn't improve integrity (no independent update scenarios), it may be over-engineering.

Anti-Pattern 4: The Inheritance Abyss

Deep specialization hierarchies: Person → Employee → SalariedEmployee → Manager → SeniorManager → ExecutiveManager → ...

Problems:

Query complexity (which type is this instance?)
Mapping complexity (table-per-class, table-per-hierarchy, table-per-concrete-class?)
Brittle to real-world changes (person becomes contractor, then employee again)

Fix: Limit hierarchy depth. Consider composition over inheritance. Question whether deep specialization reflects stable reality.

The Smell Test

Process Mistakes

Beyond modeling decisions, mistakes in the modeling process lead to poor outcomes.

Process Mistake 1: Modeling in Isolation

Problem: The data modeler works alone, presenting a 'finished' model to stakeholders.

Consequences: Misses domain knowledge. Stakeholders don't understand or buy into the model. Errors discovered late.

Fix: Collaborative modeling. Work with domain experts, review incrementally, validate continuously.

Process Mistake 2: Skipping Requirements Validation

Problem: Building the model before confirming what data is actually needed.

Consequences: Missing entities for overlooked use cases; extra entities for abandoned features; mismatch with application needs.

Fix: Start from use cases and queries. 'What questions will the system answer?' drives entity identification.

Process Mistake 3: No Iteration

Problem: Treating the first draft as final.

Consequences: Early mistakes hardened before they're recognized.

Fix: Plan for multiple design iterations. First draft → Review → Revise → Prototype → Revise → Finalize.

Process Best Practices

•Involve domain experts from the start—they know edges cases and exceptions
•Start from use cases—what queries, reports, and operations must the model support?
•Iterate explicitly—schedule review cycles; expect changes
•Prototype early—validate with sample data before finalizing
•Document decisions—record why choices were made, not just what
•Review with diverse audiences—developers, DBAs, analysts, business users
•Plan for evolution—the model will change; design for modification

The 'Just This Once' Danger

Summary: Learning to See Mistakes Before They Happen

Key Takeaways

•Entity mistakes include properties-as-entities, god entities, missing entities, and instance-vs-type confusion.
•Relationship mistakes include wrong cardinality, missing relationships, redundant relationships, and ignoring temporal aspects.
•Attribute mistakes include undocumented derivations, embedded structure, repeating groups, and mishandled nullability.
•Notation mistakes include mixed styles, missing cardinality, and outdated diagrams.
•Conceptual mistakes include entity-relationship confusion, data-metadata confusion, and premature physical thinking.
•Design anti-patterns include god tables, EAV overuse, over-normalization, and deep inheritance.
•Process mistakes include isolation, skipped validation, and no iteration.

What's Next

Page Complete