Loading content...
The most efficient way to become an expert is to learn from mistakes—preferably other people's mistakes. In ER modeling, certain errors appear repeatedly across projects, organizations, and decades. These are not random accidents; they are predictable patterns arising from common misunderstandings, shortcuts, and blind spots.
Recognizing these patterns before you make them transforms your modeling practice. Instead of discovering through painful experience that 'this approach leads to problems,' you can identify and avoid the anti-pattern from the start.
The Cost of Modeling Mistakes
ER diagram mistakes don't stay localized. They propagate:
Catching a mistake at the diagram stage costs minutes. Fixing it after years of production data costs months.
By the end of this page, you will recognize common ER modeling mistakes across multiple categories: entity design errors, relationship mistakes, attribute problems, notation misuse, and conceptual misunderstandings. For each, you will understand why it's wrong and how to fix it.
Entities are the foundation of the ER model. Mistakes here cascade through the entire design.
Mistake 1: Creating Entities for Properties
Problem: Modeling what should be an attribute as a separate entity.
WRONG: CORRECT:
[Customer] [Customer]
| - name
|---[has]---[CustomerName] - email
| - phone
|---[has]---[CustomerEmail]
Why it's wrong: Creates unnecessary complexity. Every simple property lookup requires a join. The 'entities' have no independent existence or attributes of their own.
Exception: When a property has its own attributes (PhoneNumber with type, extension, country_code) or participates in multiple relationships, it may warrant entity status.
Mistake 2: God Entities (Too Many Attributes)
Problem: Creating entities with 50, 100, or more attributes that encompass multiple concepts.
Why it's wrong: Often indicates missed entity decomposition. Different attribute groups may have different lifecycles, optionality, or relationship patterns.
Fix: Analyze attribute groupings. If distinct clusters emerge (e.g., customer contact info vs. customer preferences vs. customer billing), consider entity decomposition or at least separate entity subtypes.
| Mistake | Symptom | Fix |
|---|---|---|
| Properties as entities | Single-attribute entity; always 1:1 with parent | Merge as attribute of parent entity |
| God entity | 50+ attributes; diverse concepts mixed | Decompose into cohesive entities |
| Missing entity | Repeated column groups across tables | Factor out as new entity |
| Duplicate entities | Same concept with different names | Consolidate; create synonym reference |
| Entity without identity | No natural or surrogate key; unclear uniqueness | Define clear identifying attributes |
Mistake 3: Missing Entities (Hidden in Attributes)
Problem: Embedding what should be an entity as repeated attribute groups.
WRONG: CORRECT:
[Order] [Order]
- ship_street |---[ships_to]---[Address]
- ship_city |---[bills_to]---[Address]
- ship_zip
- bill_street
- bill_city
- bill_zip
Why it's wrong: Duplicates structure; no reuse. 'Address' is a thing with its own attributes and potential relationships.
Mistake 4: Confusion Between Entity Type and Instance
Problem: Creating an entity for a specific instance rather than a type.
WRONG: CORRECT:
[JohnSmithsAddress] [Address]
Why it's wrong: The model should represent types (Address), not specific instances (John Smith's address). Instances are data, not schema.
Ask: 'Will there be multiple instances of this thing that we need to track separately?' If yes, it's likely an entity. Also ask: 'Does this thing have its own attributes beyond just a value?' If yes, entity. If it's just a value with no substructure, it's probably an attribute.
Relationships capture how entities interact. Mistakes here lead to incorrect cardinality enforcement, missing data associations, or unnecessary complexity.
Mistake 1: Wrong Cardinality
Problem: Specifying 1:1 when the relationship is actually 1:N, or vice versa.
Example: Modeling Customer-Order as 1:1 when customers can place multiple orders.
Consequences: Either the schema prevents valid data (can't store second order) or the model misleads readers about the actual structure.
Fix: Validate cardinality with domain experts: 'Can a customer have multiple orders?' 'Must every order have a customer?'
Mistake 2: Missing Relationships
Problem: Entities exist 'in isolation' when business reality connects them.
Symptom: To answer 'Which products did this customer order?', you need to join through Order and OrderLine, but no explicit relationship path is shown.
Fix: Walk through business queries during design. 'How do I find X related to Y?' If there's no path, a relationship is missing.
Mistake 3: Redundant Relationships
Problem: Multiple relationship paths express the same association.
WRONG:
[Customer]---[places]---[Order]
[Customer]-------[ordered_product]-------[Product] (redundant)
[Order]---[contains]---[OrderLine]---[for]---[Product]
If Customer-Product association is always derivable from Customer→Order→OrderLine→Product, the direct Customer-Product relationship is redundant unless there's independent meaning.
Mistake 4: Using Relationships Instead of Role-Based Entities
Problem: Modeling complex role patterns with multiple relationships instead of explicit role entities.
Example: Person plays multiple roles (Employee, Customer, Vendor). Instead of three relationships from Person, consider an explicit Role entity or subtyping.
Mistake 5: Ignoring Temporal Aspects
Problem: Modeling only current state when relationships change over time.
Example: Employee-Department as simple 1:N, when you need to track department history. After a transfer, you lose historical data.
Fix: If historical tracking matters, model time explicitly—either through history tables or by making the relationship an associative entity with effective dates.
A 'fan trap' occurs when two 1:N relationships fan out from an entity, creating incorrect aggregation paths. For example: Department→Employee and Department→Budget. If you try to join all three, you get cross-products. Recognize when independent relationships should not be combined in queries.
Attributes capture entity properties. Mistakes here lead to data integrity issues, query complexity, and maintenance headaches.
Mistake 1: Calculated Attributes Stored Without Clear Policy
Problem: Storing derived values (total_amount = sum of line items) without defining when they're updated.
Consequences: Stale data when underlying values change but derived value isn't recalculated.
Fix: Either don't store (calculate on demand), or document the update policy (trigger, batch job, application logic) and make it robust.
Mistake 2: Attributes with Embedded Structure
Problem: Storing structured data in a single attribute.
WRONG:
[Customer]
- full_address ("123 Main St, Apt 4, Springfield, IL 62701")
CORRECT:
[Customer]
- street_address
- apt_number
- city
- state
- postal_code
Why it's wrong: Can't query by city. Can't validate postal_code format. Parsing is error-prone.
Mistake 3: Repeating Groups (First Normal Form Violation)
Problem: Multiple values or repeating groups in a single attribute.
WRONG:
[Customer]
- phone_numbers ("555-1234, 555-5678, 555-9012")
CORRECT:
[Customer]-1:N-[CustomerPhone]
- phone_number
- phone_type
| Mistake | Example | Proper Approach |
|---|---|---|
| Derived without policy | order_total stored but not maintained | Document recalc trigger; or compute on demand |
| Embedded structure | address as single string | Decompose into structured attributes |
| Repeating groups | phone1, phone2, phone3 columns | Create separate PhoneNumber entity |
| Polymorphic attribute | value that sometimes is date, sometimes is amount | Separate typed attributes or entities |
| Missing units | weight without indicating kg or lbs | Name includes unit: weight_kg |
| Encoded meaning | status = 7 with no documentation | Use enum or lookup table; document codes |
Mistake 4: Overloaded Attributes
Problem: A single attribute carries multiple meanings depending on context.
Example: person_id that sometimes refers to an employee, sometimes a customer, sometimes a vendor—with no indication of which.
Fix: Either create separate attributes (employee_id, customer_id) or add a discriminator attribute (person_type).
Mistake 5: Nullable When Should Be Required
Problem: Making attributes nullable 'to be safe' when business rules require values.
Consequences: NULLs pollute data; application code must handle them; reports produce incorrect results.
Fix: Rigorously determine: 'Can this attribute legitimately be unknown or inapplicable?' If not, enforce NOT NULL.
Mistake 6: Required When Should Be Nullable
Problem: Making attributes required when legitimate null cases exist.
Consequences: Forces fake data ('N/A', '0000-00-00', 'Unknown') which is worse than NULL.
Fix: If a value can legitimately be unknown or inapplicable, allow NULL and document what NULL means.
NULL is not a value—it's an indicator of missing or inapplicable data. Document what NULL means for each nullable attribute. 'NULL in birth_date means date is unknown' vs. 'NULL in termination_date means employee is still active.'
ER diagrams communicate through visual notation. Misusing notation causes misinterpretation.
Mistake 1: Mixing Notation Styles
Problem: Using Chen notation for some relationships and Crow's Foot for others in the same diagram.
Consequences: Readers must mentally switch between interpretation rules. Errors increase.
Fix: Choose one notation and use it consistently. Document your choice.
Mistake 2: Inconsistent Symbol Usage
Problem: Using rectangles for entities in one part of the diagram and ovals in another.
Fix: Establish and follow symbol conventions. Use a legend if non-standard symbols are needed.
Mistake 3: Missing or Ambiguous Cardinality Notation
Problem: Relationship lines without cardinality markers.
Consequences: Readers must guess: Is this 1:1? 1:N? M:N? The model is incomplete.
Fix: Always mark cardinality at both ends of every relationship. Use your notation's standard markers.
Mistake 4: Confusing Conceptual and Physical
Problem: Physical implementation details (data types, indexes, table spaces) appearing on conceptual ER diagrams.
Consequences: Clutters the conceptual view; mixes concerns; confuses non-technical viewers.
Fix: Maintain separate conceptual and physical diagrams, or use tool features to hide/show levels of detail.
Mistake 5: Outdated Diagrams
Problem: The diagram no longer reflects the actual schema. Changes were made to the database but not the diagram.
Consequences: Misleads anyone using the diagram for understanding or design. Creates dangerous disconnect between documentation and reality.
Fix: Include diagrams in change management process. Consider generating diagrams from schema (reverse engineering) to ensure accuracy.
Mistake 6: Diagram Without Context
Problem: A diagram with no title, no version, no indication of scope, and no legend.
Fix: Every diagram should include:
Some mistakes stem from fundamental misunderstandings of ER modeling concepts.
Misunderstanding 1: Entity vs. Relationship Confusion
Problem: Not recognizing when a relationship should be promoted to an entity.
Example: A many-to-many relationship between Student and Course needs enrollment_date and grade. These are relationship attributes—but when the 'relationship' becomes complex enough, it's actually an entity (Enrollment).
Guideline: If a relationship has its own identity, lifecycle, or rich attribute set, model it as an entity.
Misunderstanding 2: Data vs. Metadata Confusion
Problem: Modeling data values as schema elements.
Example: Creating separate entities for each product category instead of a Category entity with category instances as data.
WRONG: CORRECT:
[Electronics] [Category]
[Clothing] - category_name
[Furniture] - [instances: Electronics, Clothing, ...]
Misunderstanding 3: Current-State Only Thinking
Problem: Modeling only the current state when business processes involve temporal data.
Example: Employee-Department shows current assignment. What about transfers? Promotions? Historical reporting?
Fix: Identify whether historical tracking is needed. If so, model time explicitly through effective dates or temporal patterns.
| Question to Ask | If Answer is Yes... | Consider... |
|---|---|---|
| Does this relationship have its own attributes? | It might be an entity | Model as associative entity |
| Are there multiple specific values being modeled as entities? | These might be data, not schema | Creating a single entity with value instances |
| Do we need to track history? | Current-state model is insufficient | Adding temporal attributes or versioning pattern |
| Is this entity type or instance? | Specific instances don't belong in schema | Generalizing to the type level |
| Could this attribute have multiple values? | 1NF violation possible | Normalizing to separate entity |
Misunderstanding 4: Premature Physical Thinking
Problem: Making conceptual modeling decisions based on physical implementation concerns.
Example: 'We'll put address in the Customer table for performance' before even establishing the conceptual model.
Fix: Separate conceptual modeling (what?) from physical modeling (how?). Denormalization and optimization are physical decisions that should come after establishing correct conceptual structure.
Misunderstanding 5: Ignoring Business Rules
Problem: Drawing entities and relationships without encoding business constraints.
Example: The model shows Employee-Project but doesn't capture that 'a project must have at least one employee' or 'managers cannot be assigned to more than 3 projects.'
Fix: Business rules are part of the model, even if they can't all be drawn graphically. Capture them in annotations or accompanying documentation.
Don't let the diagramming tool limit your thinking. The tool shows what's easily drawable. Many important constraints (complex check conditions, temporal rules, multi-entity validations) don't have standard graphical representations but must still be captured in the model—through annotations or documentation.
Beyond individual mistakes, some problematic patterns emerge at a larger scale.
Anti-Pattern 1: The God Table
A single entity that tries to do everything—Customer has 200 attributes covering contact info, preferences, billing, shipping history, communication history, demographics, segmentation, marketing consent...
Problems:
Fix: Decompose by cohesive concern. CustomerContact, CustomerPreferences, CustomerBilling as separate entities or 1:1 extensions.
Anti-Pattern 2: The EAV Escape Hatch
Entity-Attribute-Value pattern where extensibility is needed, but applied too broadly:
[CustomerAttribute]
- customer_id
- attribute_name ("email", "phone", "shoe_size")
- attribute_value ("john@email.com", "555-1234", "10")
Problems:
Fix: Use EAV sparingly—for truly dynamic attributes only. Core attributes belong in typed columns.
| Anti-Pattern | Symptom | Consequence | Fix |
|---|---|---|---|
| God Table | 200+ columns; diverse purposes | Maintenance nightmare; NULL pollution | Decompose into cohesive entities |
| EAV Overuse | Generic attribute-value structure for core data | No schema enforcement; query complexity | Limit EAV to truly dynamic extensions |
| Circular Dependencies | A→B→C→A referential paths | Insert/delete complexity; deadlock risk | Re-examine relationships; break cycle |
| Premature Denormalization | Repeated data 'for performance' | Update anomalies; inconsistency | Normalize first; denormalize with metrics to justify |
| Star Schema in OLTP | Dimensional model for transactional system | Insert performance issues; update complexity | Match model to use case (OLTP vs OLAP) |
Anti-Pattern 3: Over-Normalization
Yes, there is such a thing. Breaking entities into too many pieces creates:
Example: Normalizing 'FullName' into FirstName, MiddleName, LastName, Prefix, Suffix as separate entities with relationships—when they're always accessed together.
Fix: Normalization serves data integrity. If the complexity doesn't improve integrity (no independent update scenarios), it may be over-engineering.
Anti-Pattern 4: The Inheritance Abyss
Deep specialization hierarchies: Person → Employee → SalariedEmployee → Manager → SeniorManager → ExecutiveManager → ...
Problems:
Fix: Limit hierarchy depth. Consider composition over inheritance. Question whether deep specialization reflects stable reality.
When a model feels wrong—too complex for its purpose, too rigid for business change, too hard to explain—trust that instinct. The 'smell' often indicates an anti-pattern. Stepping back and questioning assumptions often reveals a simpler, more correct design.
Beyond modeling decisions, mistakes in the modeling process lead to poor outcomes.
Process Mistake 1: Modeling in Isolation
Problem: The data modeler works alone, presenting a 'finished' model to stakeholders.
Consequences: Misses domain knowledge. Stakeholders don't understand or buy into the model. Errors discovered late.
Fix: Collaborative modeling. Work with domain experts, review incrementally, validate continuously.
Process Mistake 2: Skipping Requirements Validation
Problem: Building the model before confirming what data is actually needed.
Consequences: Missing entities for overlooked use cases; extra entities for abandoned features; mismatch with application needs.
Fix: Start from use cases and queries. 'What questions will the system answer?' drives entity identification.
Process Mistake 3: No Iteration
Problem: Treating the first draft as final.
Consequences: Early mistakes hardened before they're recognized.
Fix: Plan for multiple design iterations. First draft → Review → Revise → Prototype → Revise → Finalize.
Be wary of design shortcuts justified by 'just this once' or 'we'll fix it later.' Technical debt accumulates. If you recognize you're taking a shortcut, at minimum document it explicitly with a planned resolution. 'Temporary' workarounds have a way of becoming permanent.
Recognizing common mistakes is a form of pattern recognition. Once you've seen these anti-patterns described, you'll spot them in your own work and in models you inherit. This recognition is the first step toward prevention.
What's Next
The final page of this module provides a Review Checklist—a systematic approach to validating ER diagrams before they progress to implementation. The checklist synthesizes everything we've learned into a practical review tool.
You now have a catalog of common ER modeling mistakes across multiple dimensions. This knowledge serves as a defensive tool—a set of warning signs to watch for in your own modeling work and when reviewing others' designs.