Loading learning content...
Requirements arrive in messy forms: interview transcripts, spreadsheets, legacy system exports, whiteboard photos, and conflicting stakeholder opinions. Your task as a database designer is to transform this chaos into a precise, validated ER diagram that captures the essence of the domain.
This transformation isn't magic—it's a disciplined process that experienced designers follow. The process is iterative, collaborative, and guided by proven heuristics. Whether you're modeling a simple application or a complex enterprise system, the same fundamental steps apply.
This page teaches you the complete ER modeling methodology—from the first stakeholder interview to the final validated diagram. You'll learn not just what to do, but how to think about the modeling task.
By the end of this page, you will master a systematic process for ER modeling: gathering requirements, identifying entities through linguistic analysis, discovering relationships, assigning attributes, specifying constraints, iterating through refinement cycles, and validating with stakeholders.
Before drawing a single shape, you must understand the domain. ER modeling is fundamentally a communication exercise—you're capturing someone else's mental model of their data. The better you understand their world, the more accurate your model will be.
Critical interviewing questions:
Effective requirements gathering uses specific question types:
Definitional questions: 'What is a [term]? How would you define it?'
Relationship questions: 'How does [A] relate to [B]?'
Cardinality questions: 'How many [B] can one [A] have?'
Existence questions: 'Can [A] exist without [B]?'
Identity questions: 'How do you uniquely identify a [A]?'
After gathering requirements, try explaining the domain to a 'rubber duck' (or uninvolved colleague). If you can't clearly explain what entities exist, how they relate, and what rules govern them, you need more requirements gathering. Your ER diagram should be a visual representation of this explanation.
With requirements gathered, begin extracting entities—the major 'things' the system will track. The primary technique is linguistic analysis: systematically processing requirement statements to identify entity candidates.
| Heuristic | Description | Example |
|---|---|---|
| Noun extraction | Nouns and noun phrases often represent entities | 'Customers place orders for products' → Customer, Order, Product |
| Singular vs. plural | Convert plurals to singular—entities represent concepts, not collections | 'Products' → Product entity |
| Instance test | Ask: 'Can I have multiple instances of this? Does each have identity?' | 'Address' has multiple instances; 'System' usually doesn't |
| Independent existence | Ask: 'Does this exist independently or only in context of something else?' | 'LineItem' exists only within Order context |
| Stored vs. derived | Focus on stored data. 'Total' derived from line items isn't an entity | 'Customer' is stored; 'CustomerCount' is derived, not an entity |
| Domain relevance | Include only entities within project scope | HR system doesn't need Product entity from Sales |
Worked example—E-commerce requirements analysis:
Requirement statement: "Customers browse products organized into categories and subcategories. When ready, customers add products to a shopping cart and proceed to checkout. Each order is shipped to a delivery address and may be paid with credit cards stored on the customer's account. Customers can write reviews for products they have purchased."
Step 1—Extract nouns: Customers, products, categories, subcategories, shopping cart, checkout, order, delivery address, credit cards, account, reviews
Step 2—Apply entity tests:
| Noun | Multiple instances? | Independent? | Stored? | → Entity? |
|---|---|---|---|---|
| Customer | Yes | Yes | Yes | ✓ Entity |
| Product | Yes | Yes | Yes | ✓ Entity |
| Category | Yes | Yes | Yes | ✓ Entity |
| Subcategory | Yes | Depends on parent | Yes | ✓ Entity (possibly weak) |
| Shopping cart | Yes | Tied to customer | Maybe | ? (often session, not persisted) |
| Checkout | No—a process | No | No | ✗ Not an entity |
| Order | Yes | Yes | Yes | ✓ Entity |
| Delivery address | Yes | Tied to order/customer | Yes | ✓ Entity or attribute |
| Credit card | Yes | Tied to customer | Yes | ✓ Entity |
| Account | One per customer | Tied to customer | Maybe | ? (might merge with Customer) |
| Review | Yes | Yes | Yes | ✓ Entity |
Step 3—Refine entity list:
Don't confuse processes with entities ('Checkout', 'Login', 'Search'). Don't miss junction entities for M:N relationships. Don't over-abstract (creating 'Item' instead of distinct Product, Service types). Don't under-abstract (creating Customer, PremiumCustomer separately when inheritance applies). Validate each entity decision with domain experts.
With entities identified, discover how they connect. Relationship discovery uses verb analysis and semantic reasoning to uncover associations between entities.
Systematically determining cardinality:
For each candidate relationship, answer these two questions:
The answers determine cardinality:
| A→B | B→A | Cardinality |
|---|---|---|
| One | One | 1:1 |
| One | Many | 1:N (from B's perspective) or N:1 (from A's) |
| Many | One | N:1 (from B's) or 1:N (from A's) |
| Many | Many | M:N |
E-commerce relationship analysis:
| Relationship | A→B | B→A | Cardinality |
|---|---|---|---|
| Customer PLACES Order | One customer places many orders | One order placed by one customer | 1:N |
| Order CONTAINS Product | One order contains many products | One product appears in many orders | M:N |
| Product BELONGS_TO Category | One product belongs to one category | One category contains many products | N:1 |
| Customer WRITES Review | One customer writes many reviews | One review written by one customer | 1:N |
| Review IS_FOR Product | One review is for one product | One product has many reviews | N:1 |
M:N relationships often indicate a missing junction entity. 'Order CONTAINS Product' (M:N) suggests OrderLine as an entity with its own attributes (Quantity, UnitPrice). When you find M:N, ask: 'Does this relationship have attributes? Is there information about the association itself?' If yes, create an explicit junction entity.
Determining participation constraints:
For each relationship, assess whether participation is mandatory or optional:
Mandatory (total) participation indicators:
Optional (partial) participation indicators:
Example analysis:
| Relationship | Customer side | Order side |
|---|---|---|
| Customer PLACES Order | Optional (customer may exist without orders) | Mandatory (every order must have a customer) |
Note: This captures the business rule that you can have registered customers who haven't yet ordered, but you cannot have orphan orders.
With entities and relationships defined, assign attributes to each. Attributes describe the properties we need to store about each entity (and sometimes about relationships).
Important attribute decisions:
1. Identifying the key attribute(s):
Every entity needs one or more attributes that uniquely identify instances. Questions to ask:
2. Simple vs. composite:
Should Address be one attribute or decomposed into Street, City, State, PostalCode, Country?
3. Derived vs. stored:
Age can be computed from DateOfBirth. OrderTotal can be computed from line items.
4. Multivalued identification:
Can an entity have multiple values of an attribute? Customer may have multiple PhoneNumbers, multiple EmailAddresses.
| Entity | Key Attribute | Other Attributes | Notes |
|---|---|---|---|
| Customer | CustomerID | Name, Email, Phone*, RegisterDate |
|
| Product | ProductID | Name, Description, Price, SKU, Weight | Price may need history |
| Category | CategoryID | Name, Description, ParentCategory | ParentCategory enables hierarchy |
| Order | OrderID | OrderDate, Status, ShippingAddress, TotalAmount† | † TotalAmount is derived |
| OrderLine | OrderID + LineNumber | Quantity, UnitPrice | Weak entity of Order |
| Review | ReviewID | Rating, Text, ReviewDate | Also links to Customer, Product |
Relationships can have attributes too. In the ENROLLMENT relationship between Student and Course, attributes like Grade and EnrollmentDate describe the association, not either participant. These attributes appear on relationship diamonds in Chen notation or migrate to junction entities in crow's foot.
Beyond cardinality and participation, many business rules translate into constraints that the ER model should capture. These constraints ensure data integrity reflects real-world rules.
Capturing constraints in ER diagrams:
Standard ER notation directly expresses some constraints (cardinality, participation, keys) but not others (domain constraints, complex business rules). For constraints beyond notation's expressiveness:
Example constraint documentation:
Entity: Order
- OrderDate must not be in the future
- Status must be one of: Draft, Pending, Confirmed, Processing, Shipped, Delivered, Cancelled
- If Status = 'Shipped', ShippedDate must be non-null
- If Status = 'Delivered', DeliveredDate must be non-null and ≥ ShippedDate
Relationship: Order CONTAINS OrderLine
- Minimum cardinality on OrderLine side = 1 (order must have at least one line)
- If Order.Status ≠ 'Draft', OrderLines cannot be modified
Relationship: Customer PLACES Order
- A Customer cannot have more than 10 orders with Status = 'Pending' simultaneously
Many constraints cannot be captured in diagrams alone. Don't lose them! Maintain a parallel constraint specification document that accompanies your ER diagram. These constraints will drive CHECK constraints, triggers, and application logic in the physical implementation.
With all conceptual elements defined, construct the ER diagram. Diagram construction is partly mechanical (applying notation rules) and partly artistic (creating a readable, well-organized visual).
Layered construction approach:
Layer 1 – Entities only:
Layer 2 – Relationships:
Layer 3 – Cardinality and participation:
Layer 4 – Attributes:
Layer 5 – Refinement:
Tools for ER diagram construction:
Large models become unreadable in a single diagram. Split into subject areas (Customer domain, Inventory domain, Fulfillment domain). Create an overview diagram showing entity boxes only (no attributes) with relationships, then detailed diagrams for each subject area. Always ensure navigability between diagrams.
No ER diagram is correct on the first attempt. Validation catches errors before they become expensive implementation problems. Iteration incorporates feedback to refine the model.
Query validation—the ultimate test:
The most powerful validation technique is to test queries against the model. For each important question the system must answer, trace how you would retrieve the data:
Example query validation for e-commerce:
Query: 'Show all orders placed by customers who have written at least 5 reviews.'
Tracing:
✓ Model supports this query: Customer connects to both Review and Order.
Query: 'Show products that were both purchased and reviewed by the same customer.'
Tracing:
✓ Model supports this query: Both paths from Customer to Product exist.
If any critical query cannot be traced through the model, something is missing—an entity, relationship, or attribute needs to be added.
Professional database designers expect 3-5 major iterations of an ER diagram before it stabilizes. Each stakeholder review, each query validation, each new requirement reveals refinements. Budget time for iteration—it's not wasted effort, it's where quality comes from.
Common iteration triggers:
You now have a complete methodology for creating ER diagrams from requirements. This systematic process transforms messy domain knowledge into precise conceptual models.
What's Next:
With the modeling process mastered, we'll conclude this module with an exploration of the Benefits of ER Modeling. You'll understand why this methodology has endured for nearly 50 years and how it delivers value across the database design lifecycle.
You now possess a complete, professional-grade methodology for ER modeling. From requirements gathering through validation, you know what to do at each step and how to think about the modeling task. This process, applied consistently, produces accurate, complete, and validated conceptual models that serve as the foundation for successful database implementations.