Conceptual Design - Learning Module


Initial Schema: Assembling the Complete Conceptual Model

From Pieces to Picture

Throughout this module, we've developed the individual skills of conceptual design—understanding high-level modeling, constructing ER diagrams, identifying entities, and discovering relationships. Now it's time to bring these skills together to create an initial schema: a complete, coherent conceptual model ready for validation and refinement.

The initial schema is not a final artifact frozen in time. It's the first complete draft—good enough to review with stakeholders, detailed enough to identify gaps, and structured enough to transition into logical design. Creating this schema requires synthesis: combining entities, relationships, attributes, and constraints into a unified whole.

This page guides you through the assembly process, provides validation techniques, discusses documentation practices, and prepares you for the next phase of database design.

What You Will Learn

By the end of this page, you will be able to assemble a complete initial schema from your conceptual design work, validate the schema for completeness and consistency, document decisions and assumptions, and prepare the model for transition to logical design. You'll understand what makes a schema 'good enough' for review and how to iterate based on feedback.

What is an Initial Schema?

An initial schema is the first complete version of your conceptual data model. It integrates all the pieces you've developed:

All identified entities with their attributes
All discovered relationships with cardinality and participation
Key constraints (primary keys, unique constraints)
Known business rules and domain constraints
Any specializations or advanced constructs (EER elements)

Initial ≠ Final

The word "initial" is deliberate. This schema will evolve:

Stakeholder review will reveal misunderstandings
Normalization during logical design may split entities
Physical design considerations may suggest changes
Requirements themselves may change

The goal isn't perfection—it's a solid foundation for iteration.

What Makes a Schema 'Complete Enough'

An initial schema should be:

Comprehensive: Covers all identified requirements; no known gaps
Consistent: No contradictory elements or naming conflicts
Documentable: Decisions can be explained and justified
Reviewable: Clear enough for stakeholders to understand and critique
Evolvable: Structured to accommodate likely changes

It doesn't need to be:

Implementation-ready (that's logical/physical design)
Performance-optimized (that's premature at this stage)
Tooling-specific (notation should be transferable)

Initial Schema Checklist
Element	Required for Initial Schema	Can Wait for Later
All entities identified	✓ Yes	—
All relationships mapped	✓ Yes	—
Primary keys defined	✓ Yes	—
Cardinality specified	✓ Yes	—
Participation constraints	✓ Yes	—
Entity attributes listed	✓ Yes	—
Attribute data types	Optional	✓ Logical design
Foreign key details	Optional	✓ Logical design
Index design	No	✓ Physical design
Storage parameters	No	✓ Physical design

Aim for 80% Confidence

The initial schema should represent your best current understanding. If you're 80% confident it's correct, that's good enough to enter review. The remaining 20% will surface during validation. Waiting for 100% confidence means waiting forever—requirements change, understanding deepens, and perfection is impossible.

Assembling the Initial Schema

Schema assembly brings together all your conceptual design work. Follow this systematic process:

Step 1: Consolidate Entity Inventory

Create a master list of all entities:

Collect all entity candidates from your discovery work
Eliminate duplicates (same concept, different names)
Resolve naming inconsistencies (choose canonical names)
Classify entities: strong, weak, or associative (junction)
List each entity with its primary key attribute(s)

Result: A definitive list like:

Customer (CustomerID)
Order (OrderNumber)
Product (ProductSKU)
OrderLine (OrderNumber + LineNumber) — weak entity
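
As a rough illustration, the consolidation step can be sketched in Python. The `Entity` class, the `CANONICAL_NAMES` synonym map, and the `consolidate` helper are assumptions made for this example, and the synonyms Client and Purchase are invented; this is a sketch of the idea, not a prescribed tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    key: tuple                # primary key attribute(s)
    kind: str = "strong"      # "strong", "weak", or "associative"


# Hypothetical synonym map built while eliminating duplicate names.
CANONICAL_NAMES = {"Client": "Customer", "Purchase": "Order"}


def consolidate(candidates):
    """Merge candidates under canonical names; reject conflicting definitions."""
    inventory = {}
    for candidate in candidates:
        name = CANONICAL_NAMES.get(candidate.name, candidate.name)
        entity = Entity(name, candidate.key, candidate.kind)
        existing = inventory.get(name)
        if existing is not None and existing != entity:
            raise ValueError(f"conflicting definitions for {name}")
        inventory[name] = entity
    return inventory


candidates = [
    Entity("Customer", ("CustomerID",)),
    Entity("Client", ("CustomerID",)),  # same concept, different name
    Entity("Order", ("OrderNumber",)),
    Entity("Product", ("ProductSKU",)),
    Entity("OrderLine", ("OrderNumber", "LineNumber"), kind="weak"),
]

inventory = consolidate(candidates)
for entity in inventory.values():
    print(f"{entity.name} ({' + '.join(entity.key)}) [{entity.kind}]")
```

Raising an error on conflicting definitions is one possible design choice: it forces a naming or key disagreement to be resolved explicitly rather than silently overwritten.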

Step 2: Complete Attribute Specification

For each entity, list all attributes:

Gather attributes from discovery notes
Identify key attributes (underline in diagram)
Mark composite attributes with their components
Mark multivalued attributes
Mark derived attributes
Note any domain constraints (e.g., Status must be one of: Active, Inactive, Pending)

At this stage, you may note general types (text, number, date) but formal data types wait for logical design.
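
One way to keep these markings together is a small record per attribute. The sketch below is only an illustration: the `Attribute` class and its field names are assumptions, and the Address, PhoneNumber, and AccountAge attributes are invented to show the composite, multivalued, and derived cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    general_type: str = "text"   # text, number, or date; formal types come later
    is_key: bool = False
    is_multivalued: bool = False
    is_derived: bool = False
    components: tuple = ()       # parts of a composite attribute
    allowed_values: tuple = ()   # domain constraint, if any

    def markers(self):
        """Return the ER markings that apply to this attribute."""
        marks = []
        if self.is_key:
            marks.append("key")
        if self.components:
            marks.append("composite")
        if self.is_multivalued:
            marks.append("multivalued")
        if self.is_derived:
            marks.append("derived")
        if self.allowed_values:
            marks.append("domain-constrained")
        return marks


customer_attributes = [
    Attribute("CustomerID", "number", is_key=True),
    Attribute("Name"),
    Attribute("Address", components=("Street", "City", "PostalCode")),
    Attribute("PhoneNumber", is_multivalued=True),
    Attribute("RegistrationDate", "date"),
    Attribute("AccountAge", "number", is_derived=True),  # from RegistrationDate
    Attribute("Status", allowed_values=("Active", "Inactive", "Pending")),
]

for attribute in customer_attributes:
    marks = ", ".join(attribute.markers()) or "simple"
    print(f"{attribute.name} ({attribute.general_type}): {marks}")
```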

Schema Assembly Checklist

•Entity Inventory — All entities listed with canonical names and keys
•Attribute Completeness — Every entity has its attributes; key attributes marked
•Relationship Inventory — All relationships named with both connected entities
•Cardinality Specification — 1:1, 1:N, M:N specified for every relationship
•Participation Constraints — Total/partial marked for each entity in each relationship
•Relationship Attributes — Attributes belonging to relationships identified
•Weak Entity Identification — Weak entities and their identifying relationships marked
•Specializations — Any EER hierarchies documented with constraints (disjoint/overlapping, total/partial)

Step 3: Map All Relationships

Consolidate your relationship discoveries:

List every relationship with its participating entities
Specify cardinality from both sides
Mark participation (total or partial) for each entity
Document relationship attributes if any
Note any constraints on relationships

Result should look like:

Customer (partial) —places(0,N)→ Order (total): Customer may have zero or more orders; each Order must have exactly one Customer
Order —contains— OrderLine: identifying relationship (Order is owner, OrderLine is weak)
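
The relationship inventory can also be recorded in a structured form. The following sketch assumes (min,max) notation, where the pair next to an entity says how many relationship instances each of its instances takes part in. The `Role` and `Relationship` classes are hypothetical, and the (1,N) bound on Order in "contains" is an assumption that every order has at least one line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    entity: str
    min_card: int             # 0 = partial participation, 1 or more = total
    max_card: Optional[int]   # None stands for "many" (N)

    @property
    def participation(self):
        return "total" if self.min_card >= 1 else "partial"

    @property
    def bounds(self):
        upper = "N" if self.max_card is None else str(self.max_card)
        return f"({self.min_card},{upper})"


@dataclass(frozen=True)
class Relationship:
    name: str
    left: Role
    right: Role
    identifying: bool = False

    def describe(self):
        kind = " [identifying]" if self.identifying else ""
        return (
            f"{self.left.entity} {self.left.bounds} {self.left.participation} "
            f"-{self.name}- "
            f"{self.right.entity} {self.right.bounds} {self.right.participation}{kind}"
        )


places = Relationship("places", Role("Customer", 0, None), Role("Order", 1, 1))
contains = Relationship(
    "contains", Role("Order", 1, None), Role("OrderLine", 1, 1), identifying=True
)

for relationship in (places, contains):
    print(relationship.describe())
```

Deriving participation from the minimum cardinality keeps the two from drifting apart, which is one of the inconsistencies validation would otherwise have to catch.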

Step 4: Draw the Complete Diagram

With all components identified, create the full ER diagram:

Place entities on the diagram with meaningful spatial organization
Add relationships connecting entities
Annotate cardinality and participation
Add attribute ovals (or list inside entities for compact notation)
Mark special constructs (weak entities, specializations)
Ensure no components are orphaned

Step 5: Add Constraint Documentation

Capture constraints that don't fit diagram notation:

Business rules: "An order can only be shipped after payment is confirmed"
Derived calculations: "OrderTotal = SUM of (LineQuantity × LinePrice)"
Cross-entity constraints: "Manager must be an employee of the same department"
Temporal constraints: "EndDate must be >= StartDate"

These go in supplementary documentation alongside the diagram.
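
Some teams find it useful to restate such rules as small executable checks so they can be tested against sample data later. The sketch below expresses three of the rules above; the function and field names are assumptions chosen for the example, not part of the model itself.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class OrderLine:
    line_quantity: int
    line_price: Decimal


def order_total(lines):
    """Derived calculation: OrderTotal = SUM of (LineQuantity x LinePrice)."""
    return sum((line.line_quantity * line.line_price for line in lines), Decimal("0"))


def period_is_valid(start_date, end_date):
    """Temporal constraint: EndDate must be >= StartDate."""
    return end_date >= start_date


def may_ship(payment_confirmed):
    """Business rule: an order can only be shipped after payment is confirmed."""
    return payment_confirmed


lines = [OrderLine(2, Decimal("9.99")), OrderLine(1, Decimal("5.00"))]
print(order_total(lines))
print(period_is_valid(date(2024, 1, 15), date(2024, 1, 10)))
print(may_ship(payment_confirmed=False))
```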

Diagram Organization Matters

A well-organized diagram communicates better. Group related entities spatially. Place central entities in the middle. Keep relationship lines as short and uncrossed as possible. Use consistent notation throughout. A messy diagram obscures understanding even if technically correct.

Validating the Initial Schema

Before presenting the schema for review, perform systematic validation to catch issues you can identify yourself.

Validation 1: Requirements Traceability

Every requirement should map to the schema:

List all data requirements from your requirements document
For each requirement, identify which entity/attribute/relationship supports it
Flag requirements with no clear schema support → model gaps
Flag schema elements with no clear requirement → potential over-engineering

Example traceability:

Requirement	Schema Element	Status
Track customer names	Customer.Name	✓ Covered
Store order history	Order entity + Customer relationship	✓ Covered
Calculate monthly revenue	Derived from Order.Total	❓ Verify derivable
Track product reviews	None	❌ Gap identified
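
The two flagging steps can be automated once the matrix exists in any structured form. This is a minimal sketch: the dictionary layout and helper names are assumptions, and `Product.Weight` is an invented element included only to show an untraced item.

```python
# Requirement -> schema elements that support it (empty list = no support yet).
traceability = {
    "Track customer names": ["Customer.Name"],
    "Store order history": ["Order", "Customer-places-Order"],
    "Calculate monthly revenue": ["Order.Total"],
    "Track product reviews": [],
}

# Every element currently in the model; Product.Weight is hypothetical.
schema_elements = {
    "Customer.Name",
    "Order",
    "Customer-places-Order",
    "Order.Total",
    "Product.Weight",
}


def find_gaps(traceability):
    """Requirements with no supporting schema element (model gaps)."""
    return [req for req, elements in traceability.items() if not elements]


def find_untraced(traceability, schema_elements):
    """Schema elements no requirement refers to (possible over-engineering)."""
    used = {element for elements in traceability.values() for element in elements}
    return sorted(schema_elements - used)


print("Gaps:", find_gaps(traceability))
print("Untraced:", find_untraced(traceability, schema_elements))
```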

Validation 2: Query Walkthrough

For key queries stakeholders need, verify the model can answer them:

List expected queries (from interviews, reports, dashboards)
For each query, trace a path through the model
Verify needed attributes exist along the path
Ensure cardinalities allow the expected results

Example:

Query: "List all orders for a customer with product details"
Path: Customer → Order → OrderLine → Product
Verification: Path exists with appropriate cardinalities ✓
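
Tracing a path can be treated as a graph search over the relationship inventory. The sketch below uses a breadth-first search; the `find_path` helper and the edge list are assumptions for illustration, and the search checks only that a path exists, not that attributes or cardinalities along it are adequate.

```python
from collections import deque

# Each pair is a relationship between two entities in the model.
relationships = [
    ("Customer", "Order"),
    ("Order", "OrderLine"),
    ("OrderLine", "Product"),
]


def find_path(start, goal, edges):
    """Return the shortest entity path from start to goal, or None."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, []).append(right)
        adjacency.setdefault(right, []).append(left)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path("Customer", "Product", relationships))
print(find_path("Customer", "Review", relationships))
```

A `None` result for a query stakeholders expect to run points to a missing entity or relationship.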

Internal Consistency Checks

•Every entity has a primary key
•Every relationship connects existing entities
•Cardinality is specified on both sides
•Weak entities have identifying relationships
•No duplicate entity or relationship names
•Specializations are properly constrained
•Relationship attributes belong to relationships, not entities
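
Several of these checks are mechanical and can be scripted. The following sketch covers four of them (primary keys, existing entities, weak-entity identification, duplicate relationship names). The dictionary layout is an assumption, and the sample data deliberately contains two defects so the checker has something to report.

```python
entities = {
    "Customer": {"key": ["CustomerID"], "weak": False},
    "Order": {"key": ["OrderNumber"], "weak": False},
    "OrderLine": {"key": ["OrderNumber", "LineNumber"], "weak": True},
    "Product": {"key": [], "weak": False},  # defect: no primary key
}

relationships = [
    {"name": "places", "ends": ("Customer", "Order"), "identifying": False},
    {"name": "contains", "ends": ("Order", "OrderLine"), "identifying": True},
    {"name": "reviews", "ends": ("Customer", "Review"), "identifying": False},  # defect
]


def check_consistency(entities, relationships):
    """Return a list of human-readable consistency problems."""
    problems = []
    for name, spec in entities.items():
        if not spec["key"]:
            problems.append(f"{name}: no primary key")
        if spec["weak"] and not any(
            rel["identifying"] and name in rel["ends"] for rel in relationships
        ):
            problems.append(f"{name}: weak entity without identifying relationship")

    seen_names = set()
    for rel in relationships:
        if rel["name"] in seen_names:
            problems.append(f"{rel['name']}: duplicate relationship name")
        seen_names.add(rel["name"])
        for end in rel["ends"]:
            if end not in entities:
                problems.append(f"{rel['name']}: unknown entity {end}")
    return problems


for problem in check_consistency(entities, relationships):
    print(problem)
```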

Domain Accuracy Checks

•Entity names match domain vocabulary
•Cardinalities reflect business reality
•Participation constraints match requirements
•Business rules are captured or documented
•Temporal aspects handled appropriately
•No implementation details present
•Model can represent all valid real-world states

Validation 3: Instance Testing

Create sample instances to verify the model handles real data:

Invent a realistic scenario with multiple entity instances
Draw or list instance values
Verify relationships can be established as expected
Check cardinality constraints hold
Ensure no valid scenarios are blocked

Example:

Customer C1 places Orders O1, O2
Order O1 contains Products P1, P2
Order O2 contains Products P1, P3
Does the model allow this? (Should be yes for typical order system)
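
The same scenario can be checked mechanically by counting how often each instance appears in a relationship. This is a sketch under simple assumptions: relationship instances are plain pairs, and the hypothetical `over_limit` helper tests only a maximum-cardinality bound.

```python
from collections import Counter

# Sample instances from the scenario above.
places = [("C1", "O1"), ("C1", "O2")]  # (customer, order)
contains = [("O1", "P1"), ("O1", "P2"), ("O2", "P1"), ("O2", "P3")]  # (order, product)


def over_limit(pairs, position, max_allowed=1):
    """Return ids at `position` that appear in more than `max_allowed` pairs."""
    counts = Counter(pair[position] for pair in pairs)
    return sorted(key for key, count in counts.items() if count > max_allowed)


# Each order belongs to at most one customer, so no order id may repeat.
print(over_limit(places, position=1))

# If Order-Product were drawn as 1:N (a product in at most one order),
# P1 would break the constraint, so this scenario needs M:N.
print(over_limit(contains, position=1))
```

An empty list means the sample data fits the stated cardinality; any id returned is either bad test data or evidence that the cardinality in the model is too strict.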

Validation 4: Edge Case Analysis

Consider edge cases and boundary conditions:

New customer with no orders yet → model allows (partial participation)?
Order with zero line items → should model allow? (probably no—total participation)
Employee with no manager (CEO) → model allows?
Product in no orders → allowed (partial participation)?

Edge cases often reveal participation constraint issues.

Validation 5: Semantic Accuracy

Verify the model captures meaning, not just structure:

Can the model distinguish valid from invalid states?
Are there states the model allows that shouldn't be possible?
Are there valid states the model cannot represent?

Example: If an order can only have one shipping address, but the model shows M:N between Order and Address, invalid states are possible.

Find Problems Before Stakeholders Do

Self-validation before review builds credibility. If stakeholders find basic issues you should have caught—missing obvious entities, broken query paths, wrong cardinalities—they lose confidence in the model. Thorough self-validation elevates review discussions to genuine domain questions.

Stakeholder Review Process

The initial schema must be validated by stakeholders who understand the domain. They'll catch issues you can't—because you don't know the business as deeply as they do.

Preparing for Review

Create presentation materials: Large, clear diagrams. Avoid dense, technical drawings. Consider separate views for different audiences.
Write a glossary: Define every entity in business terms. "Customer: An individual or organization who has registered to make purchases."
Prepare walkthrough scenarios: Real examples stakeholders can follow. "Let's trace what happens when customer Jane Doe places an order..."
List assumptions and decisions: Document where you made judgment calls. "We assumed employees can only belong to one department. Is this correct?"
Identify open questions: What couldn't you determine independently? "Does a product need to be categorized before it can be sold?"

Running the Review

Effective review sessions:

Start with context: Remind stakeholders of the purpose. "This model represents the data for the order management system."
Walk through, don't lecture: Step through scenarios, asking for confirmation. "So a customer places multiple orders over time—is that right?"
Encourage questions: Silence isn't agreement. Probe: "Does anything look unexpected or missing?"
Focus on semantics, not notation: Non-technical stakeholders may struggle with ER notation. Translate: "This line means one customer can have many orders."

Review Questions to Ask Stakeholders

•"Does this list of entities capture all the things you track?"
•"For [Entity X], are these all the pieces of information you need?"
•"Can a [Entity A] be associated with multiple [Entity B]s, or just one?"
•"Is it possible for a [Entity A] to exist without any [Entity B]?"
•"Are there any business rules we haven't captured here?"
•"Looking at this diagram, can you see anything missing that you regularly work with?"
•"If we built the system exactly as shown, what wouldn't work?"

Handling Feedback

Review feedback falls into categories:

Corrections: The model is wrong.

"Actually, customers can have multiple shipping addresses."
Action: Fix the model.

Additions: Something is missing.

"We also need to track promotional codes on orders."
Action: Add to model if in scope; clarify scope if not.

Clarifications: The model is unclear.

"I don't understand what 'OrderLine' means."
Action: Improve naming or documentation.

Scope Questions: Uncertainty about whether something belongs.

"Should we include supplier information too?"
Action: Discuss with project stakeholders to confirm scope.

Future Requirements: Things that will be needed but aren't in current scope.

"Eventually, we'll need subscription handling."
Action: Note for future; don't add to current scope unless directed.

Iterating After Review

Review typically requires model updates:

Incorporate corrections immediately
Add confirmed in-scope additions
Update documentation for clarifications
Re-validate affected areas
Schedule follow-up review if changes are substantial

Conceptual designs commonly go through 2-3 review iterations before stabilizing.

Different Stakeholders, Different Views

Don't show the entire complex model to everyone. Operations staff care about transactions; they don't need to see HR entities. Create focused views for different audiences. The full model exists, but reviews use subsets relevant to each stakeholder group.

Documenting the Initial Schema

A diagram alone doesn't constitute complete documentation. The initial schema should be documented thoroughly for future reference, team communication, and design continuity.

Essential Documentation Components

1. Entity Catalog

For each entity, document:

Name: Canonical entity name
Description: What this entity represents in business terms
Primary Key: Identifying attribute(s)
Attributes: Full list with descriptions
Constraints: Any entity-level rules
Notes: Decisions, assumptions, gotchas

Example:

Entity: Customer
Description: An individual or organization registered to make purchases
Primary Key: CustomerID
Attributes:
  - CustomerID: Unique identifier for the customer
  - Name: Full name of individual or organization name
  - Email: Primary contact email (unique)
  - RegistrationDate: Date customer account created
  - Status: Active, Inactive, or Suspended
Constraints:
  - Email must be unique across customers
  - Status must be one of the defined values
Notes: We treat individuals and organizations as the same entity type.
       Consider splitting if different attributes are needed.

2. Relationship Catalog

For each relationship, document:

Name: Relationship name
Entities: Participating entities with roles
Cardinality: From each entity's perspective
Participation: Mandatory or optional for each entity
Attributes: Any relationship attributes
Description: What this relationship means

Documentation Component Summary
Component	Purpose	Audience	Format
ER Diagram	Visual overview of model	All stakeholders	Drawing/diagram tool export
Entity Catalog	Detailed entity definitions	Designers, developers	Structured document/wiki
Relationship Catalog	Relationship specifications	Designers, developers	Structured document/wiki
Glossary	Term definitions	All stakeholders	Alphabetical list
Assumptions Log	Documented decisions	Future maintainers	Dated log entries
Requirements Traceability	Requirements to model mapping	Project managers, QA	Matrix/table
Open Issues	Unresolved questions	Project team	Issue tracking system

3. Glossary

Define all terms used in the model:

Entity names
Attribute names
Relationship names
Domain-specific terminology

A glossary ensures everyone interprets terms consistently. When a new team member asks "What's an SKU?", the glossary answers.

4. Assumptions and Decisions Log

Record the reasoning behind decisions:

Date: 2024-01-15
Decision: Individuals and organizations are modeled as a single Customer entity
Rationale: Current requirements don't distinguish them;
           attributes are identical; simplifies model.
Revisit if: Business needs different handling for B2B vs B2C.
           Consider generalization/specialization.

This log prevents revisiting settled issues and explains why the model looks the way it does.

5. Known Limitations

Document what the model doesn't do:

"This model tracks current state only; historical employee-department assignments are not tracked."
"Product variants (sizes, colors) are not modeled; each variant is a separate Product."
"Multi-currency pricing is out of scope."

Clear limitations prevent incorrect expectations.

Documentation Practices

Keep it current: Outdated documentation is worse than none—it misleads.
Store with the design: Documentation should be versioned alongside the model.
Use accessible formats: Wikis, markdown files, or collaborative docs that team members can access.
Link to source: Reference requirements documents, interview notes, and review meeting notes.

Documentation Enables Handoff

Good documentation means the conceptual design can be understood by someone who wasn't present for its creation. This is crucial for team scaling, personnel changes, and long-term maintenance. The diagram shows what; the documentation explains why.

Quality Criteria for Initial Schemas

How do you know if your initial schema is good? Beyond basic correctness, quality schemas exhibit several characteristics that distinguish professional work from amateur attempts.

Criterion 1: Semantic Richness

The model captures meaning, not just structure:

Relationship names convey domain semantics ("purchases" vs. generic "has")
Cardinalities reflect actual business rules, not defaults
Constraints capture domain restrictions
The model distinguishes valid from invalid real-world states

A semantically rich model can be read as a description of the domain. Someone unfamiliar with the business should understand it from the model.

Criterion 2: Minimal Redundancy

Each fact is represented once:

No attribute appears in multiple entities unless it is a key attribute referencing a related entity (for example, OrderNumber within OrderLine's identifier)
No relationship path is duplicated
Derived data is marked as derived, not stored

Redundancy in conceptual models leads to redundancy in databases, causing update anomalies.

Criterion 3: Appropriate Abstraction

The model operates at the right level:

High enough to be technology-independent
Low enough to capture essential detail
No implementation artifacts (foreign key columns, index hints)
No application artifacts (screen fields, button actions)

Quality Evaluation Rubric

•Completeness (10 pts): All requirements traceable to model elements; no known gaps
•Accuracy (10 pts): Cardinalities, participations, and constraints match domain reality
•Semantic Richness (10 pts): Names are meaningful; model conveys domain understanding
•Minimal Redundancy (10 pts): No duplicate facts; derived data marked appropriately
•Abstraction Level (10 pts): Technology-independent; no implementation details
•Consistency (10 pts): Naming conventions followed; no contradictions
•Documentation (10 pts): Entities, relationships, assumptions documented
•Stakeholder Validation (10 pts): Reviewed and accepted by domain experts
•Evolvability (10 pts): Structure accommodates likely future changes
•Visual Clarity (10 pts): Diagram is readable, organized, and annotated

Criterion 4: Consistency

The model is internally coherent:

Naming conventions are uniform (all singular, all PascalCase, etc.)
No contradictory constraints (for example, A requires B while B excludes A)
Notation is consistent throughout
Similar concepts are modeled similarly

Criterion 5: Evolvability

The model can grow:

Adding new entities doesn't require restructuring existing ones
Common extension points are designed in (e.g., status codes as entities allow adding new statuses)
Over-specialized structures are avoided where generalization may be needed

Criterion 6: Visual Clarity

The diagram communicates effectively:

Related entities are grouped spatially
Relationship lines don't cross unnecessarily
Notation is legible at review-presentation size
Colors or groupings (if used) convey information

Self-Assessment

Before presenting for review, rate your schema against these criteria:

Below 70 points: More work needed
70-84 points: Ready for review with known issues
85+ points: High confidence, ready for formal review
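
The arithmetic behind the self-assessment is a simple sum over the ten rubric criteria, each worth up to 10 points. The sketch below assumes a score of exactly 85 falls in the top band; the `assess` helper and the sample scores are invented for illustration.

```python
RUBRIC = [
    "Completeness",
    "Accuracy",
    "Semantic Richness",
    "Minimal Redundancy",
    "Abstraction Level",
    "Consistency",
    "Documentation",
    "Stakeholder Validation",
    "Evolvability",
    "Visual Clarity",
]
MAX_PER_CRITERION = 10


def assess(scores):
    """Sum rubric scores and map the total to a readiness band."""
    missing = [criterion for criterion in RUBRIC if criterion not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    for criterion in RUBRIC:
        if not 0 <= scores[criterion] <= MAX_PER_CRITERION:
            raise ValueError(f"{criterion}: score must be 0-{MAX_PER_CRITERION}")

    total = sum(scores[criterion] for criterion in RUBRIC)
    if total < 70:
        verdict = "More work needed"
    elif total < 85:
        verdict = "Ready for review with known issues"
    else:
        verdict = "High confidence, ready for formal review"
    return total, verdict


# Hypothetical self-ratings, listed in rubric order.
sample_scores = dict(zip(RUBRIC, [8, 7, 8, 9, 9, 8, 6, 5, 7, 8]))
print(assess(sample_scores))
```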

Quality Over Speed

Time invested in conceptual design quality pays off many times over. A high-quality initial schema accelerates logical design, reduces implementation rework, and prevents production data problems. Never rush conceptual design to start coding sooner—the technical debt will exceed any time 'saved.'

Preparing for Logical Design

The initial schema will be transformed into a logical schema—typically a relational schema ready for implementation. Preparing for this transition ensures smooth continuity.

What Logical Design Will Do

Logical design transforms the conceptual model:

Entities → Tables: Each entity becomes one or more tables
Attributes → Columns: With specific data types
Relationships → Foreign Keys/Junction Tables: Implementing cardinality
Multivalued Attributes → Separate Tables: Normalization
Composite Attributes → Columns or Decomposition: Design choice
Specializations → Table Strategies: Single table, multiple tables, or combined

Information to Provide for Logical Design

To facilitate transition, ensure your conceptual design provides:

Clear Primary Keys: Every entity needs a clear identifier. Natural keys are preferred at the conceptual level; surrogates may be added during logical design.
Cardinality and Participation: These determine foreign key placement and NULL constraints.
Weak Entity Indicators: Weak entities become tables with composite keys.
Relationship Attributes: These move to junction tables for M:N relationships.
Specialization Constraints: Disjoint/overlapping and total/partial guide table strategy.
Domain Constraints: Valid values for attributes guide CHECK constraints and lookup tables.

Conceptual to Logical Mapping Preview
Conceptual Element	Logical Representation	Considerations
Strong Entity	Table with primary key	Straightforward mapping
Weak Entity	Table with composite PK including owner's key	Foreign key to owner with ON DELETE CASCADE
1:1 Relationship	Foreign key in one table (usually total participation side)	Choose which side holds FK
1:N Relationship	Foreign key in 'N' side table	Standard pattern
M:N Relationship	Junction table with two foreign keys	Junction table may have additional columns for relationship attributes
Multivalued Attribute	Separate table with FK to owner	One row per value
Composite Attribute	Multiple columns or flattened	Design choice based on usage
Specialization	Single table, separate tables, or combined	Depends on overlap and coverage
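
To make the preview concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. It maps the running example under stated assumptions: the table and column names are invented, the general types are placeholders, and SQLite is used only because it needs no setup. It shows a 1:N foreign key on the "N" side and a weak entity whose composite key includes the owner's key with ON DELETE CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- 1:N: the foreign key sits on the 'N' side; NOT NULL reflects total participation.
    CREATE TABLE customer_order (
        order_number INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
    );
    CREATE TABLE product (
        product_sku TEXT PRIMARY KEY
    );
    -- Weak entity: composite key includes the owner's key; lines go when the order goes.
    CREATE TABLE order_line (
        order_number INTEGER NOT NULL
            REFERENCES customer_order(order_number) ON DELETE CASCADE,
        line_number  INTEGER NOT NULL,
        product_sku  TEXT NOT NULL REFERENCES product(product_sku),
        PRIMARY KEY (order_number, line_number)
    );
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'Jane Doe')")
conn.execute("INSERT INTO customer_order VALUES (100, 1)")
conn.execute("INSERT INTO product VALUES ('P1')")
conn.execute("INSERT INTO order_line VALUES (100, 1, 'P1')")

# Deleting the owner removes its dependent weak-entity rows.
conn.execute("DELETE FROM customer_order WHERE order_number = 100")
remaining = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
print(remaining)
```

Other database systems express the same ideas with their own type names and options, which is why these details are deferred to logical design.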

Anticipating Normalization

Logical design includes normalization—ensuring the design avoids update anomalies. While normalization is a logical design activity, you can anticipate issues:

Repeated groups of attributes: May indicate a missing entity (normalize to a separate table)
Transitive dependencies: Attribute depends on non-key attribute (may need decomposition)
Composite attributes stored as single value: May complicate querying

A well-constructed conceptual model rarely needs significant restructuring during normalization—these issues are usually avoided by good entity identification.

Handoff Checklist

Before transitioning to logical design, verify:

All entities have primary keys identified
All relationships have cardinality specified
All relationships have participation constraints
Weak entities are marked with identifying relationships
Multivalued attributes are identified
Composite attributes have components listed
Specializations have constraints (disjoint/overlap, total/partial)
Business rules requiring CHECK constraints are documented
Any open questions are documented for resolution during logical design

Transition Meeting

If different people do conceptual and logical design, hold a handoff meeting:

Walk through the complete model
Explain assumptions and decisions
Highlight any areas of uncertainty
Identify open issues requiring resolution
Agree on documentation access and version control

Conceptual Design Is Not Wasted

Some worry that conceptual design is 'extra work' that logical design duplicates. It isn't. Conceptual design enables faster, more accurate logical design. It provides reviewed, validated requirements mapping. It documents decisions. It creates shared understanding. Far from duplicating effort, conceptual design reduces total project effort.

Summary: Conceptual Design Mastery

We've completed the conceptual design journey, from understanding high-level modeling principles through creating a validated initial schema. Let's consolidate what the module covered:

Module Key Takeaways

•High-level modeling captures data semantics — Technology-independent representation focusing on meaning, not implementation.
•ER diagrams are the primary visual tool — Entities, relationships, and attributes in a notation understandable to technical and business stakeholders.
•Entity identification requires systematic techniques — Noun analysis, form mining, process analysis; distinguishing entities from attributes.
•Relationship specification is critical — Cardinality, participation, naming; special types like recursive and ternary relationships.
•Initial schemas bring it together — Complete, documented, validated models ready for stakeholder review.
•Validation is multi-layered — Requirements traceability, query walkthroughs, instance testing, edge case analysis.
•Documentation enables continuity — Entity and relationship catalogs, glossaries, assumptions logs.
•Quality criteria guide assessment — Completeness, accuracy, semantic richness, minimal redundancy, readiness for logical design.

The Conceptual Design Discipline

Conceptual design is both an art and a discipline. The art lies in understanding domains, recognizing patterns, and making judgment calls when information is ambiguous. The discipline lies in systematic techniques, thorough validation, and rigorous documentation.

The best database designers combine both—creative insight guided by methodical process. They produce models that are accurate, complete, and elegant; models that stakeholders recognize as capturing their domain; models that serve as solid foundations for years of system evolution.

What's Next in the Database Design Journey

With conceptual design complete, the next phase is logical design—transforming the conceptual model into a schema for a specific data model paradigm (typically relational). Logical design includes:

Mapping ER elements to relational tables
Applying normalization to eliminate redundancy
Specifying integrity constraints
Refining the design for specific database capabilities

The conceptual model you've created makes logical design faster and more accurate. Instead of guessing at table structures, the logical designer transforms a validated, documented conceptual design systematically.

Congratulations on Completing This Module

You now possess comprehensive skills in conceptual database design. You can analyze domains, construct ER diagrams, identify entities and relationships, create initial schemas, and validate your work to professional standards. These skills form the foundation of all database design work.

Module Complete

Congratulations! You've completed Module 3: Conceptual Design. You've mastered high-level modeling principles, ER diagram construction, and entity and relationship identification, and you've created validated initial schemas. You're now prepared to move forward with logical and physical database design, building upon this solid conceptual foundation.

Initial Schema: Assembling the Complete Conceptual Model

From Pieces to Picture

This page guides you through the assembly process, provides validation techniques, discusses documentation practices, and prepares you for the next phase of database design.

What You Will Learn

What is an Initial Schema?

An initial schema is the first complete version of your conceptual data model. It integrates all the pieces you've developed:

All identified entities with their attributes
All discovered relationships with cardinality and participation
Key constraints (primary keys, unique constraints)
Known business rules and domain constraints
Any specializations or advanced constructs (EER elements)

Initial ≠ Final

The word "initial" is deliberate. This schema will evolve:

Stakeholder review will reveal misunderstandings
Normalization during logical design may split entities
Physical design considerations may suggest changes
Requirements themselves may change

The goal isn't perfection—it's a solid foundation for iteration.

What Makes a Schema 'Complete Enough'

An initial schema should be:

Comprehensive: Covers all identified requirements; no known gaps
Consistent: No contradictory elements or naming conflicts
Documentable: Decisions can be explained and justified
Reviewable: Clear enough for stakeholders to understand and critique
Evolvable: Structured to accommodate likely changes

It doesn't need to be:

Implementation-ready (that's logical/physical design)
Performance-optimized (that's premature at this stage)
Tooling-specific (notation should be transferable)

Initial Schema Checklist
Element	Required for Initial Schema	Can Wait for Later
All entities identified	✓ Yes	—
All relationships mapped	✓ Yes	—
Primary keys defined	✓ Yes	—
Cardinality specified	✓ Yes	—
Participation constraints	✓ Yes	—
Entity attributes listed	✓ Yes	—
Attribute data types	Optional	✓ Logical design
Foreign key details	Optional	✓ Logical design
Index design	No	✓ Physical design
Storage parameters	No	✓ Physical design

Aim for 80% Confidence

Assembling the Initial Schema

Schema assembly brings together all your conceptual design work. Follow this systematic process:

Step 1: Consolidate Entity Inventory

Create a master list of all entities:

Collect all entity candidates from your discovery work
Eliminate duplicates (same concept, different names)
Resolve naming inconsistencies (choose canonical names)
Classify entities: strong, weak, or associative (junction)
List each entity with its primary key attribute(s)

Result: A definitive list like:

Customer (CustomerID)
Order (OrderNumber)
Product (ProductSKU)
OrderLine (OrderNumber + LineNumber) — weak entity

Step 2: Complete Attribute Specification

For each entity, list all attributes:

Gather attributes from discovery notes
Identify key attributes (underline in diagram)
Mark composite attributes with their components
Mark multivalued attributes
Mark derived attributes
Note any domain constraints (e.g., Status must be one of: Active, Inactive, Pending)

At this stage, you may note general types (text, number, date) but formal data types wait for logical design.

Schema Assembly Checklist

•Entity Inventory — All entities listed with canonical names and keys
•Attribute Completeness — Every entity has its attributes; key attributes marked
•Relationship Inventory — All relationships named with both connected entities
•Cardinality Specification — 1:1, 1:N, M:N specified for every relationship
•Participation Constraints — Total/partial marked for each entity in each relationship
•Relationship Attributes — Attributes belonging to relationships identified
•Weak Entity Identification — Weak entities and their identifying relationships marked
•Specializations — Any EER hierarchies documented with constraints (disjoint/overlapping, total/partial)

Step 3: Map All Relationships

Consolidate your relationship discoveries:

List every relationship with its participating entities
Specify cardinality from both sides
Mark participation (total or partial) for each entity
Document relationship attributes if any
Note any constraints on relationships

Result should look like:

Customer (partial) —places(0,N)→ Order (total): Customer may have zero or more orders; each Order must have exactly one Customer
Order —contains— OrderLine: identifying relationship (Order is owner, OrderLine is weak)
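
One way to keep such a relationship inventory consistent is to record each side's minimum and maximum cardinality explicitly. The sketch below assumes a (min, max) reading in which the minimum encodes participation (0 = partial, 1 = total); the `End` and `Relationship` types are hypothetical, and the (1, N) on the Order side of `contains` is an assumption rather than something the inventory above states.

```python
from dataclasses import dataclass


# Hypothetical record types for a relationship inventory (not from any library).
@dataclass(frozen=True)
class End:
    entity: str
    min_card: int   # minimum links per instance: 0 = partial, 1 = total
    max_card: str   # maximum links per instance: "1" or "N"

    @property
    def participation(self) -> str:
        return "total" if self.min_card >= 1 else "partial"


@dataclass(frozen=True)
class Relationship:
    name: str
    left: End
    right: End
    identifying: bool = False   # True when the right-hand entity is weak

    def describe(self) -> str:
        parts = [
            f"each {end.entity} takes part {end.min_card}..{end.max_card} times "
            f"({end.participation})"
            for end in (self.left, self.right)
        ]
        suffix = " [identifying]" if self.identifying else ""
        return f"{self.name}: " + "; ".join(parts) + suffix


places = Relationship("places", End("Customer", 0, "N"), End("Order", 1, "1"))
# Assumption: an Order needs at least one OrderLine, hence (1, N) on the Order side.
contains = Relationship(
    "contains", End("Order", 1, "N"), End("OrderLine", 1, "1"), identifying=True
)

for rel in (places, contains):
    print(rel.describe())
```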

Step 4: Draw the Complete Diagram

With all components identified, create the full ER diagram:

Place entities on the diagram with meaningful spatial organization
Add relationships connecting entities
Annotate cardinality and participation
Add attribute ovals (or list inside entities for compact notation)
Mark special constructs (weak entities, specializations)
Ensure no components are orphaned

Step 5: Add Constraint Documentation

Capture constraints that don't fit diagram notation:

Business rules: "An order can only be shipped after payment is confirmed"
Derived calculations: "OrderTotal = SUM of (LineQuantity × LinePrice)"
Cross-entity constraints: "Manager must be an employee of the same department"
Temporal constraints: "EndDate must be >= StartDate"

These go in supplementary documentation alongside the diagram.
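
Writing a rule as a small executable check can help confirm that it is stated precisely. The sketch below covers the derived-calculation and temporal examples from the list above; the function names and the sample line values are hypothetical, and nothing here implies these rules must be enforced in application code.

```python
from datetime import date
from decimal import Decimal


def order_total(lines):
    """Derived attribute: OrderTotal = SUM of (LineQuantity x LinePrice)."""
    return sum((qty * price for qty, price in lines), Decimal("0"))


def valid_period(start_date, end_date):
    """Temporal constraint: EndDate must be >= StartDate."""
    return end_date >= start_date


# Illustrative order lines as (LineQuantity, LinePrice) pairs.
lines = [(2, Decimal("9.99")), (1, Decimal("24.50"))]
print(order_total(lines))
print(valid_period(date(2024, 1, 1), date(2024, 1, 31)))
```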

Diagram Organization Matters

Validating the Initial Schema

Before presenting the schema for review, perform systematic validation to catch issues you can identify yourself.

Validation 1: Requirements Traceability

Every requirement should map to the schema:

List all data requirements from your requirements document
For each requirement, identify which entity/attribute/relationship supports it
Flag requirements with no clear schema support → model gaps
Flag schema elements with no clear requirement → potential over-engineering

Example traceability:

Requirement	Schema Element	Status
Track customer names	Customer.Name	✓ Covered
Store order history	Order entity + Customer relationship	✓ Covered
Calculate monthly revenue	Derived from Order.Total	❓ Verify derivable
Track product reviews	(none)	❌ Missing: gap identified
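
A traceability matrix like the one above can also be checked mechanically. The following is a rough sketch: the `trace` helper is hypothetical, and `Customer.FaxNumber` is an invented schema element included only to show how an element with no requirement gets flagged.

```python
# Requirement -> schema elements that support it (empty list = no support yet).
requirements = {
    "Track customer names": ["Customer.Name"],
    "Store order history": ["Order", "places"],
    "Calculate monthly revenue": ["Order.Total"],
    "Track product reviews": [],
}

# Everything currently in the schema. Customer.FaxNumber is a made-up extra.
schema_elements = {
    "Customer.Name", "Order", "places", "Order.Total", "Customer.FaxNumber",
}


def trace(requirements, schema_elements):
    """Return (requirements with no support, schema elements with no requirement)."""
    gaps = [req for req, elems in requirements.items() if not elems]
    used = {elem for elems in requirements.values() for elem in elems}
    unjustified = sorted(schema_elements - used)
    return gaps, unjustified


gaps, unjustified = trace(requirements, schema_elements)
print("Model gaps:", gaps)
print("Possible over-engineering:", unjustified)
```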

Validation 2: Query Walkthrough

For key queries stakeholders need, verify the model can answer them:

List expected queries (from interviews, reports, dashboards)
For each query, trace a path through the model
Verify needed attributes exist along the path
Ensure cardinalities allow the expected results

Example:

Query: "List all orders for a customer with product details"
Path: Customer → Order → OrderLine → Product
Verification: Path exists with appropriate cardinalities ✓
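
Tracing a path can be treated as a graph search over the relationship inventory. The sketch below uses a breadth-first search; the `find_path` helper and the edge list are assumptions made for illustration, and a found path shows only that the entities are connected, not that the cardinalities along it are appropriate.

```python
from collections import deque

# Each pair is a relationship between two entities (direction ignored).
edges = [("Customer", "Order"), ("Order", "OrderLine"), ("OrderLine", "Product")]


def find_path(edges, start, goal):
    """Breadth-first search for the shortest entity path; None if unreachable."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path(edges, "Customer", "Product"))
print(find_path(edges, "Customer", "Review"))  # Review is not in the model
```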

Internal Consistency Checks

•Every entity has a primary key
•Every relationship connects existing entities
•Cardinality is specified on both sides
•Weak entities have identifying relationships
•No duplicate entity or relationship names
•Specializations are properly constrained
•Relationship attributes belong to relationships, not entities

Domain Accuracy Checks

•Entity names match domain vocabulary
•Cardinalities reflect business reality
•Participation constraints match requirements
•Business rules are captured or documented
•Temporal aspects handled appropriately
•No implementation details present
•Model can represent all valid real-world states

Validation 3: Instance Testing

Create sample instances to verify the model handles real data:

Invent a realistic scenario with multiple entity instances
Draw or list instance values
Verify relationships can be established as expected
Check cardinality constraints hold
Ensure no valid scenarios are blocked

Example:

Customer C1 places Orders O1, O2
Order O1 contains Products P1, P2
Order O2 contains Products P1, P3
Does the model allow this? (Should be yes for typical order system)
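
The scenario above can be run against a candidate cardinality to see whether it blocks valid data. In this sketch the `violates_max_one` helper is hypothetical; it reports instances that would break a maximum cardinality of 1 on one side of a relationship.

```python
from collections import Counter

# Relationship instances from the scenario above.
places = [("C1", "O1"), ("C1", "O2")]                                 # (Customer, Order)
contains = [("O1", "P1"), ("O1", "P2"), ("O2", "P1"), ("O2", "P3")]   # (Order, Product)


def violates_max_one(pairs, side):
    """Instances at position `side` (0 or 1) that appear in more than one pair."""
    counts = Counter(pair[side] for pair in pairs)
    return sorted(key for key, n in counts.items() if n > 1)


# Each Order has at most one Customer: the sample data respects this.
print(violates_max_one(places, side=1))

# If the model claimed each Product appears in at most one Order (1:N),
# the sample data would be rejected, so Order-Product has to be M:N.
print(violates_max_one(contains, side=1))
```

A non-empty result for a scenario the business considers valid suggests the cardinality in the model is too restrictive.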

Validation 4: Edge Case Analysis

Consider edge cases and boundary conditions:

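New customer with no orders yet → does the model allow it? (yes, if Customer participation is partial)
Order with zero line items → should the model allow it? (probably not, which implies total participation)
Employee with no manager (CEO) → does the model allow it? (yes, if participation in the manager relationship is partial)
Product in no orders → is it allowed? (yes, if Product participation is partial)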

Edge cases often reveal participation constraint issues.
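
Participation rules can be tested against sample instances in the same spirit. The sketch below is illustrative only: the `participation_violations` helper, the instance IDs, and the choice to treat Order as total in its relationship to line items are assumptions.

```python
def participation_violations(instances, pairs, side, total):
    """Return instances that break a total-participation rule.

    `pairs` holds relationship instances and `side` (0 or 1) is the position
    the entity occupies. Partial participation never produces violations.
    """
    if not total:
        return []
    linked = {pair[side] for pair in pairs}
    return sorted(set(instances) - linked)


customers = ["C1", "C2"]                  # C2: new customer, no orders yet
orders = ["O1", "O2"]
places = [("C1", "O1"), ("C1", "O2")]     # (Customer, Order)
contains = [("O1", "L1")]                 # (Order, OrderLine); O2 has no lines

# Customer is partial in "places": a customer without orders is fine.
print(participation_violations(customers, places, side=0, total=False))

# Order is assumed total in "contains": an order without lines is flagged.
print(participation_violations(orders, contains, side=0, total=True))
```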

Validation 5: Semantic Accuracy

Verify the model captures meaning, not just structure:

Can the model distinguish valid from invalid states?
Are there states the model allows that shouldn't be possible?
Are there valid states the model cannot represent?

Example: If an order can only have one shipping address, but the model shows M:N between Order and Address, invalid states are possible.

Find Problems Before Stakeholders Do

Stakeholder Review Process

The initial schema must be validated by stakeholders who understand the domain. They'll catch issues you can't—because you don't know the business as deeply as they do.

Preparing for Review

Create presentation materials: Large, clear diagrams. Avoid dense, technical drawings. Consider separate views for different audiences.
Write a glossary: Define every entity in business terms. "Customer: An individual or organization who has registered to make purchases."
Prepare walkthrough scenarios: Real examples stakeholders can follow. "Let's trace what happens when customer Jane Doe places an order..."
List assumptions and decisions: Document where you made judgment calls. "We assumed employees can only belong to one department. Is this correct?"
Identify open questions: What couldn't you determine independently? "Does a product need to be categorized before it can be sold?"

Running the Review

Effective review sessions:

Start with context: Remind stakeholders of the purpose. "This model represents the data for the order management system."
Walk through, don't lecture: Step through scenarios, asking for confirmation. "So a customer places multiple orders over time—is that right?"
Encourage questions: Silence isn't agreement. Probe: "Does anything look unexpected or missing?"
Focus on semantics, not notation: Non-technical stakeholders may struggle with ER notation. Translate: "This line means one customer can have many orders."

Review Questions to Ask Stakeholders

•"Does this list of entities capture all the things you track?"
•"For [Entity X], are these all the pieces of information you need?"
•"Can a [Entity A] be associated with multiple [Entity B]s, or just one?"
•"Is it possible for a [Entity A] to exist without any [Entity B]?"
•"Are there any business rules we haven't captured here?"
•"Looking at this diagram, can you see anything missing that you regularly work with?"
•"If we built the system exactly as shown, what wouldn't work?"

Handling Feedback

Review feedback falls into categories:

Corrections: The model is wrong.

"Actually, customers can have multiple shipping addresses."
Action: Fix the model.

Additions: Something is missing.

"We also need to track promotional codes on orders."
Action: Add to model if in scope; clarify scope if not.

Clarifications: The model is unclear.

"I don't understand what 'OrderLine' means."
Action: Improve naming or documentation.

Scope Questions: Uncertainty about whether something belongs.

"Should we include supplier information too?"
Action: Discuss with project stakeholders to confirm scope.

Future Requirements: Things that will be needed but aren't in current scope.

"Eventually, we'll need subscription handling."
Action: Note for future; don't add to current scope unless directed.

Iterating After Review

Review typically requires model updates:

Incorporate corrections immediately
Add confirmed in-scope additions
Update documentation for clarifications
Re-validate affected areas
Schedule follow-up review if changes are substantial

Expect more than one pass: conceptual designs commonly go through two or three review iterations before stabilizing.

Different Stakeholders, Different Views

Documenting the Initial Schema

A diagram alone doesn't constitute complete documentation. The initial schema should be documented thoroughly for future reference, team communication, and design continuity.

Essential Documentation Components

1. Entity Catalog

For each entity, document:

Name: Canonical entity name
Description: What this entity represents in business terms
Primary Key: Identifying attribute(s)
Attributes: Full list with descriptions
Constraints: Any entity-level rules
Notes: Decisions, assumptions, gotchas

Example:

Entity: Customer
Description: An individual or organization registered to make purchases
Primary Key: CustomerID
Attributes:
  - CustomerID: Unique identifier for the customer
  - Name: Full name of the individual, or the organization's name
  - Email: Primary contact email (unique)
  - RegistrationDate: Date the customer account was created
  - Status: Active, Inactive, or Suspended
Constraints:
  - Email must be unique across customers
  - Status must be one of the defined values
Notes: We treat individuals and organizations as the same entity type.
       Consider splitting if different attributes are needed.

2. Relationship Catalog

For each relationship, document:

Name: Relationship name
Entities: Participating entities with roles
Cardinality: From each entity's perspective
Participation: Mandatory or optional for each entity
Attributes: Any relationship attributes
Description: What this relationship means

Documentation Component Summary
Component	Purpose	Audience	Format
ER Diagram	Visual overview of model	All stakeholders	Drawing/diagram tool export
Entity Catalog	Detailed entity definitions	Designers, developers	Structured document/wiki
Relationship Catalog	Relationship specifications	Designers, developers	Structured document/wiki
Glossary	Term definitions	All stakeholders	Alphabetical list
Assumptions Log	Documented decisions	Future maintainers	Dated log entries
Requirements Traceability	Requirements to model mapping	Project managers, QA	Matrix/table
Open Issues	Unresolved questions	Project team	Issue tracking system

3. Glossary

Define all terms used in the model:

Entity names
Attribute names
Relationship names
Domain-specific terminology

A glossary ensures everyone interprets terms consistently. When a new team member asks "What's an SKU?", the glossary answers.

4. Assumptions and Decisions Log

Record the reasoning behind decisions:

Date: 2024-01-15
Decision: Individuals and organizations are modeled as a single Customer entity
Rationale: Current requirements don't distinguish them;
           attributes are identical; simplifies model.
Revisit if: Business needs different handling for B2B vs B2C.
           Consider generalization/specialization.

This log prevents revisiting settled issues and explains why the model looks the way it does.

5. Known Limitations

Document what the model doesn't do:

"This model tracks current state only; historical employee-department assignments are not tracked."
"Product variants (sizes, colors) are not modeled; each variant is a separate Product."
"Multi-currency pricing is out of scope."

Clear limitations prevent incorrect expectations.

Documentation Practices

Keep it current: Outdated documentation is worse than none—it misleads.
Store with the design: Documentation should be versioned alongside the model.
Use accessible formats: Wikis, markdown files, or collaborative docs that team members can access.
Link to source: Reference requirements documents, interview notes, and review meeting notes.

Documentation Enables Handoff

Quality Criteria for Initial Schemas

How do you know if your initial schema is good? Beyond basic correctness, quality schemas exhibit several characteristics that distinguish professional work from amateur attempts.

Criterion 1: Semantic Richness

The model captures meaning, not just structure:

Relationship names convey domain semantics ("purchases" vs. generic "has")
Cardinalities reflect actual business rules, not defaults
Constraints capture domain restrictions
The model distinguishes valid from invalid real-world states

A semantically rich model can be read as a description of the domain. Someone unfamiliar with the business should understand it from the model.

Criterion 2: Minimal Redundancy

Each fact is represented once:

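No attribute appears in more than one entity, except an owner's key shown as part of a weak entity's identifier (links between entities are expressed as relationships, not as foreign key attributes)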
No relationship path is duplicated
Derived data is marked as derived, not stored

Redundancy in conceptual models leads to redundancy in databases, causing update anomalies.

Criterion 3: Appropriate Abstraction

The model operates at the right level:

High enough to be technology-independent
Low enough to capture essential detail
No implementation artifacts (foreign key columns, index hints)
No application artifacts (screen fields, button actions)

Quality Evaluation Rubric

•Completeness (10 pts): All requirements traceable to model elements; no known gaps
•Accuracy (10 pts): Cardinalities, participations, and constraints match domain reality
•Semantic Richness (10 pts): Names are meaningful; model conveys domain understanding
•Minimal Redundancy (10 pts): No duplicate facts; derived data marked appropriately
•Abstraction Level (10 pts): Technology-independent; no implementation details
•Consistency (10 pts): Naming conventions followed; no contradictions
•Documentation (10 pts): Entities, relationships, assumptions documented
•Stakeholder Validation (10 pts): Reviewed and accepted by domain experts
•Evolvability (10 pts): Structure accommodates likely future changes
•Visual Clarity (10 pts): Diagram is readable, organized, and annotated

Criterion 4: Consistency

The model is internally coherent:

Naming conventions are uniform (all singular, all PascalCase, etc.)
No contradictory constraints (for example, one rule says A requires B while another says A and B cannot coexist)
Notation is consistent throughout
Similar concepts are modeled similarly

Criterion 5: Evolvability

The model can grow:

Adding new entities doesn't require restructuring existing ones
Common extension points are designed in (e.g., status codes as entities allow adding new statuses)
Over-specialized structures are avoided where generalization may be needed

Criterion 6: Visual Clarity

The diagram communicates effectively:

Related entities are grouped spatially
Relationship lines don't cross unnecessarily
Notation is legible at review-presentation size
Colors or groupings (if used) convey information

Self-Assessment

Before presenting for review, score your schema against the ten rubric criteria (10 points each, 100 points total):

Below 70 points: More work needed
70-84 points: Ready for review with known issues
85+ points: High confidence, ready for formal review
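
The self-assessment arithmetic can be sketched as a small function. This assumes the ten rubric criteria at 10 points each and treats a total of exactly 85 as falling in the top band; the `assess` function is a hypothetical helper, not part of any tool.

```python
CRITERIA = [
    "Completeness", "Accuracy", "Semantic Richness", "Minimal Redundancy",
    "Abstraction Level", "Consistency", "Documentation",
    "Stakeholder Validation", "Evolvability", "Visual Clarity",
]


def assess(scores):
    """Sum the per-criterion scores (0-10 each) and map the total to a verdict."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    for criterion in CRITERIA:
        if not 0 <= scores[criterion] <= 10:
            raise ValueError(f"{criterion}: score must be between 0 and 10")

    total = sum(scores[c] for c in CRITERIA)
    if total >= 85:
        verdict = "High confidence, ready for formal review"
    elif total >= 70:
        verdict = "Ready for review with known issues"
    else:
        verdict = "More work needed"
    return total, verdict


# Illustrative scores: 8 everywhere except documentation.
scores = dict.fromkeys(CRITERIA, 8)
scores["Documentation"] = 5
print(assess(scores))
```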

Quality Over Speed

Preparing for Logical Design

The initial schema will be transformed into a logical schema—typically a relational schema ready for implementation. Preparing for this transition ensures smooth continuity.

What Logical Design Will Do

Logical design transforms the conceptual model:

Entities → Tables: Each entity becomes one or more tables
Attributes → Columns: With specific data types
Relationships → Foreign Keys/Junction Tables: Implementing cardinality
Multivalued Attributes → Separate Tables: Normalization
Composite Attributes → Columns or Decomposition: Design choice
Specializations → Table Strategies: Single table, multiple tables, or combined

Information to Provide for Logical Design

To facilitate transition, ensure your conceptual design provides:

Clear Primary Keys: Every entity needs a clear identifier. Natural keys are preferred at conceptual level; surrogates may be added during logical design.
Cardinality and Participation: These determine foreign key placement and NULL constraints.
Weak Entity Indicators: Weak entities become tables with composite keys.
Relationship Attributes: These move to junction tables for M:N relationships; for 1:N relationships they can sit in the N-side table alongside the foreign key.
Specialization Constraints: Disjoint/overlapping and total/partial guide table strategy.
Domain Constraints: Valid values for attributes guide CHECK constraints and lookup tables.

Conceptual to Logical Mapping Preview
Conceptual Element	Logical Representation	Considerations
Strong Entity	Table with primary key	Straightforward mapping
Weak Entity	Table with composite PK including owner's key	Foreign key to owner, typically with ON DELETE CASCADE
1:1 Relationship	Foreign key in one table (usually total participation side)	Choose which side holds FK
1:N Relationship	Foreign key in 'N' side table	Standard pattern
M:N Relationship	Junction table with two foreign keys	Junction table may have additional columns for relationship attributes
Multivalued Attribute	Separate table with FK to owner	One row per value
Composite Attribute	Multiple columns or flattened	Design choice based on usage
Specialization	Single table, separate tables, or combined	Depends on overlap and coverage
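
As a preview of how a few of these rows might play out, the sketch below uses Python's built-in sqlite3 module to create tables for a strong entity, a 1:N relationship, and a weak entity. The column types, the NOT NULL choice for total participation, and the table layout are assumptions made for illustration; actual logical design may choose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
-- Strong entity: table with its own primary key.
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY
);

-- 1:N relationship: foreign key on the N side.
-- NOT NULL reflects total participation of Order (every order has a customer).
CREATE TABLE "Order" (
    OrderNumber INTEGER PRIMARY KEY,
    CustomerID  INTEGER NOT NULL REFERENCES Customer(CustomerID)
);

-- Weak entity: composite key that includes the owner's key.
CREATE TABLE OrderLine (
    OrderNumber INTEGER NOT NULL
        REFERENCES "Order"(OrderNumber) ON DELETE CASCADE,
    LineNumber  INTEGER NOT NULL,
    PRIMARY KEY (OrderNumber, LineNumber)
);
""")

conn.execute("INSERT INTO Customer VALUES (1)")
conn.execute('INSERT INTO "Order" VALUES (100, 1)')
conn.executemany("INSERT INTO OrderLine VALUES (?, ?)", [(100, 1), (100, 2)])

# Deleting the owner removes its weak-entity rows as well.
conn.execute('DELETE FROM "Order" WHERE OrderNumber = 100')
remaining = conn.execute("SELECT COUNT(*) FROM OrderLine").fetchone()[0]
print(remaining)
```

The cascade mirrors the existence dependency of a weak entity: an OrderLine has no meaning once its owning Order is gone.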

Anticipating Normalization

Logical design includes normalization—ensuring the design avoids update anomalies. While normalization is a logical design activity, you can anticipate issues:

Repeated groups of attributes: May indicate a missing entity (normalize to a separate table)
Transitive dependencies: Attribute depends on non-key attribute (may need decomposition)
Composite attributes stored as single value: May complicate querying

A well-constructed conceptual model rarely needs significant restructuring during normalization—these issues are usually avoided by good entity identification.

Handoff Checklist

Before transitioning to logical design, verify:

All entities have primary keys identified
All relationships have cardinality specified
All relationships have participation constraints
Weak entities are marked with identifying relationships
Multivalued attributes are identified
Composite attributes have components listed
Specializations have constraints (disjoint/overlap, total/partial)
Business rules requiring CHECK constraints are documented
Any open questions are documented for resolution during logical design

Transition Meeting

If different people do conceptual and logical design, hold a handoff meeting:

Walk through the complete model
Explain assumptions and decisions
Highlight any areas of uncertainty
Identify open issues requiring resolution
Agree on documentation access and version control

Conceptual Design Is Not Wasted

Summary: Conceptual Design Mastery

We've completed the conceptual design journey, from understanding high-level modeling principles through creating a validated initial schema. The key takeaways from the full module are:

Module Key Takeaways

•High-level modeling captures data semantics — Technology-independent representation focusing on meaning, not implementation.
•ER diagrams are the primary visual tool — Entities, relationships, and attributes in a notation understandable to technical and business stakeholders.
•Entity identification requires systematic techniques — Noun analysis, form mining, process analysis; distinguishing entities from attributes.
•Relationship specification is critical — Cardinality, participation, naming; special types like recursive and ternary relationships.
•Initial schemas bring it together — Complete, documented, validated models ready for stakeholder review.
•Validation is multi-layered — Requirements traceability, query walkthroughs, instance testing, edge case analysis.
•Documentation enables continuity — Entity and relationship catalogs, glossaries, assumptions logs.
•Quality criteria guide assessment — Completeness, accuracy, semantic richness, minimal redundancy, readiness for logical design.

The Conceptual Design Discipline

What's Next in the Database Design Journey

Mapping ER elements to relational tables
Applying normalization to eliminate redundancy
Specifying integrity constraints
Refining the design for specific database capabilities

Congratulations on Completing This Module
