Loading content...
The conceptual model—elegant, abstract, technology-independent—now requires transformation into something a database management system can actually implement. This is the domain of logical design: the systematic conversion of conceptual schemas into formal structures that define exactly what tables exist, what columns they contain, what constraints govern them, and how they relate through keys.
Logical design is where the theoretical becomes practical, where ER diagrams become CREATE TABLE statements, where cardinality constraints become foreign key relationships. It is a phase that requires both systematic precision (following well-defined mapping algorithms) and engineering judgment (making trade-offs for performance, maintainability, and clarity).
The logical schema you create here will persist for years or decades. Every application built on this database will be shaped by the decisions you make. A well-designed logical schema makes applications easier to build, queries faster to execute, and systems more maintainable. A poorly designed schema creates ongoing technical debt that compounds with every new feature.
By the end of this page, you will master the logical design process: mapping entities to tables, attributes to columns, relationships to keys, and constraints to database-level enforcement. You will understand normalization theory and when to apply it, learn systematic algorithms for ER-to-relational transformation, and develop the judgment to make sound trade-off decisions.
Logical design produces a logical schema—a formal specification of the database structure that is specific to a particular data model (usually relational) but still independent of any particular DBMS product or physical storage considerations.
Formal Definition:
A logical schema specifies the structure of data at the logical level of abstraction: tables (relations), columns (attributes), data types, primary keys, foreign keys, and integrity constraints. It describes what is stored without specifying how it is physically organized.
The Relational Model as Target:
The vast majority of logical design targets the relational model, characterized by:
Logical Design Objectives:
The ER-to-Relational Mapping Process:
Logical design follows a systematic transformation:
| Conceptual Element | Logical Element | Mapping Rule |
|---|---|---|
| Entity Type | Table | One table per strong entity |
| Attribute | Column | One column per simple attribute |
| Composite Attribute | Multiple Columns | Flatten or create structured type |
| Multi-valued Attribute | Separate Table | New table with foreign key to parent |
| Relationship | Foreign Key or Table | Depends on cardinality |
| Weak Entity | Table + Partial Key | Includes owner's primary key |
| Generalization | Various approaches | Single table, multiple tables, or hybrid |
These rules provide a systematic algorithm for transformation, though they often require judgment in application.
Modern database design tools (ERwin, Enterprise Architect, DbSchema, MySQL Workbench) automate ER-to-relational mapping. However, understanding the underlying algorithms is essential—automated tools make decisions that may not suit your specific requirements, and you must be able to evaluate and adjust their output.
The most straightforward mapping: each strong entity type becomes a table. The entity's simple attributes become columns, and the entity's key attribute(s) become the primary key.
Basic Entity Mapping Algorithm:
For each strong entity E:
1. Create table T with name derived from E
2. For each simple attribute A of E:
- Add column A to T with appropriate data type
3. For each composite attribute C of E:
- Add columns for each component of C
- OR create structured type (if DBMS supports)
4. For key attribute(s) K of E:
- Designate K column(s) as PRIMARY KEY
Example Transformation:
123456789101112131415161718
-- Conceptual: CUSTOMER entity-- Attributes: customer_id (key), name (composite: first, last), -- email, registration_date -- Logical Mapping:CREATE TABLE customers ( customer_id INTEGER PRIMARY KEY, first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, email VARCHAR(255) NOT NULL UNIQUE, registration_date DATE NOT NULL DEFAULT CURRENT_DATE); -- Note the decisions made:-- 1. Table name: lowercase plural (convention)-- 2. Composite name → two columns (first_name, last_name)-- 3. Email marked UNIQUE (alternate key from conceptual model)-- 4. registration_date has default (implementation enhancement)Handling Composite Attributes:
Composite attributes present a design choice:
Option A: Flatten to multiple columns
-- Address as composite attribute
CREATE TABLE customers (
...
address_street VARCHAR(100),
address_city VARCHAR(50),
address_state CHAR(2),
address_zip VARCHAR(10),
address_country VARCHAR(50)
);
Advantages: Simple queries, wide tool support Disadvantages: Repetitive if same structure appears multiple times
Option B: Create structured type (if DBMS supports)
-- PostgreSQL composite type
CREATE TYPE address_type AS (
street VARCHAR(100),
city VARCHAR(50),
state CHAR(2),
zip VARCHAR(10),
country VARCHAR(50)
);
CREATE TABLE customers (
...
address address_type
);
Advantages: Reusable, cleaner schema Disadvantages: Limited DBMS support, complex querying
Handling Multi-valued Attributes:
Multi-valued attributes cannot be stored directly in a relational column (violates 1NF). They require a separate table:
123456789101112131415161718192021
-- Conceptual: CUSTOMER has multi-valued attribute phone_numbers -- WRONG: Storing multiple phones in one column-- CREATE TABLE customers (-- phone_numbers VARCHAR(500) -- "555-1234, 555-5678" WRONG!-- ); -- CORRECT: Separate table for multi-valued attributeCREATE TABLE customer_phones ( customer_id INTEGER NOT NULL, phone_number VARCHAR(20) NOT NULL, phone_type VARCHAR(10) DEFAULT 'mobile', is_primary BOOLEAN DEFAULT FALSE, PRIMARY KEY (customer_id, phone_number), FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE CASCADE); -- Each row represents ONE phone for ONE customer-- Multiple rows for customers with multiple phonesDerived attributes (age from birth_date, order_total from line items) are typically NOT stored as columns. They're computed at query time using expressions or views. Exceptions exist for performance-critical derived values that are expensive to compute—these may be materialized with triggers to maintain consistency.
Relationship mapping is where logical design becomes nuanced. The mapping strategy depends on the relationship's cardinality, participation constraints, and whether the relationship has its own attributes.
Mapping Strategy by Cardinality:
One-to-Many (1:N) Mapping:
The most common relationship type. The foreign key is placed in the table on the "many" (N) side.
Algorithm:
Example: Department (1) — employs → (N) Employee
CREATE TABLE departments (
department_id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
budget DECIMAL(12,2)
);
CREATE TABLE employees (
employee_id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
hire_date DATE NOT NULL,
-- Foreign key implements 1:N relationship
department_id INTEGER NOT NULL,
FOREIGN KEY (department_id)
REFERENCES departments(department_id)
);
Participation Constraints:
NOT NULL on FK = Total participation (employee MUST have department)NULL allowed on FK = Partial participation (employee MAY have no department)Relationship Attributes: If the 'employs' relationship had attributes (e.g., position_title, start_date in department), they go in the employees table alongside the FK.
Weak Entity Mapping:
Weak entities depend on an owner entity for identification. Their table includes the owner's primary key as part of their own composite key:
1234567891011121314151617181920212223242526
-- Conceptual: ORDER (strong) --contains--> ORDER_LINE (weak)-- ORDER_LINE is identified by order_id + line_number CREATE TABLE orders ( order_id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, order_date DATE NOT NULL, status VARCHAR(20) NOT NULL, FOREIGN KEY (customer_id) REFERENCES customers(customer_id)); CREATE TABLE order_lines ( order_id INTEGER NOT NULL, line_number INTEGER NOT NULL, -- Partial key (discriminator) product_id INTEGER NOT NULL, quantity INTEGER NOT NULL CHECK (quantity > 0), unit_price DECIMAL(10,2) NOT NULL, -- Composite primary key includes owner's PK PRIMARY KEY (order_id, line_number), -- Identifying relationship FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE, FOREIGN KEY (product_id) REFERENCES products(product_id));Specify ON DELETE and ON UPDATE actions for foreign keys: CASCADE (propagate changes), SET NULL (nullify reference), SET DEFAULT, RESTRICT (prevent operation), NO ACTION (defer check). These enforce referential integrity automatically. Weak entity FKs typically use CASCADE; other FKs often use RESTRICT or SET NULL depending on business rules.
Normalization is the systematic process of organizing tables to minimize redundancy and dependency anomalies. It provides a theoretical foundation for evaluating schema quality and guides decomposition decisions.
Why Normalize?
Unnormalized schemas suffer from anomalies:
Normal Forms Overview:
| Normal Form | Requirement | Eliminates | Practical Importance |
|---|---|---|---|
| 1NF | Atomic values, no repeating groups | Multi-valued attributes in cells | Essential — fundamental to relational model |
| 2NF | 1NF + No partial dependencies on composite key | Redundancy from partial key dependencies | Important for composite key tables |
| 3NF | 2NF + No transitive dependencies | Redundancy from non-key dependencies | Standard target for OLTP systems |
| BCNF | Every determinant is a candidate key | Remaining partial and transitive issues | Stronger version of 3NF |
| 4NF | No multi-valued dependencies | Independent multi-valued facts | Specialized situations |
| 5NF | No join dependencies | Lossy decomposition risks | Rarely needed in practice |
First Normal Form (1NF):
A table is in 1NF if:
Violation Example:
STUDENT(id, name, courses) -- courses = "CS101, CS102, CS201"
1NF Solution:
STUDENT(id, name)
STUDENT_COURSES(student_id, course) -- One row per course
Second Normal Form (2NF):
A table is in 2NF if:
Only relevant for tables with composite primary keys.
Violation Example:
ORDER_ITEMS(order_id, product_id, quantity, product_name, product_price)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
product_name and product_price depend only on product_id, not on (order_id, product_id)
2NF Solution:
ORDER_ITEMS(order_id, product_id, quantity)
PRODUCTS(product_id, product_name, product_price)
Third Normal Form (3NF):
A table is in 3NF if:
Memorize: "Every non-key attribute must depend on the key, the whole key, and nothing but the key."
Violation Example:
EMPLOYEES(emp_id, emp_name, dept_id, dept_name, dept_location)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dept_name and dept_location depend on dept_id, not emp_id
3NF Solution:
EMPLOYEES(emp_id, emp_name, dept_id)
DEPARTMENTS(dept_id, dept_name, dept_location)
Boyce-Codd Normal Form (BCNF):
A table is in BCNF if:
BCNF is stronger than 3NF. Tables in BCNF are automatically in 3NF, but not vice versa. The difference matters in rare cases with overlapping candidate keys.
For most OLTP (transaction processing) systems, normalizing to 3NF or BCNF is the standard. Higher normal forms (4NF, 5NF) address specialized scenarios rarely encountered in practice. OLAP (analytical) systems often deliberately denormalize for query performance. The key is understanding the trade-offs and making conscious decisions.
Constraints transform business rules into database-enforced guarantees. Properly specified constraints prevent invalid data states without relying on application code—providing defense in depth when applications have bugs or when data is modified through direct database access.
Constraint Categories:
| Constraint Type | Purpose | Scope | Example |
|---|---|---|---|
| NOT NULL | Attribute must have a value | Single column | email VARCHAR(255) NOT NULL |
| UNIQUE | No duplicate values allowed | Column(s) | UNIQUE (email) |
| PRIMARY KEY | Unique row identifier, implies NOT NULL | Column(s) | PRIMARY KEY (order_id) |
| FOREIGN KEY | Referential integrity between tables | Column(s) | FOREIGN KEY (dept_id) REFERENCES departments |
| CHECK | Custom boolean condition must be true | Row or column | CHECK (quantity > 0) |
| DEFAULT | Value when none specified | Single column | status VARCHAR(20) DEFAULT 'pending' |
CHECK Constraints for Business Rules:
CHECK constraints encode business rules directly in the schema:
123456789101112131415161718192021222324252627282930313233
-- Range constraintsCREATE TABLE products ( product_id INTEGER PRIMARY KEY, name VARCHAR(100) NOT NULL, price DECIMAL(10,2) NOT NULL CHECK (price >= 0), weight_kg DECIMAL(8,3) CHECK (weight_kg > 0), discount_pct DECIMAL(5,2) CHECK (discount_pct BETWEEN 0 AND 100)); -- Enumerated valuesCREATE TABLE orders ( order_id INTEGER PRIMARY KEY, status VARCHAR(20) NOT NULL CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled'))); -- Cross-column constraintsCREATE TABLE date_ranges ( range_id INTEGER PRIMARY KEY, start_date DATE NOT NULL, end_date DATE, CHECK (end_date IS NULL OR end_date >= start_date)); -- Pattern matching (where supported)CREATE TABLE contacts ( contact_id INTEGER PRIMARY KEY, email VARCHAR(255) CHECK (email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'), phone VARCHAR(20) CHECK (phone ~ '^\+?[0-9\-\s]+$'));Advanced Constraint Patterns:
Conditional NOT NULL (nullable based on other column):
-- If status is 'shipped', ship_date must exist
CHECK (status != 'shipped' OR ship_date IS NOT NULL)
Mutually Exclusive Relationships:
-- A payment references either card OR bank account, not both
CHECK ((card_id IS NULL) != (bank_account_id IS NULL))
Exactly One Required:
-- Must have email OR phone OR both
CHECK (email IS NOT NULL OR phone IS NOT NULL)
Cross-Table Constraints (Triggers): Some constraints cannot be expressed with CHECK (they span multiple tables). These require triggers or application logic:
Implement constraints at multiple levels: database constraints as the ultimate safety net, application validation for user-friendly error messages, and frontend validation for immediate feedback. Database constraints catch issues that slip through application bugs or direct database access.
Generalization (supertype-subtype) hierarchies from the conceptual model can be mapped to relational tables in several ways. Each approach has trade-offs affecting query complexity, storage efficiency, and constraint enforcement.
Three Primary Mapping Strategies:
Strategy A: Single Table (Table-per-Hierarchy)
All entity types in the hierarchy are merged into one table with a type discriminator column.
CREATE TABLE persons (
person_id INTEGER PRIMARY KEY,
person_type VARCHAR(20) NOT NULL,
-- Common attributes
name VARCHAR(100) NOT NULL,
email VARCHAR(255),
-- Employee attributes (nullable)
hire_date DATE,
salary DECIMAL(10,2),
-- Customer attributes (nullable)
credit_limit DECIMAL(10,2),
loyalty_tier VARCHAR(20),
CHECK (person_type IN ('employee', 'customer'))
);
Pros:
Cons:
Strategy B: Multiple Tables (Table-per-Type)
Each entity type gets its own table. Subtypes reference supertype via foreign key.
CREATE TABLE persons (
person_id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255)
);
CREATE TABLE employees (
person_id INTEGER PRIMARY KEY,
hire_date DATE NOT NULL,
salary DECIMAL(10,2) NOT NULL,
FOREIGN KEY (person_id)
REFERENCES persons(person_id)
);
CREATE TABLE customers (
person_id INTEGER PRIMARY KEY,
credit_limit DECIMAL(10,2),
loyalty_tier VARCHAR(20),
FOREIGN KEY (person_id)
REFERENCES persons(person_id)
);
Pros:
Cons:
Strategy C: Subtype Tables Only (Table-per-Concrete-Class)
No supertype table exists. Each subtype contains all attributes, including inherited ones.
CREATE TABLE employees (
employee_id INTEGER PRIMARY KEY,
-- Repeated from supertype
name VARCHAR(100) NOT NULL,
email VARCHAR(255),
-- Employee-specific
hire_date DATE NOT NULL,
salary DECIMAL(10,2) NOT NULL
);
CREATE TABLE customers (
customer_id INTEGER PRIMARY KEY,
-- Repeated from supertype
name VARCHAR(100) NOT NULL,
email VARCHAR(255),
-- Customer-specific
credit_limit DECIMAL(10,2),
loyalty_tier VARCHAR(20)
);
Pros:
Cons:
Strategy Selection Guidelines:
| Consideration | Single Table | Multiple Tables | Subtype Only |
|---|---|---|---|
| Few subtypes, many shared attributes | ✓ Best | Good | Acceptable |
| Many subtypes, distinct attributes | Avoid | ✓ Best | Good |
| Frequent polymorphic queries | ✓ Best | Acceptable | Avoid |
| Strict subtype constraints needed | Harder | ✓ Best | Good |
| Overlapping subtypes allowed | Complex | ✓ Best | Avoid |
| Performance-critical single-type queries | Good | Slower | ✓ Best |
Disjoint generalization constraints (entity can be in only one subtype) are difficult to enforce with table constraints alone. For Strategy B, preventing a person from being both employee AND customer requires either application logic, triggers, or a type discriminator in the supertype table with FK constraints.
A logical schema is not complete without documentation. The DDL alone doesn't capture the why behind design decisions, the business meaning of tables and columns, or the conventions followed. Comprehensive documentation ensures the schema remains maintainable as teams change.
Essential Documentation Elements:
Naming Conventions:
Consistent naming dramatically improves schema usability:
| Element | Convention (Example) | Notes |
|---|---|---|
| Tables | lowercase_plural | customers, order_lines |
| Columns | lowercase_snake_case | first_name, created_at |
| Primary Keys | table_singular_id | customer_id, order_id |
| Foreign Keys | referenced_table_id | customer_id in orders table |
| Junction Tables | entity1_entity2 | student_courses, user_roles |
| Indexes | idx_table_columns | idx_orders_customer_date |
| Constraints | type_table_column | chk_orders_status, fk_orders_customer |
In-Database Documentation:
1234567891011121314151617181920212223242526272829
-- PostgreSQL COMMENT syntax (most DBMS have equivalent) COMMENT ON TABLE customers IS 'Core customer entity representing individuals or businesses who purchase products. Source of truth for customer identity and contact information.'; COMMENT ON COLUMN customers.customer_id IS 'System-generated unique identifier. Never exposed externally; use customer_code for external references.'; COMMENT ON COLUMN customers.customer_code IS 'Human-readable customer identifier (CUS-XXXXXX format). Displayed on invoices, used in customer communications.'; COMMENT ON COLUMN customers.credit_limit IS 'Maximum outstanding balance allowed before requiring payment. Set by finance department based on credit assessment. NULL indicates no credit extended (cash-only customer).'; COMMENT ON COLUMN customers.created_at IS 'Timestamp when customer record was created. Immutable after insertion. Used for customer age calculations and anniversary notifications.'; -- Document constraintsCOMMENT ON CONSTRAINT chk_customers_email ON customers IS 'Ensures email follows valid format per RFC 5322. Applied after UI validation as defense-in-depth.';Treat schema documentation like code: version control it, review changes, and keep it synchronized with the actual schema. Tools like SchemaSpy, dbdocs.io, or built-in DBMS documentation features can generate reference documentation from comments. Out-of-date documentation is worse than no documentation—it misleads.
Logical design transforms abstract conceptual models into concrete, implementable database schemas. This phase requires both systematic application of mapping algorithms and engineering judgment for trade-off decisions.
What's Next:
With the logical schema defined—tables, columns, keys, and constraints—we enter the physical design phase. Here, we make decisions about how the logical schema will be physically implemented: storage structures, indexing strategies, partitioning schemes, and performance optimizations that transform a correct schema into a fast schema.
The next page explores physical design: index selection, partitioning strategies, storage parameters, and the performance considerations that bridge the gap between logical correctness and production readiness.
You now understand logical design as the systematic transformation of conceptual models into relational schemas. You can map entities to tables, apply normalization principles, specify constraints, choose generalization strategies, and document your decisions. Next, we'll optimize this schema for real-world performance through physical design.