Loading content...
The Entity-Relationship diagram represents the conceptual blueprint of your database—a visual articulation of the real-world domain you're modeling. But an ER diagram, no matter how elegant, cannot store a single byte of data. To bridge the gap between conceptual design and operational database, we must master entity mapping: the systematic transformation of ER entities into relational tables.
This transformation is not merely a mechanical translation. It requires deep understanding of both the ER model's semantics and the relational model's capabilities and constraints. A poorly executed mapping can introduce anomalies, violate normalization principles, and create maintenance nightmares. Conversely, a well-executed mapping preserves all semantic information while creating efficient, maintainable database structures.
Regular entities—also called strong entities—form the backbone of any ER diagram. They exist independently, have their own identifying attributes, and serve as the foundation upon which relationships and weak entities depend. Mastering regular entity mapping is therefore the essential first step in the ER-to-relational transformation process.
By the end of this page, you will understand the complete theory and practice of regular entity mapping. You'll learn the formal algorithm, common patterns, edge cases, and how mapping decisions affect database quality. You'll be equipped to transform any regular entity from an ER diagram into a properly structured relational table.
Before we can map entities to relations, we must precisely define what constitutes a regular entity (synonymously called a strong entity). This distinction is fundamental because different entity types require different mapping strategies.
Definition: A regular entity is an entity type that has a key attribute capable of uniquely identifying each entity instance. Regular entities do not depend on other entities for their identification—they are existentially and identificationally independent.
Contrast with Weak Entities: Weak entities lack a complete key attribute. They depend on an owner entity (or identifying entity) for their identification. For example, a Room entity might be identified by its room number only in combination with the Building it belongs to. Weak entities require a different mapping strategy, covered in a later module.
Characteristics of Regular Entities:
Terminology matters. An entity type (or entity set) is the schema—the definition. An entity (or entity instance) is a specific occurrence. When we map entities to relations, we're mapping entity types to relation schemas. The actual entity instances become tuples (rows) in the resulting relation (table).
Examples of Regular Entities in Common Domains:
| Domain | Entity Type | Key Attribute | Sample Attributes |
|---|---|---|---|
| University | Student | student_id | name, email, enrollment_date |
| E-commerce | Product | product_id | name, price, category |
| Healthcare | Patient | patient_ssn | name, dob, address |
| Banking | Account | account_number | balance, type, open_date |
| HR | Employee | employee_id | name, hire_date, department |
Each of these entities is identifiable by its own attribute(s) and doesn't require another entity to establish its identity. This independence is what makes them "strong" or "regular" entities.
The transformation of a regular entity into a relation follows a precise algorithm. While the basic case is straightforward, the algorithm must account for various attribute types and edge cases. Here is the comprehensive, step-by-step procedure:
The algorithm processes attribute types in a specific order for good reason. Simple attributes are handled first because they map directly. Composite attributes are flattened before we consider storage. Multivalued and derived attributes are deferred because they require special handling that may involve creating additional tables. Following this order prevents confusion and ensures complete coverage.
Formal Notation:
Given an entity type E with:
The resulting relation R is:
R(K, A₁, A₂, ..., Aₙ, c₁₁, c₁₂, ..., c₂₁, c₂₂, ...)
with PRIMARY KEY(K)
Note that composite attributes contribute their component attributes, not themselves.
Let's work through a comprehensive example that illustrates each step of the mapping algorithm. Consider a university database with the following Student entity:
ER Diagram Description:
Step-by-Step Mapping:
Step 1: Create Relation
Create a new relation named Student (or Students if following pluralization convention).
Step 2: Map Simple Attributes
Include the simple attributes as columns:
student_id → VARCHAR or CHAR (key attribute)first_name → VARCHARlast_name → VARCHARemail → VARCHARdate_of_birth → DATEenrollment_date → DATEStep 3: Flatten Composite Attributes
The address composite attribute is decomposed into its components:
street → VARCHARcity → VARCHARstate → VARCHAR or CHAR(2)zip_code → VARCHAR or CHARcountry → VARCHARStep 4: Defer Multivalued Attribute
The phone_numbers attribute cannot be included directly. A separate table Student_Phone will be created (covered in the multivalued attributes page).
Step 5: Document Derived Attribute
The age attribute is not stored. It will be computed as needed: age = CURRENT_DATE - date_of_birth
Step 6: Designate Primary Key
student_id becomes the PRIMARY KEY.
Step 7-8: Define Types and Constraints
Apply appropriate types and constraints.
1234567891011121314151617181920212223242526272829303132333435363738394041424344
-- Regular Entity Mapping: Student-- Derived from ER diagram following the standard mapping algorithm CREATE TABLE Student ( -- Primary Key (from ER key attribute) student_id VARCHAR(20) PRIMARY KEY, -- Simple Attributes first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, email VARCHAR(100) NOT NULL UNIQUE, date_of_birth DATE NOT NULL, enrollment_date DATE NOT NULL DEFAULT CURRENT_DATE, -- Flattened Composite Attribute: address street VARCHAR(100), city VARCHAR(50), state VARCHAR(50), zip_code VARCHAR(20), country VARCHAR(50) DEFAULT 'USA', -- Constraints CONSTRAINT chk_enrollment_after_birth CHECK (enrollment_date > date_of_birth), CONSTRAINT chk_valid_email CHECK (email LIKE '%@%.%')); -- Note: Multivalued attribute 'phone_numbers' requires separate table-- Note: Derived attribute 'age' is computed, not stored -- Index for common queriesCREATE INDEX idx_student_name ON Student(last_name, first_name);CREATE INDEX idx_student_enrollment ON Student(enrollment_date); -- Computed column for derived attribute (SQL Server syntax)-- ALTER TABLE Student ADD age AS DATEDIFF(YEAR, date_of_birth, GETDATE()); -- Or as a VIEW (portable approach)CREATE VIEW Student_With_Age ASSELECT *, EXTRACT(YEAR FROM AGE(CURRENT_DATE, date_of_birth)) AS ageFROM Student;The Student entity has been successfully mapped to a relational table. Note how the composite 'address' attribute was flattened into individual columns, and how derived and multivalued attributes were handled separately. This table is in at least 1NF (First Normal Form) since all attributes are atomic.
The selection and mapping of key attributes deserves special attention. In the ER model, an entity may have multiple candidate keys—minimal sets of attributes that uniquely identify each entity instance. When mapping to the relational model, we must make critical decisions about key representation.
Natural keys can change: emails get updated, SSNs can be reassigned, product codes get reformatted. When natural keys change, all foreign key references must be updated—potentially across dozens of tables. Surrogate keys with natural keys as UNIQUE constraints often provide the best of both worlds: referential stability plus business-meaningful uniqueness enforcement.
123456789101112131415161718192021222324252627282930313233
-- Example 1: Natural Key (as shown in ER diagram)CREATE TABLE Student_Natural ( student_id VARCHAR(20) PRIMARY KEY, -- University-assigned ID email VARCHAR(100) NOT NULL UNIQUE, -- Alternate key -- ... other attributes); -- Example 2: Surrogate Key (enhanced for physical design)CREATE TABLE Student_Surrogate ( id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, student_id VARCHAR(20) NOT NULL UNIQUE, -- Natural key preserved email VARCHAR(100) NOT NULL UNIQUE, -- Also unique -- ... other attributes); -- Example 3: Composite Key (when entity has multi-attribute key)CREATE TABLE Course_Section ( department_code VARCHAR(10), course_number VARCHAR(10), section_id VARCHAR(5), semester VARCHAR(10), year INTEGER, instructor_id BIGINT, room_id BIGINT, PRIMARY KEY (department_code, course_number, section_id, semester, year)); -- Note: Composite keys work correctly but:-- - Make foreign key references verbose-- - Increase index size-- - Complicate application code-- Consider surrogate key if this table is heavily referenced.While ER diagrams focus on conceptual attributes, relational tables require concrete data types. The mapping from conceptual domain to SQL type is a critical design decision that affects storage efficiency, query performance, and data integrity.
| Conceptual Domain | SQL Type Options | Considerations |
|---|---|---|
| Identifier (short) | CHAR(n), VARCHAR(n) | Fixed vs. variable length depends on whether all values have same length |
| Identifier (numeric) | INTEGER, BIGINT, SERIAL | Use SERIAL/IDENTITY for auto-generated keys |
| Text (short) | VARCHAR(50-255) | Always specify maximum length for performance |
| Text (long) | TEXT, VARCHAR(MAX) | For descriptions, notes, content fields |
| Whole number | SMALLINT, INTEGER, BIGINT | Choose based on expected value range |
| Decimal number | DECIMAL(p,s), NUMERIC(p,s) | Critical for financial values—avoid FLOAT/DOUBLE |
| Boolean | BOOLEAN, CHAR(1), TINYINT | Native BOOLEAN if supported; else CHAR('Y'/'N') or TINYINT(0/1) |
| Date only | DATE | No time component—birthdays, hire dates |
| Time only | TIME | No date component—store hours, schedules |
| Date and time | TIMESTAMP, DATETIME | Include timezone handling for global applications |
| Currency | DECIMAL(19,4), MONEY | Never use FLOAT for money—precision loss |
| VARCHAR(254) | RFC 5321 max length; add CHECK constraint for format | |
| Phone | VARCHAR(20) | Store as string to preserve formatting, international codes |
| URL | VARCHAR(2048) | Common max URL length; consider TEXT for very long URLs |
| Binary/Blob | BYTEA, BLOB, VARBINARY | For files, images—consider external storage with reference |
For DECIMAL types, precision (p) is total digits and scale (s) is digits after decimal. DECIMAL(10,2) stores values like 12345678.90. Always specify explicitly—database defaults vary and can cause subtle bugs. For financial applications, DECIMAL(19,4) is a safe choice that handles even cryptocurrency precision.
Vendor-Specific Considerations:
While SQL standards define core types, implementations vary:
When designing for portability, stick to widely-supported types (INTEGER, VARCHAR, DATE, DECIMAL) or create abstraction layers.
Consistent naming conventions enhance maintainability, reduce confusion, and prevent errors. While ER diagrams may use informal names, the relational implementation should follow strict conventions. There's no single "correct" convention, but there is value in consistency.
snake_case: Words separated by underscores, all lowercase.
Examples:
student, course_enrollment, order_itemstudent_id, first_name, date_of_birthpk_student, fk_enrollment_student, uq_student_emailAdvantages:
Avoid SQL reserved words as identifiers: ORDER, USER, TABLE, INDEX, GROUP, COMMENT, etc. If you must use them, quote the identifier ("ORDER" or [ORDER])—but this creates maintenance headaches. Better to choose different names: customer_order, app_user, table_name.
Even experienced designers make mapping errors. Understanding these common mistakes helps you avoid them and produce higher-quality database schemas.
phone_numbers VARCHAR(500) to store comma-separated values. This violates 1NF, prevents proper indexing, and complicates queries. Solution: Separate table with foreign key.address TEXT column. This makes it impossible to query or sort by city, state, or zip code individually. Solution: Flatten to component columns.age INTEGER as a stored column that must be updated whenever date_of_birth changes (or worse, annually for everyone). Solution: Compute at query time or use computed columns.name, value, type without context. These become ambiguous in joins. Solution: Prefix with entity context: student_name, config_value, account_type.12345678910
-- ❌ Anti-Pattern ExampleCREATE TABLE Student ( id INT, name VARCHAR(100), address TEXT, phones VARCHAR(500), -- CSV values! age INT, -- Derived, stored enrolled CHAR(1) -- 'Y' or 'N' or ??? -- No constraints!);1234567891011
-- ✓ Correct MappingCREATE TABLE Student ( student_id BIGINT PRIMARY KEY, first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, street VARCHAR(100), city VARCHAR(50), state VARCHAR(50), date_of_birth DATE NOT NULL, is_enrolled BOOLEAN DEFAULT TRUE);Regular entity mapping is the foundation of ER-to-relational transformation. Master this process, and you're equipped to handle the most common mapping scenarios. Let's consolidate what we've learned:
What's Next:
The basic mapping process covers simple attributes and the overall entity-to-table transformation. However, real-world entities frequently include composite attributes that deserve deeper examination. On the next page, we'll explore Composite Attribute Mapping in comprehensive detail—including naming strategies, optional component handling, and complex nesting scenarios.
You now understand how to transform regular entities from ER diagrams into well-structured relational tables. This foundational skill applies to every database design project. Next, we'll examine composite attributes in greater depth to handle more complex entity structures.