Database Management SystemsGeneralization

Generalization in EER Modeling

LevelIntermediate

Duration75 mins

TopicGeneralization

4 / 5

Supertype Creation

Giving Form to Abstraction

After analyzing entity types, identifying common attributes, and validating semantic relationships, the generalization process reaches its creative zenith: creating the supertype. This is the moment when abstract insight becomes concrete schema—when the observation that 'these things share a common identity' transforms into a new entity type that formally captures that identity.

Creating a supertype is more than just drawing a new rectangle in an ER diagram. It requires careful decisions about naming, identity, primary keys, inherited versus local attributes, relationship migration, and integration with the existing schema. A well-designed supertype feels natural and inevitable; a poorly designed one creates confusion and maintenance burden.

This page provides comprehensive guidance on supertype creation—from initial conception through complete schema integration. By the end, you will be able to create supertypes that are semantically meaningful, technically sound, and practically useful.

What You Will Learn

By the end of this page, you will master the complete process of supertype creation: naming conventions, primary key design, attribute assignment, relationship restructuring, constraint definition, documentation, and integration with the broader database schema.

Naming the Supertype

The supertype's name is its first and most visible design decision. A well-chosen name communicates the generalization's purpose instantly; a poorly chosen name confuses users and developers for years to come.

Naming Principles:

Principle 1: Use a Natural Domain Concept

The supertype name should be a term that domain experts naturally use when speaking generally about all subtypes:

✅ Good: EMPLOYEE (for hourly, salaried, contract workers) ✅ Good: ACCOUNT (for checking, savings, money market) ✅ Good: VEHICLE (for car, truck, motorcycle)

❌ Bad: WORKER_BASE_CLASS ❌ Bad: ACCOUNT_SUPERTYPE ❌ Bad: THING_WITH_WHEELS

Principle 2: Prefer Singular Nouns

Entity names in database design conventionally use singular form:

✅ Good: PERSON, ACCOUNT, PRODUCT ❌ Bad: PERSONS, ACCOUNTS, PRODUCTS

Principle 3: Avoid Technical Jargon

The name should make sense to business users, not just database designers:

✅ Good: PARTY (standard term for customer/supplier/partner) ❌ Bad: BUSINESS_ENTITY_ABSTRACTION

Principle 4: Consider Scope and Future Growth

Choose a name broad enough to accommodate future subtypes without renaming:

✅ Good: PAYMENT_METHOD (cards, bank transfers, digital wallets, future methods) ❌ Bad: CARD_OR_BANK (too specific, excludes future payment types)

Supertype Naming Examples
Subtypes	Weak Names	Strong Name	Rationale
CHECKING, SAVINGS, CD	ACCOUNT_TYPES, BANK_STUFF	ACCOUNT	Natural banking term
PHYSICIAN, NURSE, TECHNICIAN	MEDICAL_WORKERS, CLINICAL_STAFF	HEALTHCARE_PROVIDER	Industry-standard term
CUSTOMER, SUPPLIER, PARTNER	BUSINESS_ENTITIES, EXTERNALS	PARTY	Standard business modeling term
CAR, TRUCK, MOTORCYCLE, BUS	WHEELED_THINGS, AUTOS	VEHICLE	Universal transportation term
LAPTOP, DESKTOP, SERVER	MACHINES, COMPUTING	COMPUTER	Clear, natural term

The Dictionary Test

A good supertype name appears in domain dictionaries, industry glossaries, or common business vocabulary. If you can't find your proposed name in relevant reference materials, and domain experts don't naturally use it, reconsider the name—or reconsider whether the generalization is valid.

Designing the Supertype Primary Key

The supertype's primary key is the foundation of entity identity within the hierarchy. This key must uniquely identify every instance across all subtypes and serve as the basis for referential integrity throughout the schema.

Key Design Options:

Option 1: Unified Surrogate Key

Create a new synthetic key for the supertype that replaces all subtype keys:

CREATE TABLE party (
    party_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    party_type VARCHAR(20) NOT NULL,  -- discriminator
    name VARCHAR(200),
    email VARCHAR(255),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Advantages:

Simple, uniform key across hierarchy
No collisions between former subtype keys
Works with any physical implementation strategy

Disadvantages:

Requires migrating existing data with new keys
External systems must update references
Loses meaningful business identifiers

Option 2: Inherited Business Key

Use an existing business key that's consistent across subtypes:

-- All subtypes already used consistent ID format
CREATE TABLE person (
    person_id VARCHAR(10) PRIMARY KEY,  -- format: 'P-1234567'
    person_type VARCHAR(20) NOT NULL,
    ...
);

Advantages:

Preserves meaningful identifiers
Minimal migration required
Familiar to existing users and systems

Disadvantages:

Only works if subtypes had compatible key schemes
May require reformatting inconsistent keys
Business key changes ripple through hierarchy

Option 3: Composite Key with Type Discriminator

Combine original keys with a type indicator to ensure uniqueness:

CREATE TABLE account (
    account_type CHAR(1) NOT NULL,     -- 'C'=checking, 'S'=savings
    account_number VARCHAR(20) NOT NULL,
    ...
    PRIMARY KEY (account_type, account_number)
);

Advantages:

Preserves original identifiers
Handles key collisions between subtypes
Clear subtype identification in key itself

Disadvantages:

Composite keys are more complex to reference
Joins require multiple columns
Type column redundantly included in key and as discriminator

Option 4: Prefixed/Transformed Key

Transform original keys to include type prefix:

-- Original: CUSTOMER.id=12345, SUPPLIER.id=12345 (collision!)
-- Transformed:
INSERT INTO party (party_id, ...) VALUES ('CUS-12345', ...);  -- customer
INSERT INTO party (party_id, ...) VALUES ('SUP-12345', ...);  -- supplier

Advantages:

Single-column key
Visual type identification
Preserves original ID info

Disadvantages:

Key format is denormalized
Parsing required to extract components
Existing references need transformation

Key Collision Risk

If subtypes had independent ID sequences, the same ID might exist in multiple subtypes. CUSTOMER #1001 and SUPPLIER #1001 are different entities. The supertype key design MUST handle this collision—typically via type prefix, composite key, or new surrogate key.

Key Design Decision Factors

•Existing Key Compatibility — Can subtype keys coexist without collision? If yes, consider inherited key. If no, transformation is required.
•External References — How many external systems reference current IDs? Many references favor preserving recognizable ID formats.
•Query Patterns — Do queries often need subtype information? Composite or prefixed keys provide that inline.
•Future Subtypes — Will new subtypes be added? Unified surrogate keys accommodate them most easily.
•Physical Implementation — Single-table inheritance works with any key; class-table requires consistent FK structure.

Type Discriminator Design

A type discriminator (also called type indicator, subtype flag, or discriminator column) is an attribute in the supertype that identifies which subtype each instance belongs to. This is essential for determining how to interpret subtype-specific data and which business rules apply.

Discriminator Design Options:

Option 1: Single Character Code

account_type CHAR(1) CHECK (account_type IN ('C', 'S', 'M', 'D'))
-- C=Checking, S=Savings, M=Money Market, D=CD

Pros: Minimal storage, fast comparison Cons: Cryptic, requires documentation, limited to 26/36 values

Option 2: Short Descriptive Code

account_type VARCHAR(10) CHECK (account_type IN ('CHECKING', 'SAVINGS', 'MMARKET', 'CD'))

Pros: Readable, self-documenting Cons: More storage, potential for typos

Option 3: Numeric Code

account_type_id SMALLINT REFERENCES account_type_lookup(type_id)

Pros: Efficient storage, enforced via FK Cons: Requires lookup for meaning, join for labels

Option 4: Enum Type (if supported)

CREATE TYPE account_type AS ENUM ('CHECKING', 'SAVINGS', 'MONEY_MARKET', 'CD');
account_type account_type NOT NULL

Pros: Type-safe, readable, validated Cons: Database-specific, harder to modify

Discriminator Strategies Comparison
Strategy	Storage	Readability	Extensibility	Query Pattern
CHAR(1) code	1 byte	Poor	Limited (26/36)	WHERE type='C'
VARCHAR code	Variable	Good	Flexible	WHERE type='CHECKING'
Numeric FK	2-4 bytes	Via join	Easy (insert row)	WHERE type_id=1
ENUM	1-2 bytes	Good	Schema change	WHERE type='CHECKING'
Boolean flags	N booleans	Good	Schema change	WHERE is_checking=TRUE

Multiple Discriminators (for Overlapping Hierarchies):

When subtypes can overlap (an entity can belong to multiple subtypes simultaneously), multiple discriminators or flag columns are needed:

CREATE TABLE university_person (
    person_id UUID PRIMARY KEY,
    -- Multiple roles possible
    is_student BOOLEAN DEFAULT FALSE,
    is_faculty BOOLEAN DEFAULT FALSE,
    is_staff BOOLEAN DEFAULT FALSE,
    is_researcher BOOLEAN DEFAULT FALSE,
    -- At least one must be true
    CHECK (is_student OR is_faculty OR is_staff OR is_researcher)
);

This accommodates: a faculty member who is also a researcher, a student who is also staff (work-study), etc.

Discriminator Constraints:

The discriminator should have appropriate constraints:

-- For disjoint subtypes (exclusive)
account_type NOT NULL CHECK (account_type IN ('CHECKING', 'SAVINGS', 'CD'))

-- For overlapping subtypes
CHECK (is_student OR is_faculty OR is_staff)  -- total participation
-- no check needed for which combinations allowed (overlapping)

Discriminator Location

The discriminator always belongs in the supertype table, not subtype tables. It identifies which subtype logic applies to each supertype instance, enabling polymorphic queries and appropriate join decisions.

Final Attribute Assignment

With the supertype named, keyed, and discriminated, we now formally assign attributes to the supertype and subtypes. This step crystallizes the decisions made during common attribute analysis.

Attribute Assignment Rules

•Supertype Receives Common Attributes — All attributes shared by all (or nearly all) subtypes move to the supertype. These are inherited by every subtype.
•Subtypes Retain Specific Attributes — Attributes unique to a subtype remain in that subtype's definition. These are the 'local' attributes.
•Remove Duplicates from Subtypes — After moving common attributes to supertype, remove them from subtype definitions. They're now inherited.
•Handle Edge Cases with Nullability — Attributes in most-but-not-all subtypes can move to supertype as nullable; those subtypes use NULL.
•Preserve Attribute Metadata — Constraints, defaults, and documentation should accompany attributes to their new locations.

attribute-assignment-example.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
-- Before Generalization: Separate entity definitions
 
-- FULL_TIME_EMPLOYEE
-- emp_id, first_name, last_name, email, hire_date, department,
-- annual_salary, bonus_percent, vacation_days, stock_options
 
-- PART_TIME_EMPLOYEE  
-- emp_id, first_name, last_name, email, hire_date, department,
-- hourly_rate, max_hours_per_week
 
-- CONTRACTOR
-- emp_id, first_name, last_name, email, start_date, department,
-- contract_rate, contract_end_date, agency_name
 
-- After Generalization: Attribute assignment
 
-- EMPLOYEE (supertype)
CREATE TABLE employee (
    emp_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    employee_type VARCHAR(20) NOT NULL CHECK (employee_type IN ('FULL_TIME', 'PART_TIME', 'CONTRACTOR')),
    
    -- Common attributes (inherited by all)
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL UNIQUE,
    start_date DATE NOT NULL,        -- unified from hire_date/start_date
    department VARCHAR(100),
    
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
 
-- FULL_TIME_EMPLOYEE (subtype) - specific attributes only
CREATE TABLE full_time_employee (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    annual_salary DECIMAL(12,2) NOT NULL,
    bonus_percent DECIMAL(5,2) DEFAULT 0,
    vacation_days INT DEFAULT 15,
    stock_options INT DEFAULT 0
);
 
-- PART_TIME_EMPLOYEE (subtype) - specific attributes only
CREATE TABLE part_time_employee (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    hourly_rate DECIMAL(8,2) NOT NULL,
    max_hours_per_week INT DEFAULT 20
);
 
-- CONTRACTOR (subtype) - specific attributes only  
CREATE TABLE contractor (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    contract_rate DECIMAL(10,2) NOT NULL,
    contract_end_date DATE,
    agency_name VARCHAR(200)
);

Naming Unification

Notice how 'hire_date' (employees) and 'start_date' (contractors) were unified to 'start_date' in the supertype. This is a common attribute with different names that was reconciled during common attribute analysis.

Relationship Restructuring

Generalization affects not only attributes but also relationships. Common relationships must be migrated to the supertype, while subtype-specific relationships remain with their subtypes.

Relationship Migration Principles:

Principle 1: Move Common Relationships to Supertype

If all subtypes participate in the same relationship to another entity, that relationship should connect to the supertype:

Before:
  FULL_TIME ──(works_in)──→ DEPARTMENT
  PART_TIME ──(works_in)──→ DEPARTMENT
  CONTRACTOR ──(works_in)──→ DEPARTMENT

After:
  EMPLOYEE ──(works_in)──→ DEPARTMENT

Principle 2: Keep Subtype-Specific Relationships Local

Relationships that only apply to certain subtypes remain with those subtypes:

  CONTRACTOR ──(contracted_through)──→ AGENCY
  -- Only contractors have agencies; keep at subtype level

Principle 3: Foreign Key References Should Target Supertype

Other entities that previously referenced multiple subtype tables now reference the single supertype:

-- Before: Three FKs (or UNION queries)
CREATE TABLE project_assignment (
    assignment_id UUID PRIMARY KEY,
    worker_type VARCHAR(20),
    full_time_emp_id UUID REFERENCES full_time_employee,
    part_time_emp_id UUID REFERENCES part_time_employee,
    contractor_id UUID REFERENCES contractor,
    ...
);

-- After: Single FK to supertype
CREATE TABLE project_assignment (
    assignment_id UUID PRIMARY KEY,
    employee_id UUID NOT NULL REFERENCES employee(emp_id),  -- any employee type
    ...
);

Converting Mermaid diagram...

Relationship Semantics May Differ

Verify that 'same-named' relationships truly mean the same thing. FULL_TIME 'reports_to' MANAGER and CONTRACTOR 'reports_to' MANAGER might have different implications: employee has ongoing reporting; contractor may have project-based oversight. If semantics differ significantly, keep as separate relationships.

Supertype Constraint Definition

Constraints on the supertype establish baseline rules that apply to all subtypes. Additional constraints can be added at the subtype level for more specific requirements.

Supertype Constraint Categories

•Discriminator Validity — Ensure the type discriminator contains only valid values: CHECK (employee_type IN ('FULL_TIME', 'PART_TIME', 'CONTRACTOR'))
•Universal NOT NULLs — Attributes required in ALL subtypes should be NOT NULL in supertype: name VARCHAR(100) NOT NULL
•Global Uniqueness — Attributes unique across the entire hierarchy: email VARCHAR(255) UNIQUE (no duplicate emails regardless of employee type)
•Domain Constraints — Value range checks applicable to all: CHECK (start_date <= CURRENT_DATE) (can't start in future)
•Referential Integrity — Foreign keys valid for all subtypes: department_id REFERENCES department(id) (all employees belong to departments)

supertype-constraints.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
-- Comprehensive supertype constraint definition
 
CREATE TABLE employee (
    emp_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    
    -- Discriminator with valid values
    employee_type VARCHAR(20) NOT NULL 
        CHECK (employee_type IN ('FULL_TIME', 'PART_TIME', 'CONTRACTOR')),
    
    -- Universal requirements (all employees have these)
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL,
    start_date DATE NOT NULL,
    
    -- Optional for some subtypes (nullable in supertype)
    department_id UUID,
    manager_id UUID,
    
    -- Global uniqueness
    CONSTRAINT unique_employee_email UNIQUE (email),
    
    -- Domain constraints applying to all
    CONSTRAINT valid_start_date CHECK (start_date <= CURRENT_DATE),
    CONSTRAINT valid_email_format CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'),
    
    -- Referential integrity (self-referential)
    CONSTRAINT fk_manager FOREIGN KEY (manager_id) 
        REFERENCES employee(emp_id) ON DELETE SET NULL,
    
    -- Referential integrity (to other entities)
    CONSTRAINT fk_department FOREIGN KEY (department_id)
        REFERENCES department(dept_id) ON DELETE SET NULL,
    
    -- Audit columns
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
 
-- Subtype-specific constraint (only full-time must have department)
CREATE TABLE full_time_employee (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    annual_salary DECIMAL(12,2) NOT NULL,
    -- Subtype-specific: full-time must have department
    -- This is enforced via trigger or application since department is in supertype
    CONSTRAINT valid_salary CHECK (annual_salary > 0)
);
 
-- Create trigger to enforce full-time department requirement
CREATE OR REPLACE FUNCTION enforce_fulltime_department()
RETURNS TRIGGER AS $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM employee 
        WHERE emp_id = NEW.emp_id AND department_id IS NOT NULL
    ) THEN
        RAISE EXCEPTION 'Full-time employees must have a department';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
 
CREATE TRIGGER trg_fulltime_department
BEFORE INSERT OR UPDATE ON full_time_employee
FOR EACH ROW EXECUTE FUNCTION enforce_fulltime_department();

Cross-Table Constraints

When subtype constraints reference supertype attributes (like 'full-time must have department'), they often require triggers or application-level enforcement. Pure DDL constraints can only reference columns within the same table.

Schema Integration

Creating the supertype in isolation is insufficient; it must be properly integrated into the existing database schema. This involves updating references, creating appropriate views, and establishing migration paths.

Integration Tasks

•Update Foreign Key References — All tables that referenced individual subtypes should now reference the supertype. Create migration scripts to update FK constraints.
•Create Compatibility Views — For backward compatibility, create views that replicate the original subtype tables by joining supertype and subtype-specific tables.
•Update Application Code — Applications must be updated to work with the new hierarchy. This may involve query changes, ORM mapping updates, and API modifications.
•Migrate Existing Data — Transform data from old separate tables into the new hierarchical structure. Assign surrogate keys if needed, set discriminator values.
•Update Indexes — Create appropriate indexes on the supertype for common query patterns. Consider covering indexes for frequent subtype-filtered queries.
•Revise Security Policies — Row-level security, grants, and access controls may need updating to accommodate the new structure.

schema-integration-migration.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Schema Integration Migration Script
 
-- Step 1: Create the new supertype structure (shown previously)
-- ... employee, full_time_employee, part_time_employee, contractor ...
 
-- Step 2: Migrate data from legacy tables
INSERT INTO employee (emp_id, employee_type, first_name, last_name, email, start_date, department_id)
SELECT 
    gen_random_uuid(),
    'FULL_TIME',
    first_name,
    last_name,
    email,
    hire_date,
    department_id
FROM legacy_full_time_employee;
 
-- Similarly for other subtypes...
 
-- Step 3: Populate subtype tables with specific attributes
INSERT INTO full_time_employee (emp_id, annual_salary, bonus_percent, vacation_days)
SELECT 
    e.emp_id,
    l.annual_salary,
    l.bonus_percent,
    l.vacation_days
FROM employee e
JOIN legacy_full_time_employee l ON e.email = l.email
WHERE e.employee_type = 'FULL_TIME';
 
-- Step 4: Create backward-compatible views
CREATE VIEW v_full_time_employee AS
SELECT 
    e.emp_id,
    e.first_name,
    e.last_name,
    e.email,
    e.start_date AS hire_date,  -- original name
    e.department_id,
    ft.annual_salary,
    ft.bonus_percent,
    ft.vacation_days,
    ft.stock_options
FROM employee e
JOIN full_time_employee ft ON e.emp_id = ft.emp_id
WHERE e.employee_type = 'FULL_TIME';
 
-- Step 5: Update foreign keys in dependent tables
-- Example: project_assignment previously had worker_type + multiple FKs
ALTER TABLE project_assignment 
    DROP COLUMN worker_type,
    DROP COLUMN full_time_emp_id,
    DROP COLUMN part_time_emp_id,
    DROP COLUMN contractor_id;
 
ALTER TABLE project_assignment
    ADD COLUMN employee_id UUID REFERENCES employee(emp_id);
 
-- Migrate assignment data...
 
-- Step 6: Create useful indexes
CREATE INDEX idx_employee_type ON employee(employee_type);
CREATE INDEX idx_employee_department ON employee(department_id); 
CREATE INDEX idx_employee_email_lower ON employee(LOWER(email));

Summary: Supertype Creation

We've covered the complete process of supertype creation—from naming through schema integration. Let's consolidate the essential practices:

Key Takeaways

•Name supertypes using natural domain concepts — The name should be a term domain experts use naturally. If you struggle to name it, the generalization may not be valid.
•Design primary keys to handle legacy data — Address key collisions between former subtypes. Options include surrogate keys, composite keys, or prefixed keys.
•Include a type discriminator — The discriminator identifies which subtype each instance belongs to. Choose a format that balances readability with efficiency.
•Assign attributes correctly — Common attributes go to supertype; specific attributes stay in subtypes. Remove duplicates from subtypes after migration.
•Restructure relationships appropriately — Common relationships connect to supertype; subtype-specific relationships remain local. Other entities should reference the supertype.
•Define constraints at appropriate levels — Universal constraints in supertype; specific constraints in subtypes or via triggers.
•Integrate carefully with existing schema — Update foreign keys, create compatibility views, migrate data, and update applications systematically.

What's Next:

Having mastered supertype creation, our final topic examines the comparison with specialization—understanding how generalization and specialization relate as inverse operations, when to use each approach, and how they converge to produce the same hierarchical structures from different starting points.

Page Complete

You now have comprehensive knowledge of supertype creation. You can name supertypes appropriately, design identity mechanisms, structure attributes and relationships correctly, define appropriate constraints, and integrate the supertype into an existing schema. Next, we'll compare generalization with its inverse operation: specialization.

4 / 5

Loading learning content...

Database Management SystemsGeneralization

Generalization in EER Modeling

LevelIntermediate

Duration75 mins

TopicGeneralization

4 / 5

Supertype Creation

Giving Form to Abstraction

What You Will Learn

Naming the Supertype

Naming Principles:

Principle 1: Use a Natural Domain Concept

The supertype name should be a term that domain experts naturally use when speaking generally about all subtypes:

✅ Good: EMPLOYEE (for hourly, salaried, contract workers) ✅ Good: ACCOUNT (for checking, savings, money market) ✅ Good: VEHICLE (for car, truck, motorcycle)

❌ Bad: WORKER_BASE_CLASS ❌ Bad: ACCOUNT_SUPERTYPE ❌ Bad: THING_WITH_WHEELS

Principle 2: Prefer Singular Nouns

Entity names in database design conventionally use singular form:

✅ Good: PERSON, ACCOUNT, PRODUCT ❌ Bad: PERSONS, ACCOUNTS, PRODUCTS

Principle 3: Avoid Technical Jargon

The name should make sense to business users, not just database designers:

✅ Good: PARTY (standard term for customer/supplier/partner) ❌ Bad: BUSINESS_ENTITY_ABSTRACTION

Principle 4: Consider Scope and Future Growth

Choose a name broad enough to accommodate future subtypes without renaming:

✅ Good: PAYMENT_METHOD (cards, bank transfers, digital wallets, future methods) ❌ Bad: CARD_OR_BANK (too specific, excludes future payment types)

Supertype Naming Examples
Subtypes	Weak Names	Strong Name	Rationale
CHECKING, SAVINGS, CD	ACCOUNT_TYPES, BANK_STUFF	ACCOUNT	Natural banking term
PHYSICIAN, NURSE, TECHNICIAN	MEDICAL_WORKERS, CLINICAL_STAFF	HEALTHCARE_PROVIDER	Industry-standard term
CUSTOMER, SUPPLIER, PARTNER	BUSINESS_ENTITIES, EXTERNALS	PARTY	Standard business modeling term
CAR, TRUCK, MOTORCYCLE, BUS	WHEELED_THINGS, AUTOS	VEHICLE	Universal transportation term
LAPTOP, DESKTOP, SERVER	MACHINES, COMPUTING	COMPUTER	Clear, natural term

The Dictionary Test

Designing the Supertype Primary Key

Key Design Options:

Option 1: Unified Surrogate Key

Create a new synthetic key for the supertype that replaces all subtype keys:

CREATE TABLE party (
    party_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    party_type VARCHAR(20) NOT NULL,  -- discriminator
    name VARCHAR(200),
    email VARCHAR(255),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Advantages:

Simple, uniform key across hierarchy
No collisions between former subtype keys
Works with any physical implementation strategy

Disadvantages:

Requires migrating existing data with new keys
External systems must update references
Loses meaningful business identifiers

Option 2: Inherited Business Key

Use an existing business key that's consistent across subtypes:

-- All subtypes already used consistent ID format
CREATE TABLE person (
    person_id VARCHAR(10) PRIMARY KEY,  -- format: 'P-1234567'
    person_type VARCHAR(20) NOT NULL,
    ...
);

Advantages:

Preserves meaningful identifiers
Minimal migration required
Familiar to existing users and systems

Disadvantages:

Only works if subtypes had compatible key schemes
May require reformatting inconsistent keys
Business key changes ripple through hierarchy

Option 3: Composite Key with Type Discriminator

Combine original keys with a type indicator to ensure uniqueness:

CREATE TABLE account (
    account_type CHAR(1) NOT NULL,     -- 'C'=checking, 'S'=savings
    account_number VARCHAR(20) NOT NULL,
    ...
    PRIMARY KEY (account_type, account_number)
);

Advantages:

Preserves original identifiers
Handles key collisions between subtypes
Clear subtype identification in key itself

Disadvantages:

Composite keys are more complex to reference
Joins require multiple columns
Type column redundantly included in key and as discriminator

Option 4: Prefixed/Transformed Key

Transform original keys to include type prefix:

-- Original: CUSTOMER.id=12345, SUPPLIER.id=12345 (collision!)
-- Transformed:
INSERT INTO party (party_id, ...) VALUES ('CUS-12345', ...);  -- customer
INSERT INTO party (party_id, ...) VALUES ('SUP-12345', ...);  -- supplier

Advantages:

Single-column key
Visual type identification
Preserves original ID info

Disadvantages:

Key format is denormalized
Parsing required to extract components
Existing references need transformation

Key Collision Risk

Key Design Decision Factors

•Existing Key Compatibility — Can subtype keys coexist without collision? If yes, consider inherited key. If no, transformation is required.
•External References — How many external systems reference current IDs? Many references favor preserving recognizable ID formats.
•Query Patterns — Do queries often need subtype information? Composite or prefixed keys provide that inline.
•Future Subtypes — Will new subtypes be added? Unified surrogate keys accommodate them most easily.
•Physical Implementation — Single-table inheritance works with any key; class-table requires consistent FK structure.

Type Discriminator Design

Discriminator Design Options:

Option 1: Single Character Code

account_type CHAR(1) CHECK (account_type IN ('C', 'S', 'M', 'D'))
-- C=Checking, S=Savings, M=Money Market, D=CD

Pros: Minimal storage, fast comparison Cons: Cryptic, requires documentation, limited to 26/36 values

Option 2: Short Descriptive Code

account_type VARCHAR(10) CHECK (account_type IN ('CHECKING', 'SAVINGS', 'MMARKET', 'CD'))

Pros: Readable, self-documenting Cons: More storage, potential for typos

Option 3: Numeric Code

account_type_id SMALLINT REFERENCES account_type_lookup(type_id)

Pros: Efficient storage, enforced via FK Cons: Requires lookup for meaning, join for labels

Option 4: Enum Type (if supported)

CREATE TYPE account_type AS ENUM ('CHECKING', 'SAVINGS', 'MONEY_MARKET', 'CD');
account_type account_type NOT NULL

Pros: Type-safe, readable, validated Cons: Database-specific, harder to modify

Discriminator Strategies Comparison
Strategy	Storage	Readability	Extensibility	Query Pattern
CHAR(1) code	1 byte	Poor	Limited (26/36)	WHERE type='C'
VARCHAR code	Variable	Good	Flexible	WHERE type='CHECKING'
Numeric FK	2-4 bytes	Via join	Easy (insert row)	WHERE type_id=1
ENUM	1-2 bytes	Good	Schema change	WHERE type='CHECKING'
Boolean flags	N booleans	Good	Schema change	WHERE is_checking=TRUE

Multiple Discriminators (for Overlapping Hierarchies):

When subtypes can overlap (an entity can belong to multiple subtypes simultaneously), multiple discriminators or flag columns are needed:

CREATE TABLE university_person (
    person_id UUID PRIMARY KEY,
    -- Multiple roles possible
    is_student BOOLEAN DEFAULT FALSE,
    is_faculty BOOLEAN DEFAULT FALSE,
    is_staff BOOLEAN DEFAULT FALSE,
    is_researcher BOOLEAN DEFAULT FALSE,
    -- At least one must be true
    CHECK (is_student OR is_faculty OR is_staff OR is_researcher)
);

This accommodates: a faculty member who is also a researcher, a student who is also staff (work-study), etc.

Discriminator Constraints:

The discriminator should have appropriate constraints:

-- For disjoint subtypes (exclusive)
account_type NOT NULL CHECK (account_type IN ('CHECKING', 'SAVINGS', 'CD'))

-- For overlapping subtypes
CHECK (is_student OR is_faculty OR is_staff)  -- total participation
-- no check needed for which combinations allowed (overlapping)

Discriminator Location

Final Attribute Assignment

With the supertype named, keyed, and discriminated, we now formally assign attributes to the supertype and subtypes. This step crystallizes the decisions made during common attribute analysis.

Attribute Assignment Rules

•Supertype Receives Common Attributes — All attributes shared by all (or nearly all) subtypes move to the supertype. These are inherited by every subtype.
•Subtypes Retain Specific Attributes — Attributes unique to a subtype remain in that subtype's definition. These are the 'local' attributes.
•Remove Duplicates from Subtypes — After moving common attributes to supertype, remove them from subtype definitions. They're now inherited.
•Handle Edge Cases with Nullability — Attributes in most-but-not-all subtypes can move to supertype as nullable; those subtypes use NULL.
•Preserve Attribute Metadata — Constraints, defaults, and documentation should accompany attributes to their new locations.

attribute-assignment-example.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
-- Before Generalization: Separate entity definitions
 
-- FULL_TIME_EMPLOYEE
-- emp_id, first_name, last_name, email, hire_date, department,
-- annual_salary, bonus_percent, vacation_days, stock_options
 
-- PART_TIME_EMPLOYEE  
-- emp_id, first_name, last_name, email, hire_date, department,
-- hourly_rate, max_hours_per_week
 
-- CONTRACTOR
-- emp_id, first_name, last_name, email, start_date, department,
-- contract_rate, contract_end_date, agency_name
 
-- After Generalization: Attribute assignment
 
-- EMPLOYEE (supertype)
CREATE TABLE employee (
    emp_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    employee_type VARCHAR(20) NOT NULL CHECK (employee_type IN ('FULL_TIME', 'PART_TIME', 'CONTRACTOR')),
    
    -- Common attributes (inherited by all)
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL UNIQUE,
    start_date DATE NOT NULL,        -- unified from hire_date/start_date
    department VARCHAR(100),
    
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
 
-- FULL_TIME_EMPLOYEE (subtype) - specific attributes only
CREATE TABLE full_time_employee (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    annual_salary DECIMAL(12,2) NOT NULL,
    bonus_percent DECIMAL(5,2) DEFAULT 0,
    vacation_days INT DEFAULT 15,
    stock_options INT DEFAULT 0
);
 
-- PART_TIME_EMPLOYEE (subtype) - specific attributes only
CREATE TABLE part_time_employee (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    hourly_rate DECIMAL(8,2) NOT NULL,
    max_hours_per_week INT DEFAULT 20
);
 
-- CONTRACTOR (subtype) - specific attributes only  
CREATE TABLE contractor (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    contract_rate DECIMAL(10,2) NOT NULL,
    contract_end_date DATE,
    agency_name VARCHAR(200)
);

Naming Unification

Relationship Restructuring

Generalization affects not only attributes but also relationships. Common relationships must be migrated to the supertype, while subtype-specific relationships remain with their subtypes.

Relationship Migration Principles:

Principle 1: Move Common Relationships to Supertype

If all subtypes participate in the same relationship to another entity, that relationship should connect to the supertype:

Before:
  FULL_TIME ──(works_in)──→ DEPARTMENT
  PART_TIME ──(works_in)──→ DEPARTMENT
  CONTRACTOR ──(works_in)──→ DEPARTMENT

After:
  EMPLOYEE ──(works_in)──→ DEPARTMENT

Principle 2: Keep Subtype-Specific Relationships Local

Relationships that only apply to certain subtypes remain with those subtypes:

  CONTRACTOR ──(contracted_through)──→ AGENCY
  -- Only contractors have agencies; keep at subtype level

Principle 3: Foreign Key References Should Target Supertype

Other entities that previously referenced multiple subtype tables now reference the single supertype:

-- Before: Three FKs (or UNION queries)
CREATE TABLE project_assignment (
    assignment_id UUID PRIMARY KEY,
    worker_type VARCHAR(20),
    full_time_emp_id UUID REFERENCES full_time_employee,
    part_time_emp_id UUID REFERENCES part_time_employee,
    contractor_id UUID REFERENCES contractor,
    ...
);

-- After: Single FK to supertype
CREATE TABLE project_assignment (
    assignment_id UUID PRIMARY KEY,
    employee_id UUID NOT NULL REFERENCES employee(emp_id),  -- any employee type
    ...
);

Converting Mermaid diagram...

Relationship Semantics May Differ

Supertype Constraint Definition

Constraints on the supertype establish baseline rules that apply to all subtypes. Additional constraints can be added at the subtype level for more specific requirements.

Supertype Constraint Categories

•Discriminator Validity — Ensure the type discriminator contains only valid values: CHECK (employee_type IN ('FULL_TIME', 'PART_TIME', 'CONTRACTOR'))
•Universal NOT NULLs — Attributes required in ALL subtypes should be NOT NULL in supertype: name VARCHAR(100) NOT NULL
•Global Uniqueness — Attributes unique across the entire hierarchy: email VARCHAR(255) UNIQUE (no duplicate emails regardless of employee type)
•Domain Constraints — Value range checks applicable to all: CHECK (start_date <= CURRENT_DATE) (can't start in future)
•Referential Integrity — Foreign keys valid for all subtypes: department_id REFERENCES department(id) (all employees belong to departments)

supertype-constraints.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
-- Comprehensive supertype constraint definition
 
CREATE TABLE employee (
    emp_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    
    -- Discriminator with valid values
    employee_type VARCHAR(20) NOT NULL 
        CHECK (employee_type IN ('FULL_TIME', 'PART_TIME', 'CONTRACTOR')),
    
    -- Universal requirements (all employees have these)
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL,
    start_date DATE NOT NULL,
    
    -- Optional for some subtypes (nullable in supertype)
    department_id UUID,
    manager_id UUID,
    
    -- Global uniqueness
    CONSTRAINT unique_employee_email UNIQUE (email),
    
    -- Domain constraints applying to all
    CONSTRAINT valid_start_date CHECK (start_date <= CURRENT_DATE),
    CONSTRAINT valid_email_format CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'),
    
    -- Referential integrity (self-referential)
    CONSTRAINT fk_manager FOREIGN KEY (manager_id) 
        REFERENCES employee(emp_id) ON DELETE SET NULL,
    
    -- Referential integrity (to other entities)
    CONSTRAINT fk_department FOREIGN KEY (department_id)
        REFERENCES department(dept_id) ON DELETE SET NULL,
    
    -- Audit columns
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
 
-- Subtype-specific constraint (only full-time must have department)
CREATE TABLE full_time_employee (
    emp_id UUID PRIMARY KEY REFERENCES employee(emp_id) ON DELETE CASCADE,
    annual_salary DECIMAL(12,2) NOT NULL,
    -- Subtype-specific: full-time must have department
    -- This is enforced via trigger or application since department is in supertype
    CONSTRAINT valid_salary CHECK (annual_salary > 0)
);
 
-- Create trigger to enforce full-time department requirement
CREATE OR REPLACE FUNCTION enforce_fulltime_department()
RETURNS TRIGGER AS $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM employee 
        WHERE emp_id = NEW.emp_id AND department_id IS NOT NULL
    ) THEN
        RAISE EXCEPTION 'Full-time employees must have a department';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
 
CREATE TRIGGER trg_fulltime_department
BEFORE INSERT OR UPDATE ON full_time_employee
FOR EACH ROW EXECUTE FUNCTION enforce_fulltime_department();

Cross-Table Constraints

Schema Integration

Integration Tasks

•Update Foreign Key References — All tables that referenced individual subtypes should now reference the supertype. Create migration scripts to update FK constraints.
•Create Compatibility Views — For backward compatibility, create views that replicate the original subtype tables by joining supertype and subtype-specific tables.
•Update Application Code — Applications must be updated to work with the new hierarchy. This may involve query changes, ORM mapping updates, and API modifications.
•Migrate Existing Data — Transform data from old separate tables into the new hierarchical structure. Assign surrogate keys if needed, set discriminator values.
•Update Indexes — Create appropriate indexes on the supertype for common query patterns. Consider covering indexes for frequent subtype-filtered queries.
•Revise Security Policies — Row-level security, grants, and access controls may need updating to accommodate the new structure.

schema-integration-migration.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Schema Integration Migration Script
 
-- Step 1: Create the new supertype structure (shown previously)
-- ... employee, full_time_employee, part_time_employee, contractor ...
 
-- Step 2: Migrate data from legacy tables
INSERT INTO employee (emp_id, employee_type, first_name, last_name, email, start_date, department_id)
SELECT 
    gen_random_uuid(),
    'FULL_TIME',
    first_name,
    last_name,
    email,
    hire_date,
    department_id
FROM legacy_full_time_employee;
 
-- Similarly for other subtypes...
 
-- Step 3: Populate subtype tables with specific attributes
INSERT INTO full_time_employee (emp_id, annual_salary, bonus_percent, vacation_days)
SELECT 
    e.emp_id,
    l.annual_salary,
    l.bonus_percent,
    l.vacation_days
FROM employee e
JOIN legacy_full_time_employee l ON e.email = l.email
WHERE e.employee_type = 'FULL_TIME';
 
-- Step 4: Create backward-compatible views
CREATE VIEW v_full_time_employee AS
SELECT 
    e.emp_id,
    e.first_name,
    e.last_name,
    e.email,
    e.start_date AS hire_date,  -- original name
    e.department_id,
    ft.annual_salary,
    ft.bonus_percent,
    ft.vacation_days,
    ft.stock_options
FROM employee e
JOIN full_time_employee ft ON e.emp_id = ft.emp_id
WHERE e.employee_type = 'FULL_TIME';
 
-- Step 5: Update foreign keys in dependent tables
-- Example: project_assignment previously had worker_type + multiple FKs
ALTER TABLE project_assignment 
    DROP COLUMN worker_type,
    DROP COLUMN full_time_emp_id,
    DROP COLUMN part_time_emp_id,
    DROP COLUMN contractor_id;
 
ALTER TABLE project_assignment
    ADD COLUMN employee_id UUID REFERENCES employee(emp_id);
 
-- Migrate assignment data...
 
-- Step 6: Create useful indexes
CREATE INDEX idx_employee_type ON employee(employee_type);
CREATE INDEX idx_employee_department ON employee(department_id); 
CREATE INDEX idx_employee_email_lower ON employee(LOWER(email));

Summary: Supertype Creation

We've covered the complete process of supertype creation—from naming through schema integration. Let's consolidate the essential practices:

Key Takeaways

•Name supertypes using natural domain concepts — The name should be a term domain experts use naturally. If you struggle to name it, the generalization may not be valid.
•Design primary keys to handle legacy data — Address key collisions between former subtypes. Options include surrogate keys, composite keys, or prefixed keys.
•Include a type discriminator — The discriminator identifies which subtype each instance belongs to. Choose a format that balances readability with efficiency.
•Assign attributes correctly — Common attributes go to supertype; specific attributes stay in subtypes. Remove duplicates from subtypes after migration.
•Restructure relationships appropriately — Common relationships connect to supertype; subtype-specific relationships remain local. Other entities should reference the supertype.
•Define constraints at appropriate levels — Universal constraints in supertype; specific constraints in subtypes or via triggers.
•Integrate carefully with existing schema — Update foreign keys, create compatibility views, migrate data, and update applications systematically.

What's Next:

Page Complete

4 / 5