Database Management SystemsSpecialization Mapping

Specialization Mapping: ER to Relational

LevelIntermediate

Duration60 mins

TopicSpecialization Mapping

2 / 5

Single Table Approach: Unified Hierarchy Storage

The Single Table Inheritance Pattern

Single Table Inheritance (STI), also known as Table-Per-Hierarchy (TPH), is the most radical approach to mapping specialization hierarchies. The core idea is deceptively simple: collapse the entire hierarchy into one table.

This means that a supertype and all its subtypes—no matter how many—share a single relational table. All attributes from every entity type in the hierarchy become columns in this unified table. A special discriminator column identifies which subtype each row represents.

The Central Trade-off:

STI trades data integrity and storage efficiency for query simplicity and performance. It's the approach where you pay for simplicity with NULL values and constraint complexity.

Let's explore when this trade-off makes sense, and how to implement STI correctly when you choose it.

What You Will Learn

By the end of this page, you will master STI implementation including discriminator patterns, NULL handling strategies, CHECK constraint design, schema evolution approaches, and performance optimization techniques. You'll understand exactly when STI is the right choice.

Core Implementation

The fundamental STI implementation follows a predictable pattern. Let's build a complete example using our university Person hierarchy:

Step 1: Combine All Attributes

Start by listing every attribute from every entity type in the hierarchy:

From Person: person_id, name, email, date_of_birth, address
From Student: student_number, enrollment_date, gpa, major, credits_earned
From Faculty: faculty_id, rank, tenure_status, specialty, office_number
From Staff: staff_id, position, department, hire_date, salary_grade

Step 2: Add Discriminator Column

The discriminator column is the key to STI. It tells the system (and queries) which subtype each row represents.

sti_complete_implementation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
-- Single Table Inheritance: Complete Implementation
CREATE TABLE person (
    -- Primary Key (supertype identifier)
    person_id       SERIAL PRIMARY KEY,
    
    -- Discriminator Column
    -- This is the critical element that identifies the subtype
    person_type     VARCHAR(20) NOT NULL,
    
    -- Supertype Attributes (common to all)
    -- These are NEVER NULL (shared by all entities)
    name            VARCHAR(100) NOT NULL,
    email           VARCHAR(100) NOT NULL UNIQUE,
    date_of_birth   DATE NOT NULL,
    address         TEXT,
    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    -- ═══════════════════════════════════════════════════════
    -- STUDENT-SPECIFIC ATTRIBUTES
    -- NULL for non-students; populated only when person_type = 'STUDENT'
    -- ═══════════════════════════════════════════════════════
    student_number  VARCHAR(20) UNIQUE,      -- NULL if not student
    enrollment_date DATE,                     -- NULL if not student
    gpa             DECIMAL(3,2),             -- NULL if not student
    major           VARCHAR(50),              -- NULL if not student
    credits_earned  INTEGER DEFAULT 0,        -- NULL if not student
    
    -- ═══════════════════════════════════════════════════════
    -- FACULTY-SPECIFIC ATTRIBUTES
    -- NULL for non-faculty; populated only when person_type = 'FACULTY'
    -- ═══════════════════════════════════════════════════════
    faculty_id      VARCHAR(20) UNIQUE,       -- NULL if not faculty
    rank            VARCHAR(30),              -- NULL if not faculty
    tenure_status   VARCHAR(20),              -- NULL if not faculty
    specialty       VARCHAR(100),             -- NULL if not faculty
    office_number   VARCHAR(15),              -- NULL if not faculty
    
    -- ═══════════════════════════════════════════════════════
    -- STAFF-SPECIFIC ATTRIBUTES  
    -- NULL for non-staff; populated only when person_type = 'STAFF'
    -- ═══════════════════════════════════════════════════════
    staff_id        VARCHAR(20) UNIQUE,       -- NULL if not staff
    position        VARCHAR(50),              -- NULL if not staff
    department      VARCHAR(50),              -- NULL if not staff
    hire_date       DATE,                     -- NULL if not staff
    salary_grade    VARCHAR(10),              -- NULL if not staff
    
    -- ═══════════════════════════════════════════════════════
    -- DISCRIMINATOR CONSTRAINT
    -- Ensures only valid subtype values
    -- ═══════════════════════════════════════════════════════
    CONSTRAINT chk_valid_person_type 
        CHECK (person_type IN ('STUDENT', 'FACULTY', 'STAFF'))
);
 
-- Create index on discriminator for fast subtype queries
CREATE INDEX idx_person_type ON person(person_type);
 
-- Create composite indexes for common query patterns
CREATE INDEX idx_student_major ON person(major) WHERE person_type = 'STUDENT';
CREATE INDEX idx_faculty_rank ON person(rank) WHERE person_type = 'FACULTY';
CREATE INDEX idx_staff_department ON person(department) WHERE person_type = 'STAFF';

Column Organization

Organize columns by subtype in the schema definition for maintainability. Comment clearly which attributes belong to which subtype. Some DBAs use naming conventions like 'STUDENT__gpa' to make subtype ownership explicit, though this adds verbosity.

Discriminator Patterns and Strategies

The discriminator column is the linchpin of STI. Its design significantly impacts query performance, data integrity, and application integration. Let's examine the various discriminator strategies:

Pattern 1: String Discriminator (Most Common)

Use a VARCHAR column with human-readable type names. This is the most flexible and debuggable approach.

discriminator_string.sql
-- String discriminator: Readable and flexible
person_type VARCHAR(20) NOT NULL
    CHECK (person_type IN ('STUDENT', 'FACULTY', 'STAFF', 'ALUMNI'))
 
-- Pros:
--   ✓ Human-readable in query results
--   ✓ Easy to add new types (just update CHECK)
--   ✓ Self-documenting in data exports
--   ✓ Works with all databases
 
-- Cons:
--   ✗ Larger storage per row (vs. integer)
--   ✗ Case sensitivity issues possible
--   ✗ Typo risk in application code

Pattern 2: Integer/Enum Discriminator

Use integers or database-native enums for compact storage and faster comparisons.

discriminator_integer.sql
-- Integer discriminator: Compact and fast
person_type SMALLINT NOT NULL
    CHECK (person_type IN (1, 2, 3, 4))
 
-- With lookup table for meaning:
CREATE TABLE person_type_lookup (
    type_id   SMALLINT PRIMARY KEY,
    type_name VARCHAR(20) NOT NULL UNIQUE
);
INSERT INTO person_type_lookup VALUES
    (1, 'STUDENT'), (2, 'FACULTY'), (3, 'STAFF'), (4, 'ALUMNI');
 
-- PostgreSQL ENUM: Best of both worlds
CREATE TYPE person_type_enum AS ENUM ('STUDENT', 'FACULTY', 'STAFF');
-- Then in table:
person_type person_type_enum NOT NULL
 
-- Pros:
--   ✓ Minimal storage (1-2 bytes vs. 10-20 for strings)
--   ✓ Faster comparison operations
--   ✓ Database-enforced valid values (for ENUM)
 
-- Cons:
--   ✗ Requires lookup to understand data
--   ✗ ENUM modification can be complex (ALTER TYPE)
--   ✗ Integer types not self-documenting

Pattern 3: Class Name Discriminator (ORM-Friendly)

Store the full class/entity name for direct ORM mapping. Common with Hibernate, Entity Framework, and similar ORMs.

discriminator_classname.sql
-- Class name discriminator: Direct ORM mapping
entity_type VARCHAR(255) NOT NULL
    -- Values like: 'com.university.model.Student'
    --              'com.university.model.Faculty'
 
-- Pros:
--   ✓ Seamless ORM integration
--   ✓ Supports inheritance hierarchies in code
--   ✓ Type resolution automatic
 
-- Cons:
--   ✗ Large storage overhead
--   ✗ Tied to application package structure
--   ✗ Refactoring class names breaks data

Discriminator Pattern Comparison
Pattern	Storage	Performance	Readability	Flexibility	ORM Support
String	Medium (10-20 bytes)	Good	★★★★★ Excellent	★★★★★ Excellent	★★★★☆ Good
Integer	Small (1-4 bytes)	Best	★★☆☆☆ Poor	★★★★☆ Good	★★★★☆ Good
Enum	Small (1-4 bytes)	Best	★★★★☆ Good	★★☆☆☆ Limited	★★★★★ Excellent
Class Name	Large (50-255 bytes)	Fair	★★★☆☆ Fair	★★★☆☆ Fair	★★★★★ Excellent

Recommendation

For most applications, use a string discriminator with an explicit CHECK constraint. It balances readability, flexibility, and performance. Use PostgreSQL ENUM when you need the storage efficiency and your type set is stable. Avoid class name discriminators unless your ORM specifically requires them.

NULL Handling Strategies

NULLs are the defining characteristic—and the primary weakness—of STI. In a properly designed STI table, every row has NULL values for all attributes belonging to other subtypes. This creates several challenges that must be addressed systematically.

The NULL Problem Illustrated:

For a Person table with 3 subtypes (Student, Faculty, Staff), each having 5 unique attributes:

A Student row has 10 NULL values (5 Faculty + 5 Staff attributes)
A Faculty row has 10 NULL values (5 Student + 5 Staff attributes)
A Staff row has 10 NULL values (5 Student + 5 Faculty attributes)

As you add subtypes or subtype-specific attributes, NULL density grows quadratically.

Strategy 1: Attribute-Level CHECK Constraints

Enforce that subtype-specific attributes are only populated for the correct subtype:

null_check_constraints.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Enforce correct NULL patterns with CHECK constraints
 
-- Students MUST have student_number; non-students MUST NOT
ALTER TABLE person ADD CONSTRAINT chk_student_attrs CHECK (
    (person_type = 'STUDENT' AND student_number IS NOT NULL 
                              AND enrollment_date IS NOT NULL)
    OR
    (person_type != 'STUDENT' AND student_number IS NULL 
                               AND enrollment_date IS NULL
                               AND gpa IS NULL
                               AND major IS NULL
                               AND credits_earned IS NULL)
);
 
-- Faculty validation
ALTER TABLE person ADD CONSTRAINT chk_faculty_attrs CHECK (
    (person_type = 'FACULTY' AND faculty_id IS NOT NULL 
                              AND rank IS NOT NULL)
    OR
    (person_type != 'FACULTY' AND faculty_id IS NULL 
                               AND rank IS NULL
                               AND tenure_status IS NULL
                               AND specialty IS NULL
                               AND office_number IS NULL)
);
 
-- Staff validation
ALTER TABLE person ADD CONSTRAINT chk_staff_attrs CHECK (
    (person_type = 'STAFF' AND staff_id IS NOT NULL 
                            AND position IS NOT NULL)
    OR
    (person_type != 'STAFF' AND staff_id IS NULL 
                             AND position IS NULL
                             AND department IS NULL
                             AND hire_date IS NULL
                             AND salary_grade IS NULL)
);

Strategy 2: Default Values for Required Fields

Some systems use sentinel values instead of NULLs to avoid NULL-related query complications:

sentinel_values.sql
-- Using sentinel values (USE WITH CAUTION)
-- Instead of NULL, use type-specific defaults
 
-- For numeric fields
gpa DECIMAL(3,2) DEFAULT -1.00,  -- -1 indicates "not applicable"
credits_earned INTEGER DEFAULT -1,
 
-- For string fields  
major VARCHAR(50) DEFAULT 'N/A',
department VARCHAR(50) DEFAULT 'N/A',
 
-- Querying becomes:
SELECT * FROM person 
WHERE person_type = 'STUDENT' AND gpa != -1.00;
 
-- WARNING: This approach has significant drawbacks:
--   ✗ Pollutes data with magic values
--   ✗ Aggregations require special handling
--   ✗ "N/A" might be a legitimate value someday
--   ✗ Different "not applicable" values per type confuse
 
-- RECOMMENDATION: Use actual NULLs with proper constraints

Strategy 3: View-Based Abstraction

Create views that present clean subtype-specific interfaces, hiding the NULL complexity:

subtype_views.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
-- Create clean views for each subtype
CREATE VIEW students AS
SELECT 
    person_id,
    name,
    email,
    date_of_birth,
    address,
    student_number,
    enrollment_date,
    gpa,
    major,
    credits_earned
FROM person
WHERE person_type = 'STUDENT';
 
CREATE VIEW faculty AS
SELECT 
    person_id,
    name,
    email,
    date_of_birth,
    address,
    faculty_id,
    rank,
    tenure_status,
    specialty,
    office_number
FROM person
WHERE person_type = 'FACULTY';
 
CREATE VIEW staff AS
SELECT 
    person_id,
    name,
    email,
    date_of_birth,
    address,
    staff_id,
    position,
    department,
    hire_date,
    salary_grade
FROM person
WHERE person_type = 'STAFF';
 
-- Now application code uses clean interfaces:
SELECT * FROM students WHERE gpa > 3.5;
SELECT * FROM faculty WHERE tenure_status = 'TENURED';
SELECT * FROM staff WHERE department = 'IT';
 
-- Views hide NULL columns completely - cleaner APIs

View-Based Best Practice

In production STI implementations, always create subtype-specific views. Applications should primarily interact through these views rather than the base table. This provides abstraction, simplifies queries, and allows schema evolution without breaking application code.

Advanced Constraint Enforcement

Beyond basic NULL validation, STI requires sophisticated constraint strategies to maintain data integrity. Let's examine advanced patterns:

Cross-Attribute Constraints:

Some business rules span multiple attributes within a subtype:

cross_attribute_constraints.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Business Rule: Students must have valid GPA (0.0-4.0) when enrolled
ALTER TABLE person ADD CONSTRAINT chk_student_gpa_range CHECK (
    person_type != 'STUDENT' 
    OR (gpa >= 0.0 AND gpa <= 4.0)
);
 
-- Business Rule: Tenured faculty must be Professor or Associate Professor
ALTER TABLE person ADD CONSTRAINT chk_tenure_rank CHECK (
    person_type != 'FACULTY'
    OR tenure_status != 'TENURED'
    OR rank IN ('PROFESSOR', 'ASSOCIATE_PROFESSOR')
);
 
-- Business Rule: Staff hire_date cannot be in the future
ALTER TABLE person ADD CONSTRAINT chk_staff_hire_date CHECK (
    person_type != 'STAFF'
    OR hire_date <= CURRENT_DATE
);
 
-- Business Rule: If student has credits >= 120, must have major declared
ALTER TABLE person ADD CONSTRAINT chk_student_major_credits CHECK (
    person_type != 'STUDENT'
    OR credits_earned < 120
    OR major IS NOT NULL
);

Unique Constraints Across Subtypes:

Handling uniqueness in STI requires special attention. Each subtype has its own "unique" identifier (student_number, faculty_id, staff_id), but the primary key (person_id) is shared.

unique_constraints.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Each subtype ID must be unique within its type
-- NULL values don't violate UNIQUE constraints, so this works:
ALTER TABLE person ADD CONSTRAINT uniq_student_number 
    UNIQUE (student_number);  -- NULLs excluded
    
ALTER TABLE person ADD CONSTRAINT uniq_faculty_id 
    UNIQUE (faculty_id);  -- NULLs excluded
    
ALTER TABLE person ADD CONSTRAINT uniq_staff_id 
    UNIQUE (staff_id);  -- NULLs excluded
 
-- For cross-hierarchy email uniqueness (already shared):
ALTER TABLE person ADD CONSTRAINT uniq_email UNIQUE (email);
 
-- Partial unique indexes for subtype-specific uniqueness rule:
-- Example: Student numbers unique only among active students
CREATE UNIQUE INDEX idx_active_student_unique 
    ON person (student_number) 
    WHERE person_type = 'STUDENT' AND credits_earned < 200;
 
-- Example: Only one department head per department
CREATE UNIQUE INDEX idx_one_dept_head
    ON person (department)
    WHERE person_type = 'STAFF' AND position = 'DEPARTMENT_HEAD';

Foreign Key Relationships:

Relationships involving subtypes require careful FK design:

foreign_key_relationships.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
-- Relationships that apply to ALL person types (to supertype)
CREATE TABLE emergency_contact (
    contact_id   SERIAL PRIMARY KEY,
    person_id    INTEGER NOT NULL REFERENCES person(person_id),
    contact_name VARCHAR(100) NOT NULL,
    phone        VARCHAR(20) NOT NULL
);
 
-- Relationships specific to STUDENTS (need validation)
CREATE TABLE enrollment (
    enrollment_id SERIAL PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    semester      VARCHAR(20) NOT NULL,
    grade         VARCHAR(2),
    
    -- Ensure only students can enroll
    -- Option 1: Trigger-based enforcement
    -- Option 2: Application-level enforcement (common but weaker)
);
 
-- Trigger to enforce student-only enrollment
CREATE OR REPLACE FUNCTION check_student_enrollment()
RETURNS TRIGGER AS $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM person 
        WHERE person_id = NEW.person_id 
        AND person_type = 'STUDENT'
    ) THEN
        RAISE EXCEPTION 'Only students can be enrolled in courses';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
 
CREATE TRIGGER trg_check_student_enrollment
BEFORE INSERT OR UPDATE ON enrollment
FOR EACH ROW EXECUTE FUNCTION check_student_enrollment();
 
-- Similarly for faculty-specific teaching assignments
CREATE TABLE course_assignment (
    assignment_id SERIAL PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    semester      VARCHAR(20) NOT NULL
);
 
CREATE TRIGGER trg_check_faculty_assignment
BEFORE INSERT OR UPDATE ON course_assignment
FOR EACH ROW EXECUTE FUNCTION check_faculty_assignment();

STI Foreign Key Limitation

STI cannot use database-level foreign keys to restrict relationships to specific subtypes. You cannot create a FK that only references 'Student' rows. This requires triggers or application logic—a significant integrity trade-off compared to TPT.

Query Patterns and Performance

STI excels at certain query patterns while struggling with others. Understanding these patterns is crucial for choosing STI appropriately.

Pattern 1: Hierarchy-Wide Queries (STI Strength)

Queries that need data from all subtypes are where STI truly shines:

hierarchy_queries.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Find all people with email ending in @university.edu
-- STI: Simple and fast - single table scan
SELECT person_id, name, email, person_type
FROM person
WHERE email LIKE '%@university.edu';
 
-- Count all people by type
-- STI: Efficient GROUP BY on single table
SELECT person_type, COUNT(*) as count
FROM person
GROUP BY person_type;
 
-- Find all people born in 1990s
SELECT name, date_of_birth, person_type
FROM person
WHERE date_of_birth BETWEEN '1990-01-01' AND '1999-12-31'
ORDER BY date_of_birth;
 
-- Search people by name (regardless of type)
SELECT person_id, name, person_type
FROM person
WHERE name ILIKE '%smith%';
 
-- These queries need NO JOINS - STI's primary advantage

Pattern 2: Subtype-Specific Queries

Queries targeting a single subtype work well with proper indexing:

subtype_queries.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Find students with GPA > 3.5
-- Uses partial index: idx_student_major (major) WHERE person_type = 'STUDENT'
SELECT person_id, name, gpa, major
FROM person
WHERE person_type = 'STUDENT' 
  AND gpa > 3.5
ORDER BY gpa DESC;
 
-- Find tenured faculty in Computer Science
SELECT name, rank, office_number
FROM person
WHERE person_type = 'FACULTY'
  AND tenure_status = 'TENURED'
  AND specialty LIKE '%Computer Science%';
 
-- Find staff in IT department
SELECT name, position, hire_date
FROM person
WHERE person_type = 'STAFF'
  AND department = 'IT';
 
-- IMPORTANT: Always filter by person_type first for optimizer hints
-- This enables partial index usage and reduces scan scope

Pattern 3: Cross-Subtype Aggregations

cross_subtype_aggregations.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Compare average metrics across types (where applicable)
SELECT 
    person_type,
    COUNT(*) as total_count,
    -- Type-specific aggregations require CASE
    AVG(CASE WHEN person_type = 'STUDENT' THEN gpa END) as avg_student_gpa,
    AVG(CASE 
        WHEN person_type = 'STUDENT' THEN credits_earned 
    END) as avg_student_credits,
    COUNT(CASE 
        WHEN person_type = 'FACULTY' AND tenure_status = 'TENURED' 
        THEN 1 
    END) as tenured_faculty_count
FROM person
GROUP BY person_type;
 
-- Find people added in last 30 days by type
SELECT 
    person_type,
    COUNT(*) as new_additions,
    MIN(created_at) as first_added,
    MAX(created_at) as last_added
FROM person
WHERE created_at > CURRENT_DATE - INTERVAL '30 days'
GROUP BY person_type;

Performance Optimization Strategies:

STI Performance Checklist

•Index the discriminator: CREATE INDEX idx_person_type ON person(person_type);
•Use partial indexes: Create indexes scoped to specific person_types for subtype-heavy queries
•Composite indexes: Include discriminator in composite indexes: (person_type, gpa) not just (gpa)
•Table statistics: Ensure ANALYZE runs frequently so optimizer knows type distribution
•Consider table partitioning: For very large tables, partition by person_type (explored in advanced scenarios)

Storage Analysis and Overhead

Understanding the storage implications of STI is crucial for capacity planning and cost analysis. Let's quantify the NULL overhead:

Storage Calculation Example:

Consider our Person table with:

6 shared attributes (~100 bytes)
5 Student attributes (~50 bytes when populated)
5 Faculty attributes (~50 bytes when populated)
5 Staff attributes (~50 bytes when populated)
1 discriminator (~15 bytes)

Per-Row Storage (theoretical):

Storage Analysis: STI vs TPT
Subtype	STI Row Size	Actual Data	NULL Overhead	TPT Equivalent
Student	~165 bytes*	~150 bytes	~15 bytes (NULLs)	~65 bytes (Student) + ~100 bytes (Person) = ~165 bytes
Faculty	~165 bytes*	~150 bytes	~15 bytes (NULLs)	~65 bytes (Faculty) + ~100 bytes (Person) = ~165 bytes
Staff	~165 bytes*	~150 bytes	~15 bytes (NULLs)	~65 bytes (Staff) + ~100 bytes (Person) = ~165 bytes

The NULL Storage Reality:

NULL storage varies by database:

PostgreSQL: NULL bitmap - 1 bit per nullable column (efficient for many NULLs)
MySQL/InnoDB: 1 byte per 8 nullable columns in NULL bitmap
SQL Server: 2-byte NULL bitmap + storage for variable-length NULL indicators

For our example with 15 nullable columns, NULL metadata costs approximately 2-4 bytes per row—much less than you might expect.

storage_analysis.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Analyze actual storage in PostgreSQL
SELECT 
    pg_size_pretty(pg_total_relation_size('person')) as total_size,
    pg_size_pretty(pg_relation_size('person')) as table_size,
    pg_size_pretty(pg_indexes_size('person')) as indexes_size,
    (SELECT COUNT(*) FROM person) as row_count,
    pg_size_pretty(
        pg_total_relation_size('person') / 
        NULLIF((SELECT COUNT(*) FROM person), 0)
    ) as avg_row_size;
 
-- Compare NULL density by type
SELECT 
    person_type,
    COUNT(*) as rows,
    -- Count NULL values per type
    SUM(CASE WHEN student_number IS NULL THEN 1 ELSE 0 END) as null_student,
    SUM(CASE WHEN faculty_id IS NULL THEN 1 ELSE 0 END) as null_faculty,
    SUM(CASE WHEN staff_id IS NULL THEN 1 ELSE 0 END) as null_staff
FROM person
GROUP BY person_type;

Storage Surprise

STI's storage overhead from NULLs is often less than expected. The real storage cost is the additional columns themselves (schema width), not the NULL values. For hierarchies with few subtype-specific attributes, STI storage can match or beat TPT due to eliminated key duplication.

When to Choose Single Table Inheritance

STI is the right choice under specific conditions. Let's crystallize the decision criteria:

Choose STI When

•Subtypes have few unique attributes (< 5 each)
•Most queries span the hierarchy (70%+)
•Query simplicity is paramount (no JOINs)
•Subtypes are disjoint (no overlap)
•Read performance dominates requirements
•The hierarchy has few subtypes (2-4)
•ORM expects STI (Rails, some Java ORMs)
•Schema changes are infrequent

Avoid STI When

•Subtypes have many unique attributes (10+)
•Subtype-specific queries dominate
•Data integrity is paramount
•Subtypes overlap (multi-role entities)
•Subtype relationships need FK enforcement
•The hierarchy has many subtypes (5+)
•Schema evolution happens frequently
•Storage efficiency is critical

Real-World STI Success Stories:

•E-commerce Products: PhysicalProduct vs DigitalProduct with mostly shared attributes (name, price, description) and few unique ones (weight vs download_url)
•Content Management: Article, BlogPost, NewsItem all sharing title, body, author, date with minimal unique fields
•User Notifications: EmailNotification, PushNotification, SMSNotification sharing recipient, message, timestamp
•Vehicle Fleet: Car, Truck, Van sharing make, model, year, mileage with few type-specific attributes

STI Anti-Patterns

Avoid STI when you find yourself: (1) Adding dozens of columns for new subtypes, (2) Writing complex triggers to enforce subtype constraints, (3) Creating relationships that logically only apply to certain subtypes, (4) Frequently adding new subtypes to the hierarchy. These are signals that TPT would be more appropriate.

Summary: Mastering Single Table Inheritance

We've comprehensively explored Single Table Inheritance—its implementation, patterns, trade-offs, and appropriate use cases.

Key Takeaways

•Core Concept: STI collapses entire hierarchies into single tables with discriminator columns identifying subtypes.
•NULL Handling: Essential strategies include CHECK constraints, subtype views, and careful indexing.
•Constraint Design: Requires explicit CHECK constraints to enforce subtype-specific business rules.
•Query Advantage: Hierarchy-wide queries are simple and fast—no JOINs required.
•Storage Trade-off: NULL overhead is often smaller than expected; schema width is the real cost.
•Selection Criteria: Best for small hierarchies with shared attributes and hierarchy-spanning queries.
•Integrity Limitation: Cannot use FK constraints for subtype-specific relationships.

What's Next:

The next page explores the Multiple Table Approach (TPT)—where each entity type gets its own table. We'll see how this trades query complexity for data integrity and flexibility, and when that trade-off makes sense.

Page Complete

You now have comprehensive knowledge of Single Table Inheritance including implementation, discriminator patterns, NULL handling, constraint enforcement, and selection criteria. You can confidently implement STI when it's the right choice and recognize when alternatives are more appropriate.

2 / 5

Loading learning content...

Database Management SystemsSpecialization Mapping

Specialization Mapping: ER to Relational

LevelIntermediate

Duration60 mins

TopicSpecialization Mapping

2 / 5

Single Table Approach: Unified Hierarchy Storage

The Single Table Inheritance Pattern

The Central Trade-off:

STI trades data integrity and storage efficiency for query simplicity and performance. It's the approach where you pay for simplicity with NULL values and constraint complexity.

Let's explore when this trade-off makes sense, and how to implement STI correctly when you choose it.

What You Will Learn

Core Implementation

The fundamental STI implementation follows a predictable pattern. Let's build a complete example using our university Person hierarchy:

Step 1: Combine All Attributes

Start by listing every attribute from every entity type in the hierarchy:

From Person: person_id, name, email, date_of_birth, address
From Student: student_number, enrollment_date, gpa, major, credits_earned
From Faculty: faculty_id, rank, tenure_status, specialty, office_number
From Staff: staff_id, position, department, hire_date, salary_grade

Step 2: Add Discriminator Column

The discriminator column is the key to STI. It tells the system (and queries) which subtype each row represents.

sti_complete_implementation.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
-- Single Table Inheritance: Complete Implementation
CREATE TABLE person (
    -- Primary Key (supertype identifier)
    person_id       SERIAL PRIMARY KEY,
    
    -- Discriminator Column
    -- This is the critical element that identifies the subtype
    person_type     VARCHAR(20) NOT NULL,
    
    -- Supertype Attributes (common to all)
    -- These are NEVER NULL (shared by all entities)
    name            VARCHAR(100) NOT NULL,
    email           VARCHAR(100) NOT NULL UNIQUE,
    date_of_birth   DATE NOT NULL,
    address         TEXT,
    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    -- ═══════════════════════════════════════════════════════
    -- STUDENT-SPECIFIC ATTRIBUTES
    -- NULL for non-students; populated only when person_type = 'STUDENT'
    -- ═══════════════════════════════════════════════════════
    student_number  VARCHAR(20) UNIQUE,      -- NULL if not student
    enrollment_date DATE,                     -- NULL if not student
    gpa             DECIMAL(3,2),             -- NULL if not student
    major           VARCHAR(50),              -- NULL if not student
    credits_earned  INTEGER DEFAULT 0,        -- NULL if not student
    
    -- ═══════════════════════════════════════════════════════
    -- FACULTY-SPECIFIC ATTRIBUTES
    -- NULL for non-faculty; populated only when person_type = 'FACULTY'
    -- ═══════════════════════════════════════════════════════
    faculty_id      VARCHAR(20) UNIQUE,       -- NULL if not faculty
    rank            VARCHAR(30),              -- NULL if not faculty
    tenure_status   VARCHAR(20),              -- NULL if not faculty
    specialty       VARCHAR(100),             -- NULL if not faculty
    office_number   VARCHAR(15),              -- NULL if not faculty
    
    -- ═══════════════════════════════════════════════════════
    -- STAFF-SPECIFIC ATTRIBUTES  
    -- NULL for non-staff; populated only when person_type = 'STAFF'
    -- ═══════════════════════════════════════════════════════
    staff_id        VARCHAR(20) UNIQUE,       -- NULL if not staff
    position        VARCHAR(50),              -- NULL if not staff
    department      VARCHAR(50),              -- NULL if not staff
    hire_date       DATE,                     -- NULL if not staff
    salary_grade    VARCHAR(10),              -- NULL if not staff
    
    -- ═══════════════════════════════════════════════════════
    -- DISCRIMINATOR CONSTRAINT
    -- Ensures only valid subtype values
    -- ═══════════════════════════════════════════════════════
    CONSTRAINT chk_valid_person_type 
        CHECK (person_type IN ('STUDENT', 'FACULTY', 'STAFF'))
);
 
-- Create index on discriminator for fast subtype queries
CREATE INDEX idx_person_type ON person(person_type);
 
-- Create composite indexes for common query patterns
CREATE INDEX idx_student_major ON person(major) WHERE person_type = 'STUDENT';
CREATE INDEX idx_faculty_rank ON person(rank) WHERE person_type = 'FACULTY';
CREATE INDEX idx_staff_department ON person(department) WHERE person_type = 'STAFF';

Column Organization

Discriminator Patterns and Strategies

The discriminator column is the linchpin of STI. Its design significantly impacts query performance, data integrity, and application integration. Let's examine the various discriminator strategies:

Pattern 1: String Discriminator (Most Common)

Use a VARCHAR column with human-readable type names. This is the most flexible and debuggable approach.

discriminator_string.sql
-- String discriminator: Readable and flexible
person_type VARCHAR(20) NOT NULL
    CHECK (person_type IN ('STUDENT', 'FACULTY', 'STAFF', 'ALUMNI'))
 
-- Pros:
--   ✓ Human-readable in query results
--   ✓ Easy to add new types (just update CHECK)
--   ✓ Self-documenting in data exports
--   ✓ Works with all databases
 
-- Cons:
--   ✗ Larger storage per row (vs. integer)
--   ✗ Case sensitivity issues possible
--   ✗ Typo risk in application code

Pattern 2: Integer/Enum Discriminator

Use integers or database-native enums for compact storage and faster comparisons.

discriminator_integer.sql
-- Integer discriminator: Compact and fast
person_type SMALLINT NOT NULL
    CHECK (person_type IN (1, 2, 3, 4))
 
-- With lookup table for meaning:
CREATE TABLE person_type_lookup (
    type_id   SMALLINT PRIMARY KEY,
    type_name VARCHAR(20) NOT NULL UNIQUE
);
INSERT INTO person_type_lookup VALUES
    (1, 'STUDENT'), (2, 'FACULTY'), (3, 'STAFF'), (4, 'ALUMNI');
 
-- PostgreSQL ENUM: Best of both worlds
CREATE TYPE person_type_enum AS ENUM ('STUDENT', 'FACULTY', 'STAFF');
-- Then in table:
person_type person_type_enum NOT NULL
 
-- Pros:
--   ✓ Minimal storage (1-2 bytes vs. 10-20 for strings)
--   ✓ Faster comparison operations
--   ✓ Database-enforced valid values (for ENUM)
 
-- Cons:
--   ✗ Requires lookup to understand data
--   ✗ ENUM modification can be complex (ALTER TYPE)
--   ✗ Integer types not self-documenting

Pattern 3: Class Name Discriminator (ORM-Friendly)

Store the full class/entity name for direct ORM mapping. Common with Hibernate, Entity Framework, and similar ORMs.

discriminator_classname.sql
-- Class name discriminator: Direct ORM mapping
entity_type VARCHAR(255) NOT NULL
    -- Values like: 'com.university.model.Student'
    --              'com.university.model.Faculty'
 
-- Pros:
--   ✓ Seamless ORM integration
--   ✓ Supports inheritance hierarchies in code
--   ✓ Type resolution automatic
 
-- Cons:
--   ✗ Large storage overhead
--   ✗ Tied to application package structure
--   ✗ Refactoring class names breaks data

Discriminator Pattern Comparison
Pattern	Storage	Performance	Readability	Flexibility	ORM Support
String	Medium (10-20 bytes)	Good	★★★★★ Excellent	★★★★★ Excellent	★★★★☆ Good
Integer	Small (1-4 bytes)	Best	★★☆☆☆ Poor	★★★★☆ Good	★★★★☆ Good
Enum	Small (1-4 bytes)	Best	★★★★☆ Good	★★☆☆☆ Limited	★★★★★ Excellent
Class Name	Large (50-255 bytes)	Fair	★★★☆☆ Fair	★★★☆☆ Fair	★★★★★ Excellent

Recommendation

NULL Handling Strategies

The NULL Problem Illustrated:

For a Person table with 3 subtypes (Student, Faculty, Staff), each having 5 unique attributes:

A Student row has 10 NULL values (5 Faculty + 5 Staff attributes)
A Faculty row has 10 NULL values (5 Student + 5 Staff attributes)
A Staff row has 10 NULL values (5 Student + 5 Faculty attributes)

As you add subtypes or subtype-specific attributes, NULL density grows quadratically.

Strategy 1: Attribute-Level CHECK Constraints

Enforce that subtype-specific attributes are only populated for the correct subtype:

null_check_constraints.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Enforce correct NULL patterns with CHECK constraints
 
-- Students MUST have student_number; non-students MUST NOT
ALTER TABLE person ADD CONSTRAINT chk_student_attrs CHECK (
    (person_type = 'STUDENT' AND student_number IS NOT NULL 
                              AND enrollment_date IS NOT NULL)
    OR
    (person_type != 'STUDENT' AND student_number IS NULL 
                               AND enrollment_date IS NULL
                               AND gpa IS NULL
                               AND major IS NULL
                               AND credits_earned IS NULL)
);
 
-- Faculty validation
ALTER TABLE person ADD CONSTRAINT chk_faculty_attrs CHECK (
    (person_type = 'FACULTY' AND faculty_id IS NOT NULL 
                              AND rank IS NOT NULL)
    OR
    (person_type != 'FACULTY' AND faculty_id IS NULL 
                               AND rank IS NULL
                               AND tenure_status IS NULL
                               AND specialty IS NULL
                               AND office_number IS NULL)
);
 
-- Staff validation
ALTER TABLE person ADD CONSTRAINT chk_staff_attrs CHECK (
    (person_type = 'STAFF' AND staff_id IS NOT NULL 
                            AND position IS NOT NULL)
    OR
    (person_type != 'STAFF' AND staff_id IS NULL 
                             AND position IS NULL
                             AND department IS NULL
                             AND hire_date IS NULL
                             AND salary_grade IS NULL)
);

Strategy 2: Default Values for Required Fields

Some systems use sentinel values instead of NULLs to avoid NULL-related query complications:

sentinel_values.sql
-- Using sentinel values (USE WITH CAUTION)
-- Instead of NULL, use type-specific defaults
 
-- For numeric fields
gpa DECIMAL(3,2) DEFAULT -1.00,  -- -1 indicates "not applicable"
credits_earned INTEGER DEFAULT -1,
 
-- For string fields  
major VARCHAR(50) DEFAULT 'N/A',
department VARCHAR(50) DEFAULT 'N/A',
 
-- Querying becomes:
SELECT * FROM person 
WHERE person_type = 'STUDENT' AND gpa != -1.00;
 
-- WARNING: This approach has significant drawbacks:
--   ✗ Pollutes data with magic values
--   ✗ Aggregations require special handling
--   ✗ "N/A" might be a legitimate value someday
--   ✗ Different "not applicable" values per type confuse
 
-- RECOMMENDATION: Use actual NULLs with proper constraints

Strategy 3: View-Based Abstraction

Create views that present clean subtype-specific interfaces, hiding the NULL complexity:

subtype_views.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
-- Create clean views for each subtype
CREATE VIEW students AS
SELECT 
    person_id,
    name,
    email,
    date_of_birth,
    address,
    student_number,
    enrollment_date,
    gpa,
    major,
    credits_earned
FROM person
WHERE person_type = 'STUDENT';
 
CREATE VIEW faculty AS
SELECT 
    person_id,
    name,
    email,
    date_of_birth,
    address,
    faculty_id,
    rank,
    tenure_status,
    specialty,
    office_number
FROM person
WHERE person_type = 'FACULTY';
 
CREATE VIEW staff AS
SELECT 
    person_id,
    name,
    email,
    date_of_birth,
    address,
    staff_id,
    position,
    department,
    hire_date,
    salary_grade
FROM person
WHERE person_type = 'STAFF';
 
-- Now application code uses clean interfaces:
SELECT * FROM students WHERE gpa > 3.5;
SELECT * FROM faculty WHERE tenure_status = 'TENURED';
SELECT * FROM staff WHERE department = 'IT';
 
-- Views hide NULL columns completely - cleaner APIs

View-Based Best Practice

Advanced Constraint Enforcement

Beyond basic NULL validation, STI requires sophisticated constraint strategies to maintain data integrity. Let's examine advanced patterns:

Cross-Attribute Constraints:

Some business rules span multiple attributes within a subtype:

cross_attribute_constraints.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Business Rule: Students must have valid GPA (0.0-4.0) when enrolled
ALTER TABLE person ADD CONSTRAINT chk_student_gpa_range CHECK (
    person_type != 'STUDENT' 
    OR (gpa >= 0.0 AND gpa <= 4.0)
);
 
-- Business Rule: Tenured faculty must be Professor or Associate Professor
ALTER TABLE person ADD CONSTRAINT chk_tenure_rank CHECK (
    person_type != 'FACULTY'
    OR tenure_status != 'TENURED'
    OR rank IN ('PROFESSOR', 'ASSOCIATE_PROFESSOR')
);
 
-- Business Rule: Staff hire_date cannot be in the future
ALTER TABLE person ADD CONSTRAINT chk_staff_hire_date CHECK (
    person_type != 'STAFF'
    OR hire_date <= CURRENT_DATE
);
 
-- Business Rule: If student has credits >= 120, must have major declared
ALTER TABLE person ADD CONSTRAINT chk_student_major_credits CHECK (
    person_type != 'STUDENT'
    OR credits_earned < 120
    OR major IS NOT NULL
);

Unique Constraints Across Subtypes:

Handling uniqueness in STI requires special attention. Each subtype has its own "unique" identifier (student_number, faculty_id, staff_id), but the primary key (person_id) is shared.

unique_constraints.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Each subtype ID must be unique within its type
-- NULL values don't violate UNIQUE constraints, so this works:
ALTER TABLE person ADD CONSTRAINT uniq_student_number 
    UNIQUE (student_number);  -- NULLs excluded
    
ALTER TABLE person ADD CONSTRAINT uniq_faculty_id 
    UNIQUE (faculty_id);  -- NULLs excluded
    
ALTER TABLE person ADD CONSTRAINT uniq_staff_id 
    UNIQUE (staff_id);  -- NULLs excluded
 
-- For cross-hierarchy email uniqueness (already shared):
ALTER TABLE person ADD CONSTRAINT uniq_email UNIQUE (email);
 
-- Partial unique indexes for subtype-specific uniqueness rule:
-- Example: Student numbers unique only among active students
CREATE UNIQUE INDEX idx_active_student_unique 
    ON person (student_number) 
    WHERE person_type = 'STUDENT' AND credits_earned < 200;
 
-- Example: Only one department head per department
CREATE UNIQUE INDEX idx_one_dept_head
    ON person (department)
    WHERE person_type = 'STAFF' AND position = 'DEPARTMENT_HEAD';

Foreign Key Relationships:

Relationships involving subtypes require careful FK design:

foreign_key_relationships.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
-- Relationships that apply to ALL person types (to supertype)
CREATE TABLE emergency_contact (
    contact_id   SERIAL PRIMARY KEY,
    person_id    INTEGER NOT NULL REFERENCES person(person_id),
    contact_name VARCHAR(100) NOT NULL,
    phone        VARCHAR(20) NOT NULL
);
 
-- Relationships specific to STUDENTS (need validation)
CREATE TABLE enrollment (
    enrollment_id SERIAL PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    semester      VARCHAR(20) NOT NULL,
    grade         VARCHAR(2),
    
    -- Ensure only students can enroll
    -- Option 1: Trigger-based enforcement
    -- Option 2: Application-level enforcement (common but weaker)
);
 
-- Trigger to enforce student-only enrollment
CREATE OR REPLACE FUNCTION check_student_enrollment()
RETURNS TRIGGER AS $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM person 
        WHERE person_id = NEW.person_id 
        AND person_type = 'STUDENT'
    ) THEN
        RAISE EXCEPTION 'Only students can be enrolled in courses';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
 
CREATE TRIGGER trg_check_student_enrollment
BEFORE INSERT OR UPDATE ON enrollment
FOR EACH ROW EXECUTE FUNCTION check_student_enrollment();
 
-- Similarly for faculty-specific teaching assignments
CREATE TABLE course_assignment (
    assignment_id SERIAL PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    semester      VARCHAR(20) NOT NULL
);
 
CREATE TRIGGER trg_check_faculty_assignment
BEFORE INSERT OR UPDATE ON course_assignment
FOR EACH ROW EXECUTE FUNCTION check_faculty_assignment();

STI Foreign Key Limitation

Query Patterns and Performance

STI excels at certain query patterns while struggling with others. Understanding these patterns is crucial for choosing STI appropriately.

Pattern 1: Hierarchy-Wide Queries (STI Strength)

Queries that need data from all subtypes are where STI truly shines:

hierarchy_queries.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Find all people with email ending in @university.edu
-- STI: Simple and fast - single table scan
SELECT person_id, name, email, person_type
FROM person
WHERE email LIKE '%@university.edu';
 
-- Count all people by type
-- STI: Efficient GROUP BY on single table
SELECT person_type, COUNT(*) as count
FROM person
GROUP BY person_type;
 
-- Find all people born in 1990s
SELECT name, date_of_birth, person_type
FROM person
WHERE date_of_birth BETWEEN '1990-01-01' AND '1999-12-31'
ORDER BY date_of_birth;
 
-- Search people by name (regardless of type)
SELECT person_id, name, person_type
FROM person
WHERE name ILIKE '%smith%';
 
-- These queries need NO JOINS - STI's primary advantage

Pattern 2: Subtype-Specific Queries

Queries targeting a single subtype work well with proper indexing:

subtype_queries.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Find students with GPA > 3.5
-- Uses partial index: idx_student_major (major) WHERE person_type = 'STUDENT'
SELECT person_id, name, gpa, major
FROM person
WHERE person_type = 'STUDENT' 
  AND gpa > 3.5
ORDER BY gpa DESC;
 
-- Find tenured faculty in Computer Science
SELECT name, rank, office_number
FROM person
WHERE person_type = 'FACULTY'
  AND tenure_status = 'TENURED'
  AND specialty LIKE '%Computer Science%';
 
-- Find staff in IT department
SELECT name, position, hire_date
FROM person
WHERE person_type = 'STAFF'
  AND department = 'IT';
 
-- IMPORTANT: Always filter by person_type first for optimizer hints
-- This enables partial index usage and reduces scan scope

Pattern 3: Cross-Subtype Aggregations

cross_subtype_aggregations.sql
SQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Compare average metrics across types (where applicable)
SELECT 
    person_type,
    COUNT(*) as total_count,
    -- Type-specific aggregations require CASE
    AVG(CASE WHEN person_type = 'STUDENT' THEN gpa END) as avg_student_gpa,
    AVG(CASE 
        WHEN person_type = 'STUDENT' THEN credits_earned 
    END) as avg_student_credits,
    COUNT(CASE 
        WHEN person_type = 'FACULTY' AND tenure_status = 'TENURED' 
        THEN 1 
    END) as tenured_faculty_count
FROM person
GROUP BY person_type;
 
-- Find people added in last 30 days by type
SELECT 
    person_type,
    COUNT(*) as new_additions,
    MIN(created_at) as first_added,
    MAX(created_at) as last_added
FROM person
WHERE created_at > CURRENT_DATE - INTERVAL '30 days'
GROUP BY person_type;

Performance Optimization Strategies:

STI Performance Checklist

•Index the discriminator: CREATE INDEX idx_person_type ON person(person_type);
•Use partial indexes: Create indexes scoped to specific person_types for subtype-heavy queries
•Composite indexes: Include discriminator in composite indexes: (person_type, gpa) not just (gpa)
•Table statistics: Ensure ANALYZE runs frequently so optimizer knows type distribution
•Consider table partitioning: For very large tables, partition by person_type (explored in advanced scenarios)

Storage Analysis and Overhead

Understanding the storage implications of STI is crucial for capacity planning and cost analysis. Let's quantify the NULL overhead:

Storage Calculation Example:

Consider our Person table with:

6 shared attributes (~100 bytes)
5 Student attributes (~50 bytes when populated)
5 Faculty attributes (~50 bytes when populated)
5 Staff attributes (~50 bytes when populated)
1 discriminator (~15 bytes)

Per-Row Storage (theoretical):

Storage Analysis: STI vs TPT
Subtype	STI Row Size	Actual Data	NULL Overhead	TPT Equivalent
Student	~165 bytes*	~150 bytes	~15 bytes (NULLs)	~65 bytes (Student) + ~100 bytes (Person) = ~165 bytes
Faculty	~165 bytes*	~150 bytes	~15 bytes (NULLs)	~65 bytes (Faculty) + ~100 bytes (Person) = ~165 bytes
Staff	~165 bytes*	~150 bytes	~15 bytes (NULLs)	~65 bytes (Staff) + ~100 bytes (Person) = ~165 bytes

The NULL Storage Reality:

NULL storage varies by database:

PostgreSQL: NULL bitmap - 1 bit per nullable column (efficient for many NULLs)
MySQL/InnoDB: 1 byte per 8 nullable columns in NULL bitmap
SQL Server: 2-byte NULL bitmap + storage for variable-length NULL indicators

For our example with 15 nullable columns, NULL metadata costs approximately 2-4 bytes per row—much less than you might expect.

storage_analysis.sql
PostgreSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Analyze actual storage in PostgreSQL
SELECT 
    pg_size_pretty(pg_total_relation_size('person')) as total_size,
    pg_size_pretty(pg_relation_size('person')) as table_size,
    pg_size_pretty(pg_indexes_size('person')) as indexes_size,
    (SELECT COUNT(*) FROM person) as row_count,
    pg_size_pretty(
        pg_total_relation_size('person') / 
        NULLIF((SELECT COUNT(*) FROM person), 0)
    ) as avg_row_size;
 
-- Compare NULL density by type
SELECT 
    person_type,
    COUNT(*) as rows,
    -- Count NULL values per type
    SUM(CASE WHEN student_number IS NULL THEN 1 ELSE 0 END) as null_student,
    SUM(CASE WHEN faculty_id IS NULL THEN 1 ELSE 0 END) as null_faculty,
    SUM(CASE WHEN staff_id IS NULL THEN 1 ELSE 0 END) as null_staff
FROM person
GROUP BY person_type;

Storage Surprise

When to Choose Single Table Inheritance

STI is the right choice under specific conditions. Let's crystallize the decision criteria:

Choose STI When

•Subtypes have few unique attributes (< 5 each)
•Most queries span the hierarchy (70%+)
•Query simplicity is paramount (no JOINs)
•Subtypes are disjoint (no overlap)
•Read performance dominates requirements
•The hierarchy has few subtypes (2-4)
•ORM expects STI (Rails, some Java ORMs)
•Schema changes are infrequent

Avoid STI When

•Subtypes have many unique attributes (10+)
•Subtype-specific queries dominate
•Data integrity is paramount
•Subtypes overlap (multi-role entities)
•Subtype relationships need FK enforcement
•The hierarchy has many subtypes (5+)
•Schema evolution happens frequently
•Storage efficiency is critical

Real-World STI Success Stories:

•E-commerce Products: PhysicalProduct vs DigitalProduct with mostly shared attributes (name, price, description) and few unique ones (weight vs download_url)
•Content Management: Article, BlogPost, NewsItem all sharing title, body, author, date with minimal unique fields
•User Notifications: EmailNotification, PushNotification, SMSNotification sharing recipient, message, timestamp
•Vehicle Fleet: Car, Truck, Van sharing make, model, year, mileage with few type-specific attributes

STI Anti-Patterns

Summary: Mastering Single Table Inheritance

We've comprehensively explored Single Table Inheritance—its implementation, patterns, trade-offs, and appropriate use cases.

Key Takeaways

•Core Concept: STI collapses entire hierarchies into single tables with discriminator columns identifying subtypes.
•NULL Handling: Essential strategies include CHECK constraints, subtype views, and careful indexing.
•Constraint Design: Requires explicit CHECK constraints to enforce subtype-specific business rules.
•Query Advantage: Hierarchy-wide queries are simple and fast—no JOINs required.
•Storage Trade-off: NULL overhead is often smaller than expected; schema width is the real cost.
•Selection Criteria: Best for small hierarchies with shared attributes and hierarchy-spanning queries.
•Integrity Limitation: Cannot use FK constraints for subtype-specific relationships.

What's Next:

Page Complete

2 / 5