Loading learning content...
Single Table Inheritance (STI), also known as Table-Per-Hierarchy (TPH), is the most radical approach to mapping specialization hierarchies. The core idea is deceptively simple: collapse the entire hierarchy into one table.
This means that a supertype and all its subtypes—no matter how many—share a single relational table. All attributes from every entity type in the hierarchy become columns in this unified table. A special discriminator column identifies which subtype each row represents.
The Central Trade-off:
STI trades data integrity and storage efficiency for query simplicity and performance. It's the approach where you pay for simplicity with NULL values and constraint complexity.
Let's explore when this trade-off makes sense, and how to implement STI correctly when you choose it.
By the end of this page, you will master STI implementation including discriminator patterns, NULL handling strategies, CHECK constraint design, schema evolution approaches, and performance optimization techniques. You'll understand exactly when STI is the right choice.
The fundamental STI implementation follows a predictable pattern. Let's build a complete example using our university Person hierarchy:
Step 1: Combine All Attributes
Start by listing every attribute from every entity type in the hierarchy:
Step 2: Add Discriminator Column
The discriminator column is the key to STI. It tells the system (and queries) which subtype each row represents.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
-- Single Table Inheritance: Complete ImplementationCREATE TABLE person ( -- Primary Key (supertype identifier) person_id SERIAL PRIMARY KEY, -- Discriminator Column -- This is the critical element that identifies the subtype person_type VARCHAR(20) NOT NULL, -- Supertype Attributes (common to all) -- These are NEVER NULL (shared by all entities) name VARCHAR(100) NOT NULL, email VARCHAR(100) NOT NULL UNIQUE, date_of_birth DATE NOT NULL, address TEXT, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, -- ═══════════════════════════════════════════════════════ -- STUDENT-SPECIFIC ATTRIBUTES -- NULL for non-students; populated only when person_type = 'STUDENT' -- ═══════════════════════════════════════════════════════ student_number VARCHAR(20) UNIQUE, -- NULL if not student enrollment_date DATE, -- NULL if not student gpa DECIMAL(3,2), -- NULL if not student major VARCHAR(50), -- NULL if not student credits_earned INTEGER DEFAULT 0, -- NULL if not student -- ═══════════════════════════════════════════════════════ -- FACULTY-SPECIFIC ATTRIBUTES -- NULL for non-faculty; populated only when person_type = 'FACULTY' -- ═══════════════════════════════════════════════════════ faculty_id VARCHAR(20) UNIQUE, -- NULL if not faculty rank VARCHAR(30), -- NULL if not faculty tenure_status VARCHAR(20), -- NULL if not faculty specialty VARCHAR(100), -- NULL if not faculty office_number VARCHAR(15), -- NULL if not faculty -- ═══════════════════════════════════════════════════════ -- STAFF-SPECIFIC ATTRIBUTES -- NULL for non-staff; populated only when person_type = 'STAFF' -- ═══════════════════════════════════════════════════════ staff_id VARCHAR(20) UNIQUE, -- NULL if not staff position VARCHAR(50), -- NULL if not staff department VARCHAR(50), -- NULL if not staff hire_date DATE, -- NULL if not staff salary_grade VARCHAR(10), -- NULL if not staff -- ═══════════════════════════════════════════════════════ -- DISCRIMINATOR CONSTRAINT -- Ensures only valid subtype values -- ═══════════════════════════════════════════════════════ CONSTRAINT chk_valid_person_type CHECK (person_type IN ('STUDENT', 'FACULTY', 'STAFF'))); -- Create index on discriminator for fast subtype queriesCREATE INDEX idx_person_type ON person(person_type); -- Create composite indexes for common query patternsCREATE INDEX idx_student_major ON person(major) WHERE person_type = 'STUDENT';CREATE INDEX idx_faculty_rank ON person(rank) WHERE person_type = 'FACULTY';CREATE INDEX idx_staff_department ON person(department) WHERE person_type = 'STAFF';Organize columns by subtype in the schema definition for maintainability. Comment clearly which attributes belong to which subtype. Some DBAs use naming conventions like 'STUDENT__gpa' to make subtype ownership explicit, though this adds verbosity.
The discriminator column is the linchpin of STI. Its design significantly impacts query performance, data integrity, and application integration. Let's examine the various discriminator strategies:
Pattern 1: String Discriminator (Most Common)
Use a VARCHAR column with human-readable type names. This is the most flexible and debuggable approach.
-- String discriminator: Readable and flexibleperson_type VARCHAR(20) NOT NULL CHECK (person_type IN ('STUDENT', 'FACULTY', 'STAFF', 'ALUMNI')) -- Pros:-- ✓ Human-readable in query results-- ✓ Easy to add new types (just update CHECK)-- ✓ Self-documenting in data exports-- ✓ Works with all databases -- Cons:-- ✗ Larger storage per row (vs. integer)-- ✗ Case sensitivity issues possible-- ✗ Typo risk in application codePattern 2: Integer/Enum Discriminator
Use integers or database-native enums for compact storage and faster comparisons.
-- Integer discriminator: Compact and fastperson_type SMALLINT NOT NULL CHECK (person_type IN (1, 2, 3, 4)) -- With lookup table for meaning:CREATE TABLE person_type_lookup ( type_id SMALLINT PRIMARY KEY, type_name VARCHAR(20) NOT NULL UNIQUE);INSERT INTO person_type_lookup VALUES (1, 'STUDENT'), (2, 'FACULTY'), (3, 'STAFF'), (4, 'ALUMNI'); -- PostgreSQL ENUM: Best of both worldsCREATE TYPE person_type_enum AS ENUM ('STUDENT', 'FACULTY', 'STAFF');-- Then in table:person_type person_type_enum NOT NULL -- Pros:-- ✓ Minimal storage (1-2 bytes vs. 10-20 for strings)-- ✓ Faster comparison operations-- ✓ Database-enforced valid values (for ENUM) -- Cons:-- ✗ Requires lookup to understand data-- ✗ ENUM modification can be complex (ALTER TYPE)-- ✗ Integer types not self-documentingPattern 3: Class Name Discriminator (ORM-Friendly)
Store the full class/entity name for direct ORM mapping. Common with Hibernate, Entity Framework, and similar ORMs.
-- Class name discriminator: Direct ORM mappingentity_type VARCHAR(255) NOT NULL -- Values like: 'com.university.model.Student' -- 'com.university.model.Faculty' -- Pros:-- ✓ Seamless ORM integration-- ✓ Supports inheritance hierarchies in code-- ✓ Type resolution automatic -- Cons:-- ✗ Large storage overhead-- ✗ Tied to application package structure-- ✗ Refactoring class names breaks data| Pattern | Storage | Performance | Readability | Flexibility | ORM Support |
|---|---|---|---|---|---|
| String | Medium (10-20 bytes) | Good | ★★★★★ Excellent | ★★★★★ Excellent | ★★★★☆ Good |
| Integer | Small (1-4 bytes) | Best | ★★☆☆☆ Poor | ★★★★☆ Good | ★★★★☆ Good |
| Enum | Small (1-4 bytes) | Best | ★★★★☆ Good | ★★☆☆☆ Limited | ★★★★★ Excellent |
| Class Name | Large (50-255 bytes) | Fair | ★★★☆☆ Fair | ★★★☆☆ Fair | ★★★★★ Excellent |
For most applications, use a string discriminator with an explicit CHECK constraint. It balances readability, flexibility, and performance. Use PostgreSQL ENUM when you need the storage efficiency and your type set is stable. Avoid class name discriminators unless your ORM specifically requires them.
NULLs are the defining characteristic—and the primary weakness—of STI. In a properly designed STI table, every row has NULL values for all attributes belonging to other subtypes. This creates several challenges that must be addressed systematically.
The NULL Problem Illustrated:
For a Person table with 3 subtypes (Student, Faculty, Staff), each having 5 unique attributes:
As you add subtypes or subtype-specific attributes, NULL density grows quadratically.
Strategy 1: Attribute-Level CHECK Constraints
Enforce that subtype-specific attributes are only populated for the correct subtype:
12345678910111213141516171819202122232425262728293031323334353637
-- Enforce correct NULL patterns with CHECK constraints -- Students MUST have student_number; non-students MUST NOTALTER TABLE person ADD CONSTRAINT chk_student_attrs CHECK ( (person_type = 'STUDENT' AND student_number IS NOT NULL AND enrollment_date IS NOT NULL) OR (person_type != 'STUDENT' AND student_number IS NULL AND enrollment_date IS NULL AND gpa IS NULL AND major IS NULL AND credits_earned IS NULL)); -- Faculty validationALTER TABLE person ADD CONSTRAINT chk_faculty_attrs CHECK ( (person_type = 'FACULTY' AND faculty_id IS NOT NULL AND rank IS NOT NULL) OR (person_type != 'FACULTY' AND faculty_id IS NULL AND rank IS NULL AND tenure_status IS NULL AND specialty IS NULL AND office_number IS NULL)); -- Staff validationALTER TABLE person ADD CONSTRAINT chk_staff_attrs CHECK ( (person_type = 'STAFF' AND staff_id IS NOT NULL AND position IS NOT NULL) OR (person_type != 'STAFF' AND staff_id IS NULL AND position IS NULL AND department IS NULL AND hire_date IS NULL AND salary_grade IS NULL));Strategy 2: Default Values for Required Fields
Some systems use sentinel values instead of NULLs to avoid NULL-related query complications:
-- Using sentinel values (USE WITH CAUTION)-- Instead of NULL, use type-specific defaults -- For numeric fieldsgpa DECIMAL(3,2) DEFAULT -1.00, -- -1 indicates "not applicable"credits_earned INTEGER DEFAULT -1, -- For string fields major VARCHAR(50) DEFAULT 'N/A',department VARCHAR(50) DEFAULT 'N/A', -- Querying becomes:SELECT * FROM person WHERE person_type = 'STUDENT' AND gpa != -1.00; -- WARNING: This approach has significant drawbacks:-- ✗ Pollutes data with magic values-- ✗ Aggregations require special handling-- ✗ "N/A" might be a legitimate value someday-- ✗ Different "not applicable" values per type confuse -- RECOMMENDATION: Use actual NULLs with proper constraintsStrategy 3: View-Based Abstraction
Create views that present clean subtype-specific interfaces, hiding the NULL complexity:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152
-- Create clean views for each subtypeCREATE VIEW students ASSELECT person_id, name, email, date_of_birth, address, student_number, enrollment_date, gpa, major, credits_earnedFROM personWHERE person_type = 'STUDENT'; CREATE VIEW faculty ASSELECT person_id, name, email, date_of_birth, address, faculty_id, rank, tenure_status, specialty, office_numberFROM personWHERE person_type = 'FACULTY'; CREATE VIEW staff ASSELECT person_id, name, email, date_of_birth, address, staff_id, position, department, hire_date, salary_gradeFROM personWHERE person_type = 'STAFF'; -- Now application code uses clean interfaces:SELECT * FROM students WHERE gpa > 3.5;SELECT * FROM faculty WHERE tenure_status = 'TENURED';SELECT * FROM staff WHERE department = 'IT'; -- Views hide NULL columns completely - cleaner APIsIn production STI implementations, always create subtype-specific views. Applications should primarily interact through these views rather than the base table. This provides abstraction, simplifies queries, and allows schema evolution without breaking application code.
Beyond basic NULL validation, STI requires sophisticated constraint strategies to maintain data integrity. Let's examine advanced patterns:
Cross-Attribute Constraints:
Some business rules span multiple attributes within a subtype:
12345678910111213141516171819202122232425
-- Business Rule: Students must have valid GPA (0.0-4.0) when enrolledALTER TABLE person ADD CONSTRAINT chk_student_gpa_range CHECK ( person_type != 'STUDENT' OR (gpa >= 0.0 AND gpa <= 4.0)); -- Business Rule: Tenured faculty must be Professor or Associate ProfessorALTER TABLE person ADD CONSTRAINT chk_tenure_rank CHECK ( person_type != 'FACULTY' OR tenure_status != 'TENURED' OR rank IN ('PROFESSOR', 'ASSOCIATE_PROFESSOR')); -- Business Rule: Staff hire_date cannot be in the futureALTER TABLE person ADD CONSTRAINT chk_staff_hire_date CHECK ( person_type != 'STAFF' OR hire_date <= CURRENT_DATE); -- Business Rule: If student has credits >= 120, must have major declaredALTER TABLE person ADD CONSTRAINT chk_student_major_credits CHECK ( person_type != 'STUDENT' OR credits_earned < 120 OR major IS NOT NULL);Unique Constraints Across Subtypes:
Handling uniqueness in STI requires special attention. Each subtype has its own "unique" identifier (student_number, faculty_id, staff_id), but the primary key (person_id) is shared.
123456789101112131415161718192021222324
-- Each subtype ID must be unique within its type-- NULL values don't violate UNIQUE constraints, so this works:ALTER TABLE person ADD CONSTRAINT uniq_student_number UNIQUE (student_number); -- NULLs excluded ALTER TABLE person ADD CONSTRAINT uniq_faculty_id UNIQUE (faculty_id); -- NULLs excluded ALTER TABLE person ADD CONSTRAINT uniq_staff_id UNIQUE (staff_id); -- NULLs excluded -- For cross-hierarchy email uniqueness (already shared):ALTER TABLE person ADD CONSTRAINT uniq_email UNIQUE (email); -- Partial unique indexes for subtype-specific uniqueness rule:-- Example: Student numbers unique only among active studentsCREATE UNIQUE INDEX idx_active_student_unique ON person (student_number) WHERE person_type = 'STUDENT' AND credits_earned < 200; -- Example: Only one department head per departmentCREATE UNIQUE INDEX idx_one_dept_head ON person (department) WHERE person_type = 'STAFF' AND position = 'DEPARTMENT_HEAD';Foreign Key Relationships:
Relationships involving subtypes require careful FK design:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
-- Relationships that apply to ALL person types (to supertype)CREATE TABLE emergency_contact ( contact_id SERIAL PRIMARY KEY, person_id INTEGER NOT NULL REFERENCES person(person_id), contact_name VARCHAR(100) NOT NULL, phone VARCHAR(20) NOT NULL); -- Relationships specific to STUDENTS (need validation)CREATE TABLE enrollment ( enrollment_id SERIAL PRIMARY KEY, person_id INTEGER NOT NULL REFERENCES person(person_id), course_id INTEGER NOT NULL REFERENCES course(course_id), semester VARCHAR(20) NOT NULL, grade VARCHAR(2), -- Ensure only students can enroll -- Option 1: Trigger-based enforcement -- Option 2: Application-level enforcement (common but weaker)); -- Trigger to enforce student-only enrollmentCREATE OR REPLACE FUNCTION check_student_enrollment()RETURNS TRIGGER AS $$BEGIN IF NOT EXISTS ( SELECT 1 FROM person WHERE person_id = NEW.person_id AND person_type = 'STUDENT' ) THEN RAISE EXCEPTION 'Only students can be enrolled in courses'; END IF; RETURN NEW;END;$$ LANGUAGE plpgsql; CREATE TRIGGER trg_check_student_enrollmentBEFORE INSERT OR UPDATE ON enrollmentFOR EACH ROW EXECUTE FUNCTION check_student_enrollment(); -- Similarly for faculty-specific teaching assignmentsCREATE TABLE course_assignment ( assignment_id SERIAL PRIMARY KEY, person_id INTEGER NOT NULL REFERENCES person(person_id), course_id INTEGER NOT NULL REFERENCES course(course_id), semester VARCHAR(20) NOT NULL); CREATE TRIGGER trg_check_faculty_assignmentBEFORE INSERT OR UPDATE ON course_assignmentFOR EACH ROW EXECUTE FUNCTION check_faculty_assignment();STI cannot use database-level foreign keys to restrict relationships to specific subtypes. You cannot create a FK that only references 'Student' rows. This requires triggers or application logic—a significant integrity trade-off compared to TPT.
STI excels at certain query patterns while struggling with others. Understanding these patterns is crucial for choosing STI appropriately.
Pattern 1: Hierarchy-Wide Queries (STI Strength)
Queries that need data from all subtypes are where STI truly shines:
123456789101112131415161718192021222324
-- Find all people with email ending in @university.edu-- STI: Simple and fast - single table scanSELECT person_id, name, email, person_typeFROM personWHERE email LIKE '%@university.edu'; -- Count all people by type-- STI: Efficient GROUP BY on single tableSELECT person_type, COUNT(*) as countFROM personGROUP BY person_type; -- Find all people born in 1990sSELECT name, date_of_birth, person_typeFROM personWHERE date_of_birth BETWEEN '1990-01-01' AND '1999-12-31'ORDER BY date_of_birth; -- Search people by name (regardless of type)SELECT person_id, name, person_typeFROM personWHERE name ILIKE '%smith%'; -- These queries need NO JOINS - STI's primary advantagePattern 2: Subtype-Specific Queries
Queries targeting a single subtype work well with proper indexing:
1234567891011121314151617181920212223
-- Find students with GPA > 3.5-- Uses partial index: idx_student_major (major) WHERE person_type = 'STUDENT'SELECT person_id, name, gpa, majorFROM personWHERE person_type = 'STUDENT' AND gpa > 3.5ORDER BY gpa DESC; -- Find tenured faculty in Computer ScienceSELECT name, rank, office_numberFROM personWHERE person_type = 'FACULTY' AND tenure_status = 'TENURED' AND specialty LIKE '%Computer Science%'; -- Find staff in IT departmentSELECT name, position, hire_dateFROM personWHERE person_type = 'STAFF' AND department = 'IT'; -- IMPORTANT: Always filter by person_type first for optimizer hints-- This enables partial index usage and reduces scan scopePattern 3: Cross-Subtype Aggregations
12345678910111213141516171819202122232425
-- Compare average metrics across types (where applicable)SELECT person_type, COUNT(*) as total_count, -- Type-specific aggregations require CASE AVG(CASE WHEN person_type = 'STUDENT' THEN gpa END) as avg_student_gpa, AVG(CASE WHEN person_type = 'STUDENT' THEN credits_earned END) as avg_student_credits, COUNT(CASE WHEN person_type = 'FACULTY' AND tenure_status = 'TENURED' THEN 1 END) as tenured_faculty_countFROM personGROUP BY person_type; -- Find people added in last 30 days by typeSELECT person_type, COUNT(*) as new_additions, MIN(created_at) as first_added, MAX(created_at) as last_addedFROM personWHERE created_at > CURRENT_DATE - INTERVAL '30 days'GROUP BY person_type;Performance Optimization Strategies:
CREATE INDEX idx_person_type ON person(person_type);(person_type, gpa) not just (gpa)ANALYZE runs frequently so optimizer knows type distributionUnderstanding the storage implications of STI is crucial for capacity planning and cost analysis. Let's quantify the NULL overhead:
Storage Calculation Example:
Consider our Person table with:
Per-Row Storage (theoretical):
| Subtype | STI Row Size | Actual Data | NULL Overhead | TPT Equivalent |
|---|---|---|---|---|
| Student | ~165 bytes* | ~150 bytes | ~15 bytes (NULLs) | ~65 bytes (Student) + ~100 bytes (Person) = ~165 bytes |
| Faculty | ~165 bytes* | ~150 bytes | ~15 bytes (NULLs) | ~65 bytes (Faculty) + ~100 bytes (Person) = ~165 bytes |
| Staff | ~165 bytes* | ~150 bytes | ~15 bytes (NULLs) | ~65 bytes (Staff) + ~100 bytes (Person) = ~165 bytes |
The NULL Storage Reality:
NULL storage varies by database:
For our example with 15 nullable columns, NULL metadata costs approximately 2-4 bytes per row—much less than you might expect.
123456789101112131415161718192021
-- Analyze actual storage in PostgreSQLSELECT pg_size_pretty(pg_total_relation_size('person')) as total_size, pg_size_pretty(pg_relation_size('person')) as table_size, pg_size_pretty(pg_indexes_size('person')) as indexes_size, (SELECT COUNT(*) FROM person) as row_count, pg_size_pretty( pg_total_relation_size('person') / NULLIF((SELECT COUNT(*) FROM person), 0) ) as avg_row_size; -- Compare NULL density by typeSELECT person_type, COUNT(*) as rows, -- Count NULL values per type SUM(CASE WHEN student_number IS NULL THEN 1 ELSE 0 END) as null_student, SUM(CASE WHEN faculty_id IS NULL THEN 1 ELSE 0 END) as null_faculty, SUM(CASE WHEN staff_id IS NULL THEN 1 ELSE 0 END) as null_staffFROM personGROUP BY person_type;STI's storage overhead from NULLs is often less than expected. The real storage cost is the additional columns themselves (schema width), not the NULL values. For hierarchies with few subtype-specific attributes, STI storage can match or beat TPT due to eliminated key duplication.
STI is the right choice under specific conditions. Let's crystallize the decision criteria:
Real-World STI Success Stories:
Avoid STI when you find yourself: (1) Adding dozens of columns for new subtypes, (2) Writing complex triggers to enforce subtype constraints, (3) Creating relationships that logically only apply to certain subtypes, (4) Frequently adding new subtypes to the hierarchy. These are signals that TPT would be more appropriate.
We've comprehensively explored Single Table Inheritance—its implementation, patterns, trade-offs, and appropriate use cases.
What's Next:
The next page explores the Multiple Table Approach (TPT)—where each entity type gets its own table. We'll see how this trades query complexity for data integrity and flexibility, and when that trade-off makes sense.
You now have comprehensive knowledge of Single Table Inheritance including implementation, discriminator patterns, NULL handling, constraint enforcement, and selection criteria. You can confidently implement STI when it's the right choice and recognize when alternatives are more appropriate.