Loading learning content...
Every programming language has a type system—a formal mechanism that classifies values into categories (integers, strings, booleans) and enforces rules about how values of different types can interact. In the relational model, domains serve this role. But domains are more than just data types; they are semantic pools of meaningful values that give structure and integrity to relational data.
When E.F. Codd formalized the relational model, he drew from set theory and mathematical logic. A domain, in this context, is a set—a well-defined collection of values. Attributes are then understood as functions from tuples to domain values. This mathematical rigor is what allows relational databases to provide guarantees about data consistency, query correctness, and optimization.
Understanding domains deeply is essential because domains are where data integrity begins. If you don't carefully define what values are permissible, you cannot build reliable systems.
By the end of this page, you will understand the formal definition of domains, how domains relate to data types, the distinction between syntactic and semantic domains, how domains enable type-safe operations, and the practical implications of domain design for data integrity and query expressiveness.
In mathematical terms, a domain is a named set of values. Each value in the set is an atomic element—it cannot be decomposed into smaller meaningful parts within the database context.
Formal Definition:
A domain (D) is a named set of atomic values with a defined data type and optional constraints. Formally:
$$D = {v_1, v_2, v_3, ..., v_n} \text{ where each } v_i \text{ satisfies the domain's type and constraints}$$
In practice, most domains are defined intensionally (by specifying rules) rather than extensionally (by listing all values). For example:
Age domain: All non-negative integers less than 150Email domain: All strings matching a valid email patternCurrency_Code domain: The ISO 4217 currency codes {USD, EUR, GBP, JPY, ...}An extensional definition lists all elements: Colors = {red, green, blue}. An intensional definition specifies a rule: PositiveIntegers = {x ∈ ℤ | x > 0}. Most database domains are intensional since they may contain infinite or very large sets of values.
123456789101112131415161718192021222324252627282930313233343536373839404142
-- PostgreSQL CREATE DOMAIN creates custom domains with constraints -- Domain: A valid age (non-negative, reasonable maximum)CREATE DOMAIN age_years AS INTEGER CHECK (VALUE >= 0 AND VALUE < 150); -- Domain: A valid email address (pattern-based constraint)CREATE DOMAIN email_address AS VARCHAR(255) CHECK (VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'); -- Domain: A monetary amount (precise decimal, non-negative)CREATE DOMAIN monetary_amount AS DECIMAL(15, 2) CHECK (VALUE >= 0); -- Domain: A US ZIP code (5 digits or ZIP+4 format)CREATE DOMAIN us_zip_code AS VARCHAR(10) CHECK (VALUE ~ '^[0-9]{5}(-[0-9]{4})?$'); -- Domain: An ISO country code (exactly 2 uppercase letters)CREATE DOMAIN country_code AS CHAR(2) CHECK (VALUE ~ '^[A-Z]{2}$'); -- Domain: A percentage (0 to 100 inclusive)CREATE DOMAIN percentage AS DECIMAL(5, 2) CHECK (VALUE >= 0 AND VALUE <= 100); -- Using domains in table definitionsCREATE TABLE Customer ( customer_id SERIAL PRIMARY KEY, email email_address NOT NULL UNIQUE, age age_years, country country_code NOT NULL DEFAULT 'US'); CREATE TABLE Order_Line ( order_id INT NOT NULL, product_id INT NOT NULL, quantity INTEGER NOT NULL CHECK (quantity > 0), unit_price monetary_amount NOT NULL, discount_pct percentage DEFAULT 0, PRIMARY KEY (order_id, product_id));Components of a Domain Definition:
Name — A descriptive identifier (e.g., email_address, age_years)
Base Type — The underlying data type from the DBMS (INTEGER, VARCHAR, DECIMAL, etc.)
Constraints — Additional restrictions beyond the base type (ranges, patterns, enumerations)
Default Value (optional) — A value to use when none is specified
Nullability (implicit) — Whether NULL is a valid domain value (controversial in theory, ubiquitous in practice)
One of the most common sources of confusion is conflating domains (a relational model concept) with data types (an implementation construct). While related, they serve different purposes and operate at different levels of abstraction.
| Aspect | Domain (Theory) | Data Type (SQL) |
|---|---|---|
| Scope | Semantic—describes meaningful value sets | Syntactic—describes storage and operations |
| Naming | Application-specific names (CustomerID, OrderTotal) | Generic type names (INT, DECIMAL, VARCHAR) |
| Granularity | Fine-grained (different domains for employee_age vs building_age) | Coarse-grained (both would be INTEGER) |
| Purpose | Enable semantic type checking | Enable physical storage and basic operations |
| Constraints | Business rules encoded in domain definition | Basic type rules (size, precision) |
| Comparability | Only values from same domain should be compared | Any values of same type can be compared |
| Support | Partial support via CREATE DOMAIN (PostgreSQL, etc.) | Universal—every DBMS has data types |
SQL's type system allows semantically nonsensical operations. You can compare employee.age with order.quantity because both are INTEGERs—even though comparing a person's age to a quantity of products is meaningless. This is a fundamental weakness in SQL's domain support, and careful schema design must compensate.
The Semantic Gap Example:
Consider a schema with these attributes:
employee_age — A person's age in yearsorder_quantity — Number of items orderedproduct_id — A product identifierwarehouse_id — A warehouse identifierIn pure relational theory with proper domains:
employee_age to order_quantity would be a type error (different domains)product_id to warehouse_id would be a type error (different domains)In SQL with generic data types:
INTEGERWHERE product_id = warehouse_id go undetected1234567891011121314151617181920212223242526272829303132333435363738394041424344
-- APPROACH 1: Using generic data types (common but problematic)CREATE TABLE Employee_v1 ( employee_id INT PRIMARY KEY, employee_age INT, department_id INT); CREATE TABLE Order_v1 ( order_id INT PRIMARY KEY, quantity INT, product_id INT); -- This NONSENSICAL query is allowed because everything is INT:SELECT * FROM Employee_v1 e, Order_v1 oWHERE e.employee_age = o.quantity; -- Comparing age to quantity?! -- APPROACH 2: Using explicit domains (PostgreSQL)CREATE DOMAIN person_age AS SMALLINT CHECK (VALUE >= 0 AND VALUE < 150); CREATE DOMAIN item_quantity AS INTEGER CHECK (VALUE > 0); CREATE DOMAIN employee_identifier AS INTEGER;CREATE DOMAIN department_identifier AS INTEGER;CREATE DOMAIN order_identifier AS INTEGER;CREATE DOMAIN product_identifier AS INTEGER; CREATE TABLE Employee_v2 ( employee_id employee_identifier PRIMARY KEY, employee_age person_age, department_id department_identifier); CREATE TABLE Order_v2 ( order_id order_identifier PRIMARY KEY, quantity item_quantity, product_id product_identifier); -- Now domains provide (limited) semantic meaning-- Though SQL still allows cross-domain comparison (a limitation)-- At minimum, CHECK constraints prevent invalid valuesBest Practices for Domain Thinking:
Think in Domains, Implement in Types — Even if your DBMS doesn't fully support domains, conceptualize your schema using domain thinking. Document which attributes share domains.
Use CREATE DOMAIN When Available — PostgreSQL, DB2, and some other databases support CREATE DOMAIN. Use it to encode business rules.
Naming Conventions as Domain Proxies — Use attribute naming to indicate domain membership: all identifiers end in _id, all timestamps end in _at, all monetary amounts end in _amount.
Single Source of Truth for Domain Rules — If multiple attributes belong to the same domain, define the domain once and reuse it. Changes propagate automatically.
Every domain possesses certain characteristics that determine how its values can be used, compared, and processed. Understanding these characteristics is essential for proper database design.
| Domain Type | Cardinality | Ordered | Examples |
|---|---|---|---|
| Boolean | 2 (true/false) | No | is_active, has_shipped |
| Enumerated | Small finite set | Sometimes | order_status, gender |
| Integer | Infinite (bounded in practice) | Yes | quantity, count |
| Real/Decimal | Infinite (bounded precision) | Yes | price, salary |
| String | Infinite | Lexicographic | name, description |
| Temporal | Infinite (bounded range) | Yes | created_at, birth_date |
| Identifier | Large finite or infinite | Depends | customer_id, uuid |
| Binary | Infinite | Sometimes | image_blob, file_hash |
The distinction between ordered and unordered domains has profound query implications. Ordered domains support range queries (BETWEEN), sorting (ORDER BY), and aggregate functions like MIN/MAX. Unordered domains only support equality comparison. Attempting ORDER BY on a truly unordered domain (like a complex object) may produce meaningless results.
12345678910111213141516171819202122232425262728293031323334353637383940
-- PostgreSQL ENUM creates a finite ordered domainCREATE TYPE order_status AS ENUM ( 'pending', -- Ordered first 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled' -- Ordered last); -- The enum has natural ordering for comparisonCREATE TABLE orders ( order_id SERIAL PRIMARY KEY, status order_status NOT NULL DEFAULT 'pending'); -- Enum ordering enables meaningful comparisonSELECT * FROM orders WHERE status >= 'processing'; -- Returns processing, shipped, delivered SELECT * FROM ordersWHERE status BETWEEN 'confirmed' AND 'shipped'; -- Boolean domain: exactly two valuesCREATE TABLE feature_flags ( flag_name VARCHAR(100) PRIMARY KEY, is_enabled BOOLEAN NOT NULL DEFAULT false); -- Integer domain with constraints (simulating bounded cardinality)CREATE DOMAIN day_of_month AS SMALLINT CHECK (VALUE >= 1 AND VALUE <= 31); CREATE DOMAIN rating_score AS SMALLINT CHECK (VALUE >= 1 AND VALUE <= 5); -- String domain with pattern constraintCREATE DOMAIN product_sku AS VARCHAR(20) CHECK (VALUE ~ '^[A-Z]{3}-[0-9]{4}-[A-Z0-9]{4}$'); -- Format: ABC-1234-XY99A crucial distinction in domain theory separates syntactic domains (what values look like) from semantic domains (what values mean). This distinction has profound implications for database correctness.
VARCHAR(10), DECIMAL(10,2)^.+@.+\..+$The Important Distinction:
Consider an email address attribute:
Syntactically valid: definitely-not-real@fake-domain-xyz.invalid
Semantically valid: support@google.com
Databases can enforce syntactic domains (via CHECK constraints, types, domains). Semantic validity often requires:
Domain validation exists on a spectrum from purely syntactic to deeply semantic. Design your system to enforce:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859
-- Layer 1: SYNTACTIC domain validation in database -- Email must match pattern (syntactic)CREATE DOMAIN email_address AS VARCHAR(255) CHECK (VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'); -- Phone must match pattern (syntactic)CREATE DOMAIN phone_number AS VARCHAR(20) CHECK (VALUE ~ '^\+?[0-9]{10,15}$'); -- Layer 2: REFERENCE domain validation via foreign keys -- Country codes must exist in reference table (semantic via reference)CREATE TABLE ref_country ( country_code CHAR(2) PRIMARY KEY, country_name VARCHAR(100) NOT NULL, is_active BOOLEAN DEFAULT true); -- Insert reference dataINSERT INTO ref_country (country_code, country_name) VALUES ('US', 'United States'), ('GB', 'United Kingdom'), ('DE', 'Germany'), ('FR', 'France'); -- Address table with semantic validation via FKCREATE TABLE address ( address_id SERIAL PRIMARY KEY, street_line VARCHAR(200) NOT NULL, city VARCHAR(100) NOT NULL, country_code CHAR(2) NOT NULL REFERENCES ref_country(country_code) -- FK ensures country_code is semantically valid (exists in reference)); -- Layer 3: SEMANTIC validation requiring business logic -- Currency codes with exchange rate validationCREATE TABLE ref_currency ( currency_code CHAR(3) PRIMARY KEY, currency_name VARCHAR(50) NOT NULL, is_tradeable BOOLEAN NOT NULL DEFAULT true, last_verified TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP); -- A trigger could verify currency is still tradeable before insertCREATE OR REPLACE FUNCTION check_currency_active()RETURNS TRIGGER AS $$BEGIN IF NOT EXISTS ( SELECT 1 FROM ref_currency WHERE currency_code = NEW.currency_code AND is_tradeable = true ) THEN RAISE EXCEPTION 'Currency % is not currently tradeable', NEW.currency_code; END IF; RETURN NEW;END;$$ LANGUAGE plpgsql;In the pure relational model, operations like comparison, union, and join should only occur between domain-compatible values. Understanding domain compatibility is essential for preventing semantic errors in queries.
Definition of Domain Compatibility:
Two attributes are domain-compatible if:
Domain-compatible operations:
=, <>, <, >, etc.)Domain-incompatible operations (should be prevented or carefully considered):
customer_id with product_id (different entity namespaces)employee.age with order.quantity (meaningless correlation)temperatures table with names table (incompatible semantics)| Attribute 1 | Attribute 2 | Compatible? | Reason |
|---|---|---|---|
employee.department_id | department.department_id | Yes | Same domain (department identifier) |
order.customer_id | customer.customer_id | Yes | Same domain (customer identifier) |
employee.age | dependent.age | Yes | Same domain (person age) |
product.price | product.cost | Yes | Same domain (monetary amount) |
employee.id | product.id | No | Different entity namespaces |
sensor.temperature | employee.salary | No | Unrelated measurements |
order.status | shipment.status | Maybe | Depends—may have different value sets |
created_at | updated_at | Yes | Same domain (timestamp) |
SQL does NOT prevent domain-incompatible comparisons at the type level. As long as the underlying data types match, SQL allows any comparison. You can write WHERE employee.age = order.quantity without error. This is a known limitation—rely on code reviews, naming conventions, and application-level checks.
12345678910111213141516171819202122232425262728293031
-- Domain-compatible operations (CORRECT) -- Two attributes from the same domain: department_identifierSELECT e.employee_name, d.department_nameFROM employees eJOIN departments d ON e.department_id = d.department_id; -- Same domain: ✓ -- Two attributes from the same domain: customer_identifierSELECT o.order_id, c.customer_nameFROM orders oJOIN customers c ON o.customer_id = c.customer_id; -- Same domain: ✓ -- Domain-INCOMPATIBLE operations (BUGS that SQL allows) -- BUG: Comparing employee ID with product ID (different entity namespaces)SELECT * FROM employees e, products pWHERE e.employee_id = p.product_id; -- Syntactically valid, semantically meaningless -- BUG: Joining on semantically unrelated attributesSELECT * FROM orders o, inventory iWHERE o.quantity = i.quantity; -- Both INT, but unrelated measurements -- MITIGATION: Use naming conventions to make domain membership obvious-- All identifiers: entity_id pattern-- All amounts: entity_amount pattern-- All timestamps: action_at pattern -- With consistent naming, code review can catch cross-domain bugsSELECT e.customer_id, p.product_id -- Different domains obvious from namesFROM ...WHERE e.customer_id = p.product_id; -- Reviewer should flag thisPractical Domain Compatibility Strategies:
Naming Convention Discipline — Use consistent naming patterns that indicate domain membership. All customer identifiers should be customer_id (not cust_id, custID, customer). This makes incompatible comparisons visually obvious.
Explicit Foreign Key Paths — When joining tables, follow clear foreign key relationships. Random cross-table comparisons are usually bugs.
Domain Documentation — In your data dictionary, document which attributes share domains. employee.salary and contractor.rate might both be monetary amounts from the same domain.
Query Review Practices — In code reviews, verify that all comparisons and joins occur between domain-compatible attributes. Question any comparison that doesn't follow an obvious relationship.
Database systems provide built-in domains (the standard data types) and allow definition of user-defined domains (custom types with constraints). Understanding when to use each is key to effective schema design.
| Category | Types | Use Cases |
|---|---|---|
| Numeric (Exact) | INTEGER, SMALLINT, BIGINT, DECIMAL(p,s), NUMERIC(p,s) | Counts, IDs, monetary values, precise calculations |
| Numeric (Approximate) | REAL, DOUBLE PRECISION, FLOAT | Scientific measurements, statistics, approximations |
| Character | CHAR(n), VARCHAR(n), TEXT | Names, descriptions, codes, free-form text |
| Binary | BINARY, VARBINARY, BLOB | Files, images, encryption keys, serialized objects |
| Temporal | DATE, TIME, TIMESTAMP, INTERVAL | Dates, times, durations, event timestamps |
| Boolean | BOOLEAN | Flags, binary states, yes/no attributes |
| UUID | UUID | Globally unique identifiers, distributed IDs |
| JSON/JSONB | JSON, JSONB | Semi-structured data, dynamic schemas, API payloads |
When to Create User-Defined Domains:
Multiple attributes share the same constraints — If email appears in 5 tables with the same pattern constraint, define an email_address domain once.
Business rules beyond basic types — If a percentage must be between 0 and 100, create a percentage domain rather than repeating CHECK constraints.
Semantic clarity is important — Even if two attributes are both VARCHAR(100), they may represent different domains (customer_name vs product_sku).
Centralized change management — When business rules change, updating a domain definition propagates to all uses.
Self-documenting schemas — salary monetary_amount NOT NULL is more informative than salary DECIMAL(15,2) NOT NULL.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657
-- User-defined domains for a business application -- Monetary values: precise decimal, always non-negativeCREATE DOMAIN monetary_amount AS DECIMAL(15, 2) CHECK (VALUE >= 0) CONSTRAINT monetary_amount_positive NOT NULL; -- Tax rates: percentage with boundsCREATE DOMAIN tax_rate AS DECIMAL(5, 4) CHECK (VALUE >= 0 AND VALUE <= 0.5) -- Up to 50% DEFAULT 0; -- Product SKU: specific format enforcedCREATE DOMAIN product_sku AS VARCHAR(20) CHECK (VALUE ~ '^[A-Z]{2,4}-[0-9]{3,6}$') NOT NULL; -- Email addresses: pattern validationCREATE DOMAIN email_address AS VARCHAR(255) CHECK (VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'); -- US phone numbers: standardized formatCREATE DOMAIN us_phone AS CHAR(12) CHECK (VALUE ~ '^[0-9]{3}-[0-9]{3}-[0-9]{4}$'); -- URL format: basic pattern validationCREATE DOMAIN url AS VARCHAR(2048) CHECK (VALUE ~* '^https?://[a-z0-9.-]+\.[a-z]{2,}(/.*)?$'); -- UUID: for distributed ID generationCREATE DOMAIN entity_uuid AS UUID DEFAULT gen_random_uuid(); -- Use domains in table definitionsCREATE TABLE product ( product_id entity_uuid PRIMARY KEY, sku product_sku UNIQUE, name VARCHAR(200) NOT NULL, unit_price monetary_amount, cost monetary_amount, tax_rate_code tax_rate); CREATE TABLE customer ( customer_id entity_uuid PRIMARY KEY, email email_address UNIQUE, phone us_phone, website url); -- Domain changes propagate automatically-- If tax policy changes, update the domain:ALTER DOMAIN tax_rate DROP CONSTRAINT tax_rate_check, ADD CONSTRAINT tax_rate_check CHECK (VALUE >= 0 AND VALUE <= 0.6);-- All tables using tax_rate now accept up to 60%Create a library of standard domains for your organization: monetary_amount, email_address, phone_number, percentage, entity_identifier, etc. Like a shared component library in frontend development, this promotes consistency and reduces errors across all database projects.
Domains can exist in hierarchical relationships where one domain is a subset of another. This concept of subdomains (or domain specialization) mirrors inheritance in object-oriented programming and has important practical applications.
Domain Hierarchy Examples:
STRINGS (all character strings)
├── IDENTIFIERS (strings used as unique identifiers)
│ ├── EMAIL_ADDRESSES (valid email format)
│ ├── PHONE_NUMBERS (valid phone format)
│ └── URLS (valid URL format)
├── NAMES (strings representing names)
│ ├── PERSON_NAMES (human names)
│ └── ORGANIZATION_NAMES (company names)
└── CODES (short string codes)
├── COUNTRY_CODES (ISO 2-letter)
├── CURRENCY_CODES (ISO 3-letter)
└── STATUS_CODES (application-specific)
NUMBERS (all numeric values)
├── INTEGERS (whole numbers)
│ ├── COUNTS (non-negative integers)
│ │ └── POSITIVE_COUNTS (strictly positive)
│ └── IDENTIFIERS (numeric IDs)
├── MONETARY_AMOUNTS (decimal with currency precision)
│ ├── PRICES (sale prices)
│ └── COSTS (purchase costs)
└── PERCENTAGES (0-100 or 0-1 range)
Implications of Domain Hierarchies:
Subtype Substitutability — A value from a subdomain can be used wherever the parent domain is expected. An EMAIL_ADDRESS can be stored in any IDENTIFIER field.
Constraint Inheritance — Subdomains inherit all constraints from parent domains and add additional constraints.
Semantic Refinement — Each level in the hierarchy adds semantic meaning while preserving type compatibility for operations.
Standard SQL doesn't directly support domain hierarchies. PostgreSQL's CREATE DOMAIN is flat—you cannot specify that one domain inherits from another. Object-relational databases (PostgreSQL's type system, Oracle's object types) offer some hierarchy support. For most projects, hierarchy is implemented through documentation and naming conventions.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859
-- Simulating domain hierarchy with layered domains -- Level 1: Base numeric domainCREATE DOMAIN positive_decimal AS DECIMAL(15, 2) CHECK (VALUE > 0); -- Level 2: Specialized monetary domain (inherits implied positivity)CREATE DOMAIN monetary_amount AS DECIMAL(15, 2) CHECK (VALUE >= 0); -- Monetary can be zero (unlike positive_decimal) -- Level 3: Further specialization for specific purposesCREATE DOMAIN sale_price AS DECIMAL(15, 2) CHECK (VALUE > 0); -- Prices must be positive CREATE DOMAIN discount_amount AS DECIMAL(15, 2) CHECK (VALUE >= 0); -- Discounts are non-negative -- Base string domain for identifiersCREATE DOMAIN identifier AS VARCHAR(255) CHECK (LENGTH(TRIM(VALUE)) > 0); -- Not empty or whitespace -- Specialized identifier domainsCREATE DOMAIN email_address AS VARCHAR(255) CHECK (VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'); CREATE DOMAIN url_address AS VARCHAR(2048) CHECK (VALUE ~* '^https?://'); -- Code domains with specific formatsCREATE DOMAIN country_code AS CHAR(2) CHECK (VALUE ~ '^[A-Z]{2}$'); CREATE DOMAIN currency_code AS CHAR(3) CHECK (VALUE ~ '^[A-Z]{3}$'); CREATE DOMAIN language_code AS VARCHAR(5) CHECK (VALUE ~ '^[a-z]{2}(-[A-Z]{2})?$'); -- en, en-US, fr-CA -- Using hierarchical domainsCREATE TABLE product ( product_id SERIAL PRIMARY KEY, name VARCHAR(200) NOT NULL, -- All monetary fields share domain semantics list_price sale_price NOT NULL, cost_price monetary_amount NOT NULL, discount discount_amount DEFAULT 0, -- Reference domains country_origin country_code, base_currency currency_code NOT NULL DEFAULT 'USD'); CREATE TABLE customer ( customer_id SERIAL PRIMARY KEY, email email_address NOT NULL UNIQUE, website url_address, locale language_code DEFAULT 'en-US');We've explored domains in depth—from mathematical foundations to practical implementation patterns. Let's consolidate the essential concepts:
What's Next:
With domains defined, we now examine atomic values—the fundamental requirement that every cell in a relation contains a single, indivisible value. This requirement is foundational to the First Normal Form and shapes how we model real-world complexity.
You now understand domains rigorously—as mathematical sets, as semantic type systems, and as practical constraint mechanisms. This understanding is essential for designing schemas with proper data integrity, writing correct queries, and building systems that maintain consistency as data scales.