Loading learning content...
A database without constraints is merely a data container—it accepts anything, guarantees nothing, and trusts everything. Constraints transform a passive container into an active guardian of data quality, automatically enforcing the rules that distinguish valid data from invalid.
Constraint notation provides the formal language for expressing these rules precisely. Without it, constraints remain ambiguous English descriptions prone to misinterpretation. With proper notation, constraints become unambiguous specifications that can be:
This page explores the notational conventions for all major constraint types, from fundamental key constraints to complex semantic rules.
After studying this page, you will be able to:
• Express key constraints using formal notation • Specify referential integrity constraints with actions • Write domain and check constraint specifications • Formalize functional and multivalued dependencies • Document complex business rules in constraint notation
Key constraints ensure tuple uniqueness—no two tuples can have identical values for key attributes. Formal notation precisely distinguishes between superkeys, candidate keys, and primary keys.
A superkey is any set of attributes that uniquely identifies tuples:
SK(R) = X means attribute set X is a superkey for relation R
Formal definition: X is a superkey of R if and only if: ∀t₁, t₂ ∈ r(R): t₁[X] = t₂[X] → t₁ = t₂
A candidate key is a minimal superkey (no proper subset is a superkey):
CK(R) = X means X is a candidate key for R
Formal definition: X is a candidate key of R iff:
The designated main identifier from among candidate keys:
PK(R) = X or R(..., X̲, ...) (underlined in schema)
1234567891011121314151617181920212223242526272829303132333435363738
// KEY CONSTRAINT NOTATION// ======================== // SchemaEMPLOYEE(emp_id, ssn, email, name, dept_id, salary) // Superkey Examples (not necessarily minimal)SK₁(EMPLOYEE) = {emp_id}SK₂(EMPLOYEE) = {ssn}SK₃(EMPLOYEE) = {email}SK₄(EMPLOYEE) = {emp_id, name} // Superkey (contains emp_id)SK₅(EMPLOYEE) = {ssn, email, name} // Superkey (contains ssn) // Candidate Keys (minimal superkeys)CK₁(EMPLOYEE) = {emp_id} // emp_id alone is uniqueCK₂(EMPLOYEE) = {ssn} // ssn alone is uniqueCK₃(EMPLOYEE) = {email} // email alone is unique // Primary Key SelectionPK(EMPLOYEE) = {emp_id} // Chosen as primary identifier // Alternate Keys (candidate keys not chosen as primary)AK(EMPLOYEE) = {{ssn}, {email}} // Composite Key ExampleENROLLMENT(student_id, course_id, semester, grade)PK(ENROLLMENT) = {student_id, course_id, semester} // Composite // Key Constraint in Schema NotationEMPLOYEE(emp_id*, ssn [UNIQUE], email [UNIQUE], name, dept_id, salary) // Formal Constraint Statement∀t₁, t₂ ∈ r(EMPLOYEE): t₁[emp_id] = t₂[emp_id] → t₁ = t₂ // PK uniqueness∀t₁, t₂ ∈ r(EMPLOYEE): t₁[ssn] = t₂[ssn] → t₁ = t₂ // UNIQUE constraint∀t₁, t₂ ∈ r(EMPLOYEE): t₁[email] = t₂[email] → t₁ = t₂ // UNIQUE constraint| Key Type | Notation | Property | Example |
|---|---|---|---|
| Superkey | SK(R) = X | Uniquely identifies (may not be minimal) | {emp_id, name} |
| Candidate Key | CK(R) = X | Minimal superkey | {emp_id} |
| Primary Key | PK(R) = X | Designated main identifier | {emp_id} |
| Alternate Key | AK(R) = X | Candidate key not chosen as PK | {ssn}, {email} |
| Composite Key | PK(R) = {A,B,C} | Multi-attribute key | {student_id, course_id} |
Referential integrity constraints ensure that foreign key values correspond to existing primary key values in referenced tables.
R.A → S.B or R[A] ⊆ S[B]
Meaning: Every value of attribute A in relation R must appear as a value of attribute B in relation S.
For FK constraint R.FK → S.PK:
∀t ∈ r(R): t[FK] ≠ NULL → ∃u ∈ r(S): t[FK] = u[PK]
Complete constraint specification includes update/delete actions:
R.A → S.B [ON DELETE action] [ON UPDATE action]
Where action ∈ {RESTRICT, CASCADE, SET NULL, SET DEFAULT, NO ACTION}
123456789101112131415161718192021222324252627282930313233343536373839404142
// REFERENTIAL INTEGRITY NOTATION// ================================ // Basic Foreign Key NotationORDER.customer_id → CUSTOMER.customer_idEMPLOYEE.dept_id → DEPARTMENT.dept_idORDER_ITEM.product_id → PRODUCT.product_id // Set Inclusion Notationπ_customer_id(ORDER) ⊆ π_customer_id(CUSTOMER)// All customer_ids in ORDER must exist in CUSTOMER // Complete Constraint SpecificationFK₁: ORDER.customer_id → CUSTOMER.customer_id ON DELETE RESTRICT ON UPDATE CASCADE FK₂: ORDER_ITEM.order_id → ORDER.order_id ON DELETE CASCADE ON UPDATE CASCADE FK₃: EMPLOYEE.manager_id → EMPLOYEE.emp_id // Self-referencing ON DELETE SET NULL ON UPDATE CASCADE // Nullable Foreign Key// Optionally references (NULL allowed)FK₄: EMPLOYEE.dept_id →? DEPARTMENT.dept_id -- dept_id can be NULL (employee may be unassigned) // Composite Foreign KeyORDER_ITEM(order_id, line_num, product_id, quantity)FK: ORDER_ITEM[order_id] → ORDER[order_id] ENROLLMENT(student_id, course_id, semester, grade)FK₁: ENROLLMENT[student_id] → STUDENT[student_id]FK₂: ENROLLMENT[course_id] → COURSE[course_id] // Constraint Verification Expression// "FK constraint ORDER.customer_id → CUSTOMER.customer_id is satisfied"∀t ∈ r(ORDER): t[customer_id] IS NULL ∨ t[customer_id] ∈ π_customer_id(CUSTOMER)Domain constraints restrict attribute values to valid ranges, while check constraints express arbitrary conditions on individual tuples.
dom(A) = D [WHERE condition]
Examples:
General format: CHECK(predicate) where predicate is a Boolean expression.
Relation-level: Constraint applies to all tuples in a relation.
Cross-relation: Constraint involves multiple relations (implemented via triggers).
12345678910111213141516171819202122232425262728293031323334353637383940
// DOMAIN AND CHECK CONSTRAINT NOTATION// ===================================== // Domain Constraintsdom(salary) = DECIMAL(10,2) WHERE salary ≥ 0dom(age) = INTEGER WHERE age ∈ [0, 150]dom(gender) = CHAR(1) WHERE gender ∈ {'M', 'F', 'X'}dom(rating) = DECIMAL(2,1) WHERE rating ∈ [0.0, 5.0]dom(percentage) = DECIMAL(5,2) WHERE 0 ≤ percentage ≤ 100 // Enumeration Domaindom(order_status) = {'pending', 'confirmed', 'shipped', 'delivered', 'cancelled'}dom(day_of_week) = {'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'} // Tuple-Level Check ConstraintsEMPLOYEE: CHECK(salary ≥ 0) CHECK(hire_date ≤ CURRENT_DATE) CHECK(birth_date < hire_date) // Must be born before hired ORDER: CHECK(total_amount ≥ 0) CHECK(ship_date IS NULL OR ship_date ≥ order_date) CHECK(status ∈ dom(order_status)) // Conditional ConstraintsEMPLOYEE: CHECK(salary ≥ min_salary) CHECK(title = 'Manager' → salary ≥ 80000) // Managers earn at least 80K CHECK(status = 'terminated' → termination_date IS NOT NULL) // Multi-Attribute ConstraintsEVENT: CHECK(end_date > start_date OR end_date IS NULL) CHECK(max_attendees IS NULL OR max_attendees > 0) CHECK(status = 'cancelled' → cancelled_by IS NOT NULL) // Formal Predicate Notation∀t ∈ r(EMPLOYEE): t[salary] ≥ 0∀t ∈ r(ORDER): t[ship_date] = NULL ∨ t[ship_date] ≥ t[order_date]Constraints operate at different granularities:
• Column-level: Domain constraints on single attributes • Tuple-level: CHECK constraints on individual rows • Table-level: UNIQUE, PRIMARY KEY constraints • Database-level: Foreign keys, cross-table assertions
Notation should clarify which level the constraint operates at.
Functional dependencies (FDs) express deterministic relationships between attribute sets. They are fundamental to normalization theory and database design.
X → Y (X functionally determines Y)
Meaning: If two tuples agree on X, they must agree on Y.
Formal definition: ∀t₁, t₂ ∈ r(R): t₁[X] = t₂[X] → t₁[Y] = t₂[Y]
F = {X₁ → Y₁, X₂ → Y₂, ...} — Set of FDs holding in relation R
F⁺ — Closure of F (all FDs derivable from F using Armstrong's axioms)
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
// FUNCTIONAL DEPENDENCY NOTATION// ================================ // Schema: EMPLOYEE(emp_id, ssn, name, dept_id, dept_name, salary) // Individual FDsemp_id → ssn, name, dept_id, salary // emp_id determines allssn → emp_id, name, dept_id, salary // ssn also determines alldept_id → dept_name // department determines its name // FD Set NotationF = { emp_id → ssn, emp_id → name, emp_id → dept_id, emp_id → salary, ssn → emp_id, dept_id → dept_name} // Compact notation (grouping RHS)F = { emp_id → {ssn, name, dept_id, salary}, ssn → {emp_id, name, dept_id, salary}, dept_id → dept_name} // Closure of Attribute Set// X⁺ = set of all attributes functionally determined by X{emp_id}⁺ = {emp_id, ssn, name, dept_id, salary, dept_name}{dept_id}⁺ = {dept_id, dept_name} // Minimal Cover / Canonical Cover// Irreducible set of FDs equivalent to FF_min = { emp_id → ssn, emp_id → name, emp_id → dept_id, emp_id → salary, ssn → emp_id, dept_id → dept_name} // Trivial vs Non-Trivial FDsemp_id → emp_id // Trivial (Y ⊆ X){emp_id, name} → name // Trivialemp_id → name // Non-trivial (useful) // Transitive Dependency (for 3NF analysis)emp_id → dept_id → dept_name// emp_id transitively determines dept_name via dept_idMultivalued dependencies (MVDs) express independence between sets of attributes—relevant for 4NF analysis.
X ↠ Y (X multidetermines Y)
Meaning: For a fixed X value, the Y values are independent of R-X-Y values.
Formal definition: In R(X, Y, Z) where Z = R - X - Y: ∀t₁, t₂ ∈ r(R): t₁[X] = t₂[X] → ∃t₃ ∈ r(R): t₃[X] = t₁[X] ∧ t₃[Y] = t₁[Y] ∧ t₃[Z] = t₂[Z]
Every FD implies an MVD: X → Y implies X ↠ Y
But not vice versa: X ↠ Y does not imply X → Y
1234567891011121314151617181920212223242526272829303132333435363738
// MULTIVALUED DEPENDENCY NOTATION// ================================= // Schema: COURSE_INFO(course_id, instructor, textbook)// One course can have multiple instructors AND multiple textbooks// But instructors and textbooks are INDEPENDENT // MVD Notationcourse_id ↠ instructorcourse_id ↠ textbook // Equivalently (complementation rule)course_id ↠ instructor | textbook// The | indicates the MVDs partition the non-key attributes // Example Instance showing MVDr(COURSE_INFO) = { (CS101, 'Dr. Smith', 'Database Systems'), (CS101, 'Dr. Smith', 'SQL Fundamentals'), (CS101, 'Dr. Jones', 'Database Systems'), (CS101, 'Dr. Jones', 'SQL Fundamentals')}// Note: For CS101, EVERY instructor is paired with EVERY textbook// This is the hallmark of a multivalued dependency // MVD violations occur when pairs are incomplete// Violation example:r_bad = { (CS101, 'Dr. Smith', 'Database Systems'), (CS101, 'Dr. Jones', 'SQL Fundamentals')}// This violates course_id ↠ instructor because not all combinations exist // Join Dependency (generalization of MVD)// *(R₁, R₂, ..., Rₙ) means R decomposes losslessly into R₁, R₂, ..., Rₙ*(COURSE_INSTRUCTOR, COURSE_TEXTBOOK)where COURSE_INSTRUCTOR(course_id, instructor) COURSE_TEXTBOOK(course_id, textbook)Beyond structural constraints, databases must enforce complex business rules—semantic constraints that reflect organizational policies.
Database-wide constraints spanning multiple tables:
ASSERT constraint_name: predicate
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
// COMPLEX BUSINESS RULE NOTATION// ================================= // CARDINALITY CONSTRAINTS// -----------------------// "Each department must have 1-10 employees"ASSERT dept_size: ∀d ∈ r(DEPARTMENT): 1 ≤ |{e ∈ r(EMPLOYEE) : e[dept_id] = d[dept_id]}| ≤ 10 // AGGREGATE CONSTRAINTS// ---------------------// "Order total must equal sum of line items"ASSERT order_total_correct: ∀o ∈ r(ORDER): o[total_amount] = Σ{li[quantity] × li[unit_price] : li ∈ r(ORDER_LINE) ∧ li[order_id] = o[order_id]} // "Total salaries per dept cannot exceed budget"ASSERT salary_budget: ∀d ∈ r(DEPARTMENT): Σ{e[salary] : e ∈ r(EMPLOYEE) ∧ e[dept_id] = d[dept_id]} ≤ d[budget] // EXISTENCE CONSTRAINTS// ---------------------// "Every manager must also be an employee"ASSERT manager_is_employee: ∀d ∈ r(DEPARTMENT): d[manager_id] IS NULL ∨ ∃e ∈ r(EMPLOYEE): e[emp_id] = d[manager_id] // TEMPORAL CONSTRAINTS// --------------------// "Project end_date must be after start_date"∀p ∈ r(PROJECT): p[end_date] IS NULL ∨ p[end_date] > p[start_date] // "Employee cannot be assigned to overlapping projects"∀a₁, a₂ ∈ r(ASSIGNMENT): a₁[emp_id] = a₂[emp_id] ∧ a₁[project_id] ≠ a₂[project_id] → INTERVAL(a₁[start], a₁[end]) ∩ INTERVAL(a₂[start], a₂[end]) = ∅ // CONDITIONAL CONSTRAINTS// -----------------------// "Gold customers get at least 10% discount"∀o ∈ r(ORDER): (∃c ∈ r(CUSTOMER): c[id] = o[customer_id] ∧ c[tier] = 'Gold') → o[discount_percent] ≥ 10 // "Hazardous products require special shipping"∀oi ∈ r(ORDER_ITEM): (∃p ∈ r(PRODUCT): p[id] = oi[product_id] ∧ p[hazardous] = TRUE) → (∃o ∈ r(ORDER): o[id] = oi[order_id] ∧ o[shipping_class] = 'Hazmat')Most complex business rules cannot be expressed as standard SQL constraints. They require triggers, stored procedures, or application-layer enforcement. However, formal notation in documentation ensures the rules are unambiguously specified regardless of implementation mechanism.
You can now express database constraints using formal notation—from simple key constraints to complex business rules. Next, we explore diagrammatic representation techniques that visualize schemas and relationships.