Requirements Gathering - Learning Module

Constraint Identification: Defining the Rules of Data Integrity

Constraints: The Guardians of Data Quality

Data without constraints is chaos. Imagine a database where customer ages can be negative, order quantities can exceed available inventory, employees can report to themselves, and account balances can violate business policy. Such a system would be worse than useless: it would actively mislead.

Constraints are the rules that separate valid data from invalid data. They encode the business logic that ensures every record in the database represents something real, possible, and acceptable within the organization's context.

Constraint identification is the systematic process of discovering, documenting, and specifying these rules. It ensures that the database enforces data integrity—preventing errors at the source rather than detecting them after damage is done.

What You Will Learn

By the end of this page, you will understand how to identify and specify constraints for database design. You'll learn to recognize different constraint types, extract constraints from interviews and documents, formally specify constraints, and understand how constraints are implemented in database systems.

Understanding Constraints in Database Systems

A constraint is a rule that restricts the values that can be stored in a database. Constraints ensure that data conforms to business reality and organizational policies.

Why Constraints Matter:

Data Quality: Constraints prevent invalid data from entering the database. An age cannot be -5. A hire date cannot be in the future. An order cannot reference a non-existent customer.

Business Rule Enforcement: Constraints encode business policies. A credit limit cannot be exceeded. An employee cannot be their own manager. A product cannot be sold below minimum price.

Referential Integrity: Constraints maintain relationships between entities. Every order must belong to a valid customer. Every line item must reference a valid product.

Consistency Guarantees: Constraints prevent inconsistent states. If a warehouse is deleted, what happens to its inventory? Constraints define and enforce these rules.

Constraint Categories Overview
Category	Description	Examples
Domain Constraints	Restrict values to valid sets for an attribute	Status ∈ {Active, Inactive, Suspended}; Age ≥ 0; SSN format validation
Entity Constraints	Rules about individual entity instances	Every employee must have a unique, non-null ID
Key Constraints	Uniqueness and identification rules	CustomerID is unique; OrderNumber is unique per year
Referential Constraints	Relationships between entities	Every Order must reference a valid Customer
Semantic Constraints	Business meaning and logic rules	EndDate must be after StartDate; ReorderPoint < MaxStock
Temporal Constraints	Time-based rules	RetirementDate > HireDate + 20 years; ContractEnd ≤ 5 years from ContractStart
Cardinality Constraints	Quantity limits on relationships	Each Department has 1-3 Managers; each Project has 2-10 Team Members

Constraints vs. Validation

There is a distinction between database constraints (enforced by the DBMS on every write, whichever application issues it) and application validation (enforced by application code, so any client that skips that code bypasses the rule). During requirements, capture all rules regardless of where they will be implemented. The design phase decides what goes where.

Domain Constraints: Valid Value Sets

Domain constraints define the set of valid values for each attribute. They are the most fundamental constraints because they establish what data is acceptable at the most basic level.

Types of Domain Constraints:

Domain Constraint Types
Type	Description	Examples	SQL Implementation
Data Type	Basic type of value	Integer, String, Date, Boolean	Column data type declaration
Length	Size limits for text/binary	FirstName max 50 chars	VARCHAR(50)
Range	Numeric bounds	Quantity: 1-10000; Percentage: 0-100	CHECK (Quantity BETWEEN 1 AND 10000)
Enumeration	Fixed list of valid values	Status: {Draft, Active, Archived}	CHECK (Status IN ('Draft', 'Active', 'Archived'))
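Pattern	Format specification	PhoneNumber: (NNN) NNN-NNNN	CHECK (Phone LIKE '([0-9][0-9][0-9]) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]') in SQL Server; LIKE has no {n} repetition, so other DBMSs need a regular-expression operator instead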
Nullability	Whether NULL is allowed	CustomerName cannot be null	NOT NULL
Default	Value when not specified	CreatedDate defaults to today	DEFAULT CURRENT_TIMESTAMP (standard SQL) or DEFAULT GETDATE() (SQL Server)
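
As a rough illustration of how several rows of this table combine in one table definition, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The OrderLine table and its columns are hypothetical, and SQLite is only a stand-in: it does not enforce VARCHAR lengths, so the length limit is written as a CHECK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE OrderLine (                      -- hypothetical table
        OrderLineID INTEGER PRIMARY KEY,
        ProductName TEXT NOT NULL                 -- nullability
                    CHECK (length(ProductName) <= 50),          -- length
        Quantity    INTEGER NOT NULL
                    CHECK (Quantity BETWEEN 1 AND 10000),       -- range
        Status      TEXT NOT NULL DEFAULT 'Draft'               -- default
                    CHECK (Status IN ('Draft', 'Active', 'Archived')),  -- enumeration
        CreatedDate TEXT NOT NULL DEFAULT CURRENT_DATE
    )
""")

def try_insert(name, quantity, status="Draft"):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO OrderLine (ProductName, Quantity, Status) VALUES (?, ?, ?)",
                (name, quantity, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling try_insert("Widget", 0) or try_insert("Widget", 5, "Deleted") is rejected by the DBMS itself, which is the behavior the requirements document should describe for each attribute.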

Constraint Extraction from Requirements:

Domain constraints emerge from various sources:

From Forms:

Dropdown menus define enumerations
Input field lengths suggest max character limits
Required field markers indicate NOT NULL
Input masks reveal format patterns

From Documents:

Data dictionaries specify types and lengths
Data standards define formats (ISO date, currency codes)
Policy documents constrain valid values

From Interviews:

"The quantity ordered must be at least 1"
"Status can be Pending, Approved, or Rejected"
"Email is required for all online customers"

From Existing Systems:

Legacy database column definitions
Application validation code
Error messages indicating rejected values

Domain Constraint Specification Template

•Attribute: CustomerAge
•Data Type: Integer
•Minimum Value: 0 (newborns for insurance)
•Maximum Value: 150 (practical upper bound)
•Nullability: Optional (may be unknown)
•Default: NULL (no assumption made)
•Business Rule: Must be provided before the senior discount (age 65 and over) can be applied

Enumerations Need Maintenance Plans

Enumerated domains (lists of valid values) often change over time. New statuses are added. Country codes change. Product categories evolve. Document not just today's valid values but also: Who can modify the list? What happens to existing data when values are removed? Consider lookup tables instead of hard-coded CHECK constraints.
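
A minimal sketch of the lookup-table alternative, again using sqlite3 as a stand-in DBMS. The OrderStatus and CustomerOrder tables and the IsActive flag are assumptions made for illustration; the point is that adding a value becomes a data change rather than a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE OrderStatus (                  -- hypothetical lookup table
        StatusCode TEXT PRIMARY KEY,
        IsActive   INTEGER NOT NULL DEFAULT 1   -- retired values can stay for old rows
    );
    CREATE TABLE CustomerOrder (
        OrderID    INTEGER PRIMARY KEY,
        StatusCode TEXT NOT NULL REFERENCES OrderStatus (StatusCode)
    );
    INSERT INTO OrderStatus (StatusCode) VALUES ('Pending'), ('Approved'), ('Rejected');
""")

def add_order(status):
    """Return True if an order with this status is accepted."""
    try:
        with conn:
            conn.execute("INSERT INTO CustomerOrder (StatusCode) VALUES (?)", (status,))
        return True
    except sqlite3.IntegrityError:
        return False

def add_status(status):
    """Extend the enumeration by inserting a row; no ALTER TABLE needed."""
    with conn:
        conn.execute("INSERT INTO OrderStatus (StatusCode) VALUES (?)", (status,))
```

Who is allowed to call something like add_status, and what happens to rows that use a retired value, are exactly the maintenance questions to record during requirements.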

Key Constraints: Uniqueness and Identity

Key constraints ensure that entity instances can be uniquely identified and that no two instances share the same identity.

Types of Key Constraints:

Key Constraint Types
Constraint Type	Description	Example
Primary Key	Main unique identifier for an entity; cannot be NULL	Customer.CustomerID is primary key
Unique	Alternative unique identifier; may be nullable	Customer.Email must be unique (when provided)
Composite Key	Combination of attributes that together are unique	(OrderID, LineNumber) is unique in OrderItem
Surrogate Key	System-generated unique identifier	Auto-increment ID, GUID
Natural Key	Business-meaningful unique identifier	ISBN for books, SSN for US persons

Complex Uniqueness Constraints:

Uniqueness is not always global. Some constraints are conditional:

Scoped Uniqueness:

ProductCode is unique within a Supplier (not globally)
Employee email is unique within a Department
Room number is unique within a Building

Implementation: Composite unique constraint on (ScopeKey, ValueColumn)

Temporal Uniqueness:

Only one active price for a product at any time
Only one current manager per department at any point

Implementation: Requires checking overlap of date ranges

Conditional Uniqueness:

SSN is unique BUT only when not null
External ID is unique BUT only for external customers

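Implementation: Partial (filtered) unique indexes. Note that most DBMSs already allow multiple NULLs in a UNIQUE column, while SQL Server permits only one, so confirm the target system's behavior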
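
The scoped and conditional cases can be sketched as follows, assuming SQLite (via Python's sqlite3) as the DBMS. The table layouts and the index name are hypothetical; partial-index syntax differs between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Scoped uniqueness: ProductCode is unique within a Supplier
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        SupplierID  INTEGER NOT NULL,
        ProductCode TEXT NOT NULL,
        UNIQUE (SupplierID, ProductCode)
    );
    -- Conditional uniqueness: ExternalID is unique only for external customers
    CREATE TABLE Customer (
        CustomerID   INTEGER PRIMARY KEY,
        CustomerType TEXT NOT NULL,
        ExternalID   TEXT
    );
    CREATE UNIQUE INDEX ux_customer_external_id
        ON Customer (ExternalID) WHERE CustomerType = 'External';
""")

def accepted(sql, params):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, two suppliers may both use code 'A', and internal customers may share an ExternalID, but a second external customer with the same ExternalID is rejected.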

Key Constraint Specification Example

•Entity: Product
•Primary Key: ProductID (surrogate, auto-generated)
•Alternate Key 1: SKU (must be unique globally)
•Alternate Key 2: (SupplierID, SupplierProductCode) — each supplier's codes are unique
•Uniqueness Rule: UPC code is unique when provided (optional field)
•Constraint Note: Historical products may have duplicate names (uniqueness only for active products)

Case Sensitivity in Uniqueness

Don't overlook case sensitivity in uniqueness constraints. Is 'john@email.com' the same as 'John@Email.com'? For email addresses, typically yes (case-insensitive). For product codes, often no (case-sensitive). Document the intended behavior explicitly.
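
One way to make the intended behavior explicit is to declare it on the column. The sketch below assumes SQLite, where COLLATE NOCASE folds ASCII letters only; other DBMSs use different collation names or a unique index on a lower-cased expression. The tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Email      TEXT COLLATE NOCASE UNIQUE   -- case-insensitive uniqueness
    );
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductCode TEXT UNIQUE                 -- default binary (case-sensitive) comparison
    );
""")

def accepted(sql, value):
    """Run one single-column INSERT and report whether it was accepted."""
    try:
        with conn:
            conn.execute(sql, (value,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Here 'John@Email.com' is rejected once 'john@email.com' exists, while product codes 'ab-1' and 'AB-1' are treated as different values.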

Referential Integrity Constraints

Referential integrity constraints maintain the consistency of relationships between entities. They ensure that references always point to valid, existing records.

The Core Principle:

If entity A references entity B (foreign key relationship), then:

The referenced record in B must exist (on insert/update of A)
When the referenced record in B is modified/deleted, A must be handled appropriately

Referential Action Options
Action	On Delete	On Update	Use Case
RESTRICT / NO ACTION	Prevent deletion if references exist	Prevent update if references exist	Protect critical data; force explicit handling
CASCADE	Delete referencing records	Update foreign key values	Dependent data that has no meaning without parent
SET NULL	Set foreign key to NULL	Set foreign key to NULL	Optional relationships; preserve orphaned data
SET DEFAULT	Set foreign key to default value	Set foreign key to default value	Reassign to default category/owner

Referential Constraint Specification:

For each relationship, document:

RELATIONSHIP: Order references Customer

Foreign Key: Order.CustomerID → Customer.CustomerID

Mandatory: Yes (every order must have a customer)

On Delete of Customer:
  - Action: RESTRICT
  - Reason: Cannot delete customers with order history
  - Alternative: Soft-delete customer, keep orders accessible

On Update of Customer.CustomerID:
  - Action: CASCADE (if surrogate key changes are possible)
  - Reason: Orders should follow customer identity
  - Note: Surrogate keys should not change; this is precautionary

Business Rule:
  - Orders can only be placed by Active customers
  - When customer becomes Inactive, existing orders remain valid
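
The specification above might translate into DDL roughly like this. It is a sketch on SQLite through Python's sqlite3 module; CustomerOrder stands in for Order (a reserved word in SQL), and the OrderItem cascade is an added assumption to show a contrasting action.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE CustomerOrder (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL              -- Mandatory: Yes
                   REFERENCES Customer (CustomerID)
                   ON DELETE RESTRICT ON UPDATE CASCADE
    );
    CREATE TABLE OrderItem (                     -- assumption: items die with their order
        OrderID    INTEGER NOT NULL REFERENCES CustomerOrder (OrderID) ON DELETE CASCADE,
        LineNumber INTEGER NOT NULL,
        PRIMARY KEY (OrderID, LineNumber)
    );
    INSERT INTO Customer VALUES (1, 'Acme');
    INSERT INTO CustomerOrder VALUES (100, 1);
    INSERT INTO OrderItem VALUES (100, 1), (100, 2);
""")

def run(sql, params=()):
    """Execute one statement; return True if accepted, False if integrity is violated."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

Deleting customer 1 fails while order 100 exists; deleting order 100 removes its two items automatically. The business rule about Active customers is not covered by these foreign keys and would need a trigger or application check.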

Complex Referential Scenarios:

Self-Referential:

Employee.ManagerID → Employee.EmployeeID
Must prevent cycles (A manages B manages A)

Multiple References:

Order.BillingAddressID → Address.AddressID
Order.ShippingAddressID → Address.AddressID
Different cascade rules may apply to each

Polymorphic References:

Comment.EntityType + Comment.EntityID can reference Order, Product, or Customer
Cannot be enforced by simple foreign key; requires application or trigger logic
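
For the self-referential case, a foreign key alone cannot rule out cycles. The check below is a minimal sketch of the logic a trigger or application layer would need; the function name and the dictionary representation of the hierarchy are illustrative assumptions.

```python
def would_create_cycle(manager_of, employee_id, new_manager_id):
    """Return True if making new_manager_id the manager of employee_id
    would put employee_id into their own reporting chain.

    manager_of maps EmployeeID -> ManagerID (None at the top of the hierarchy).
    """
    seen = set()
    current = new_manager_id
    while current is not None:
        if current == employee_id:
            return True              # walked back to the employee: cycle
        if current in seen:
            return True              # data already contains a cycle; refuse the change
        seen.add(current)
        current = manager_of.get(current)
    return False


# Usage: 3 reports to 2, 2 reports to 1, 1 is the top of the hierarchy.
hierarchy = {1: None, 2: 1, 3: 2}
print(would_create_cycle(hierarchy, 1, 3))  # True: 1 -> 3 -> 2 -> 1
print(would_create_cycle(hierarchy, 3, 1))  # False: 3 may report directly to 1
```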

The Orphan Prevention Strategy

Orphaned records—data that should reference something but doesn't—are a major data quality problem. Referential integrity prevents future orphans, but existing databases may already have them. During requirements, document both the constraint (prevent new orphans) and any cleanup needs (handle existing orphans).

Semantic Constraints: Business Rules and Logic

Semantic constraints encode business meaning and logic that goes beyond simple domain or referential rules. They ensure data makes sense in the context of the business.

Categories of Semantic Constraints:

Semantic Constraint Types

•Cross-Attribute Constraints — Rules involving multiple attributes of the same entity. Example: EndDate must be after StartDate
•Cross-Entity Constraints — Rules involving attributes from different entities. Example: Order.ShipDate cannot be before Customer.RegistrationDate
•Aggregate Constraints — Rules about totals or summaries. Example: Sum of line item amounts must equal order total; Sum of budget allocations cannot exceed department budget
•Conditional Constraints — Rules that apply only under certain conditions. Example: SSN required if CustomerType = 'Individual'; ApproverID required if ExpenseAmount > $1000
•State-Dependent Constraints — Rules based on entity status. Example: Shipped orders cannot be modified; Only pending applications can be cancelled

Semantic Constraint Examples:

CONSTRAINT: Order Total Consistency

Rule: The sum of OrderItem.LineTotal for an Order must equal Order.SubTotal,
      and Order.Total must equal Order.SubTotal + Order.Tax + Order.Shipping - Order.Discount

Enforcement: Trigger or application logic
            (too complex for CHECK constraint)

Violation Action: Reject transaction; log discrepancy

---

CONSTRAINT: Manager Level Hierarchy

Rule: An employee's manager must have a higher Level than the employee
      (Level 1 reports to Level 2+; Level 2 reports to Level 3+)

Exception: CEO (highest level) has no manager (ManagerID is NULL)

Enforcement: Trigger on INSERT and UPDATE of Employee

---

CONSTRAINT: Inventory Availability

Rule: Cannot ship more than available inventory
      Shipment.Quantity ≤ Product.QuantityOnHand - Product.QuantityReserved

Enforcement: Trigger or transactional application logic with a reservation system
             (most DBMSs do not allow subqueries in CHECK constraints)

Concurrency Note: Requires locking or optimistic concurrency control
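
To show what trigger enforcement can look like, here is a sketch of the Manager Level Hierarchy rule on SQLite via Python's sqlite3 module. Trigger syntax varies considerably between DBMSs, the trigger names are hypothetical, and this version checks only the row being written: it would not notice a manager whose Level is later lowered below a subordinate's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,
        Level      INTEGER NOT NULL,
        ManagerID  INTEGER REFERENCES Employee (EmployeeID)  -- NULL for the CEO
    );
    CREATE TRIGGER trg_employee_level_ins
    BEFORE INSERT ON Employee
    WHEN NEW.ManagerID IS NOT NULL
    BEGIN
        SELECT RAISE(ABORT, 'Manager must have a higher Level than the employee')
        WHERE (SELECT Level FROM Employee WHERE EmployeeID = NEW.ManagerID) <= NEW.Level;
    END;
    CREATE TRIGGER trg_employee_level_upd
    BEFORE UPDATE OF Level, ManagerID ON Employee
    WHEN NEW.ManagerID IS NOT NULL
    BEGIN
        SELECT RAISE(ABORT, 'Manager must have a higher Level than the employee')
        WHERE (SELECT Level FROM Employee WHERE EmployeeID = NEW.ManagerID) <= NEW.Level;
    END;
""")

def add_employee(employee_id, level, manager_id):
    """Return True if the insert is accepted, False if a rule rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO Employee VALUES (?, ?, ?)",
                         (employee_id, level, manager_id))
        return True
    except sqlite3.DatabaseError:
        return False
```

The extra SELECT inside each trigger runs on every INSERT and on every UPDATE of Level or ManagerID, which is the enforcement cost that should be discussed during design.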

Semantic Constraint Documentation Template
Element	Description
Constraint Name	Unique identifier for the constraint
Description	Plain English explanation of the rule
Formal Expression	Logical or mathematical formulation
Entities Involved	Which entities/tables are affected
Trigger Events	What operations activate the constraint (INSERT, UPDATE, DELETE)
Enforcement Level	Database constraint, trigger, or application code
Violation Response	Reject operation, warn, log, or other action
Exceptions	Conditions where the rule doesn't apply
Source	Where this requirement came from (interview, policy, regulation)

Complex Constraints Are Expensive

Complex semantic constraints that span multiple tables or require aggregation are expensive to enforce. Every INSERT, UPDATE, or DELETE may require additional queries. Document the expected frequency of constraint-related operations and discuss performance implications during design.

Temporal Constraints: Time-Based Rules

Temporal constraints govern the time-related aspects of data—validity periods, sequences, durations, and date-based rules.

Types of Temporal Constraints:

Temporal Constraint Categories
Category	Description	Examples
Sequence Constraints	Dates must occur in a specific order	StartDate < EndDate; HireDate < TerminationDate
Duration Constraints	Time spans must fall within limits	Contract duration: 1-5 years; Trial period: max 90 days
Currency Constraints	Rules about current vs. historical data	Only one active price per product at a time
Future/Past Constraints	Rules about dates relative to now	BirthDate must be in the past; ScheduledDate must be in the future
Calendar Constraints	Rules about specific dates/times	Appointments only on business days; No deliveries on holidays
Overlap Constraints	Prevent or require overlapping periods	Room bookings cannot overlap; Employee assignments must have no gaps

Temporal Overlap Prevention:

A common requirement is preventing overlapping time periods. For example:

Only one room booking at a time
Only one price effective for a product at a time
Only one assignment per employee at a time

Non-Overlap Constraint:

For records with (EntityID, StartDate, EndDate), ensure no two records for the same EntityID have overlapping periods:

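Two periods (S1, E1) and (S2, E2) overlap if: S1 < E2 AND S2 < E1. This treats each period as half-open (start included, end excluded), so a period that ends exactly when another begins does not count as an overlap.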

Implementation Approaches:

Trigger that queries for overlaps before insert/update
EXCLUDE constraint (PostgreSQL) with range types
Application-level validation
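
The overlap test is small enough to sketch directly. The version below is the application-level variant, with periods treated as half-open so that back-to-back bookings are allowed; the function names are illustrative.

```python
from datetime import date

def periods_overlap(s1, e1, s2, e2):
    """True if [s1, e1) and [s2, e2) share at least one instant."""
    return s1 < e2 and s2 < e1

def find_conflicts(existing, new_start, new_end):
    """Return the existing (start, end) periods that overlap the proposed one."""
    return [(s, e) for s, e in existing if periods_overlap(s, e, new_start, new_end)]


# Usage: one booking already exists for 1-10 January.
bookings = [(date(2024, 1, 1), date(2024, 1, 10))]
print(find_conflicts(bookings, date(2024, 1, 5), date(2024, 1, 12)))   # one conflict
print(find_conflicts(bookings, date(2024, 1, 10), date(2024, 1, 15)))  # [] (back-to-back)
```

A check like this is only safe under concurrency if the read and the insert happen in one transaction with suitable locking, which is why a DBMS-level mechanism is preferable where available.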

Temporal Validity Tracking:

Many entities track when facts are valid:

Transaction Time: When the fact was recorded in the database
Valid Time: When the fact was true in the real world

A bi-temporal database tracks both, enabling queries like:

"What was the customer's address on January 1st?" (valid time)
"What address did we have on file when we processed this order?" (transaction time)

Temporal Constraint Specification Example

•Entity: ProductPrice
•Temporal Fields: EffectiveFrom (DATE), EffectiveTo (DATE nullable)
•Sequence Constraint: EffectiveFrom < EffectiveTo (when both populated)
•Non-Overlap Constraint: No two prices for same product have overlapping effective periods
•Currency Rule: At most one price per product where EffectiveTo is NULL (the open-ended price)
•Gap Allowance: Gaps are permitted (product may be unavailable during gap)
•Future Pricing: Prices can be created for future effective dates

Timezone Considerations

For global systems, temporal constraints must account for timezones. 'Before 5:00 PM' depends on which timezone. Document whether times are stored in UTC, local time, or timezone-aware formats, and how timezone conversions affect constraint validation.

Cardinality and Participation Constraints

Cardinality constraints specify the minimum and maximum number of entity instances that can participate in a relationship.

Cardinality Notation:

Cardinality is expressed as (min, max) where:

min = minimum number of related instances (0 = optional, 1+ = mandatory)
max = maximum number of related instances (1 = single, N or * = unlimited, or specific number)

Common Cardinality Patterns
Cardinality	Meaning	Example
(0,1) : (0,1)	Optional one-to-one	Person optionally has Passport; Passport optionally has Person
(1,1) : (0,N)	Mandatory one, optional many	Each Order has exactly one Customer; Customer may have zero or more Orders
(0,N) : (0,N)	Optional many-to-many	Products may be in Categories; Categories may contain Products
(1,1) : (1,N)	Mandatory on both sides	Each OrderItem belongs to exactly one Order; Each Order has at least one OrderItem
(1,3) : (0,N)	Specific range	Each Project has 1-3 Managers; Managers may manage any number of Projects
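
A (min, max) rule can also be checked against existing data, which is useful when profiling a legacy system before the constraint is enforced. The helper below is a hypothetical sketch that counts related instances per parent and reports those outside the allowed range.

```python
from collections import Counter

def cardinality_violations(parent_ids, links, minimum, maximum=None):
    """Return {parent_id: count} for parents whose number of related
    instances is below minimum or above maximum (None means unlimited).

    links is an iterable of (parent_id, child_id) pairs.
    """
    counts = Counter(parent for parent, _child in links)
    violations = {}
    for parent_id in parent_ids:
        n = counts.get(parent_id, 0)
        if n < minimum or (maximum is not None and n > maximum):
            violations[parent_id] = n
    return violations


# Usage: each Project has 1-3 Managers.
projects = ["P1", "P2", "P3"]
assignments = [("P1", "M1"), ("P3", "M1"), ("P3", "M2"), ("P3", "M3"), ("P3", "M4")]
print(cardinality_violations(projects, assignments, 1, 3))  # {'P2': 0, 'P3': 4}
```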

Enforcing Minimum Cardinality:

Maximum cardinality of 1 is straightforward (unique constraint). Minimum cardinality > 0 (mandatory participation) is more challenging:

For 'each X must have at least one Y':

Cannot simply use NOT NULL on the foreign key: that makes every Y require an X, which is the opposite direction
Requires either:
- Deferred constraint checking (validate at commit time instead of after each statement)
- Trigger that prevents deleting the last Y or requires Y creation with X
- Application logic enforcement

Example Constraint:

CONSTRAINT: Order Must Have Items

Rule: Every Order must have at least one OrderItem

Problem: How to insert Order before OrderItems exist?

Solution 1: Deferred constraint
  - Begin transaction
  - Insert Order
  - Insert OrderItems
  - Commit (constraint checked here)

Solution 2: Business rule enforcement
  - Application ensures Order + first Item created together
  - Prevent deleting last OrderItem
  - Prevent creating Order without Items
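
Solution 2 could look roughly like the sketch below, using SQLite through Python's sqlite3 module. The create_order function, the trigger name, and the CustomerOrder table name (standing in for Order) are assumptions; the application creates the order and its items in one transaction, and a trigger guards the last item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE CustomerOrder (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER NOT NULL);
    CREATE TABLE OrderItem (
        OrderID    INTEGER NOT NULL REFERENCES CustomerOrder (OrderID),
        LineNumber INTEGER NOT NULL,
        ProductID  INTEGER NOT NULL,
        PRIMARY KEY (OrderID, LineNumber)
    );
    -- Prevent deleting the last remaining item of an order
    CREATE TRIGGER trg_keep_last_item
    BEFORE DELETE ON OrderItem
    WHEN (SELECT COUNT(*) FROM OrderItem WHERE OrderID = OLD.OrderID) <= 1
    BEGIN
        SELECT RAISE(ABORT, 'An order must keep at least one item');
    END;
""")

def create_order(customer_id, product_ids):
    """Create an order together with its items; refuse an empty order."""
    if not product_ids:
        raise ValueError("An order needs at least one item")
    with conn:  # single transaction: commit on success, roll back on any error
        cur = conn.execute(
            "INSERT INTO CustomerOrder (CustomerID) VALUES (?)", (customer_id,))
        order_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO OrderItem (OrderID, LineNumber, ProductID) VALUES (?, ?, ?)",
            [(order_id, n, pid) for n, pid in enumerate(product_ids, start=1)],
        )
    return order_id
```

The rule "no Order without Items" holds here only for clients that go through create_order, which illustrates why the enforcement level belongs in the constraint documentation.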

Cardinality Documentation

For each relationship, document both the inherent cardinality (can there be multiple?) and the business cardinality (should there be limits?). A one-to-many relationship might have a business rule like 'maximum 5 addresses per customer' even though technically unlimited addresses are possible.

Identifying Constraints from Requirements Sources

Constraints are embedded throughout requirements sources. Systematic extraction ensures none are missed.

Constraint Indicators by Source
Source	Look For	Constraint Example
Interview Notes	Words like 'must', 'cannot', 'always', 'never', 'only if', 'at least', 'no more than'	"Customers must have an email address"
Forms	Required field markers, input validation, dropdown values, format masks	Asterisk on field → NOT NULL constraint
Reports	Grouping assumptions, filter expectations, data relationships	Report groups by region → Region is required attribute
Policies	Rules, restrictions, approvals, limits	"Expenses over $500 require manager approval"
Regulations	Compliance mandates, retention rules, privacy requirements	"Medical records retained for 7 years minimum"
Error Messages	Current system rejection criteria	"Error: End date must be after start date"
Exception Procedures	What happens when rules are violated	"If inventory negative, create backorder"

Constraint Extraction Techniques:

Trigger Word Analysis: Scan documents for constraint indicator words:

"Must" / "Must not" → Mandatory constraint
"Should" → Suggested but not enforced (clarify if it becomes a constraint)
"Cannot" / "Never" → Prohibition constraint
"Only when" / "Only if" → Conditional constraint
"At least" / "At most" → Cardinality or range constraint
"Before" / "After" → Temporal sequence constraint
"Unique" / "One per" → Uniqueness constraint

Negative Case Inquiry: During interviews, ask explicitly:

"What values would be invalid for this field?"
"What scenarios should the system prevent?"
"What happens if someone tries to do X?"
"Are there exceptions to this rule?"

Edge Case Exploration:

What if the quantity is zero? Negative?
What if the dates are equal?
What if the reference doesn't exist?
What if there are no related records?
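
Trigger word analysis lends itself to a first automated pass over interview notes or policy text. The sketch below is a deliberately simple keyword matcher, not a substitute for reading the sentence; the labels and the classify function are illustrative assumptions based on the indicator list above.

```python
import re

# (pattern, suggested constraint kind), following the trigger-word list
INDICATORS = [
    (r"\bmust\b", "Mandatory"),                      # also matches "must not"
    (r"\bshould\b", "Needs clarification"),
    (r"\bcannot\b|\bnever\b", "Prohibition"),
    (r"\bonly (?:when|if)\b", "Conditional"),
    (r"\bat (?:least|most)\b", "Cardinality or range"),
    (r"\bbefore\b|\bafter\b", "Temporal sequence"),
    (r"\bunique\b|\bone per\b", "Uniqueness"),
]

def classify(sentence):
    """Return the constraint kinds suggested by indicator words in a sentence."""
    text = sentence.lower()
    return [label for pattern, label in INDICATORS if re.search(pattern, text)]


# Usage
print(classify("The quantity ordered must be at least 1"))
# ['Mandatory', 'Cardinality or range']
print(classify("End date must be after start date"))
# ['Mandatory', 'Temporal sequence']
```

Sentences with no indicator words still need review, since implicit constraints by definition carry none.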

Implicit Constraints Are Dangerous

Many constraints are assumed rather than stated. Users assume 'of course dates can't be in the future' or 'obviously you can't delete a customer with orders.' Make implicit constraints explicit by asking 'what if?' questions for every attribute and relationship.

Documenting Constraints Formally

Constraints must be documented in a structured format that enables clear communication, validation, and eventual implementation.

Constraint Catalog Structure

•Constraint ID: Unique identifier (e.g., C-001, BR-CUST-001)
•Name: Descriptive name (e.g., 'Customer Email Required')
•Category: Domain, Key, Referential, Semantic, Temporal, Cardinality
•Description: Plain English explanation of the rule
•Formal Expression: Logical notation, pseudo-code, or SQL
•Entities/Attributes: What data elements are constrained
•Enforcement Level: Database, Application, or Manual
•Violation Response: Reject, Warn, Log, Auto-correct
•Source: Who/what established this requirement
•Priority: Critical, High, Medium, Low (for implementation prioritization)
•Exceptions: When the rule doesn't apply
•Test Cases: How to verify the constraint works
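
If the catalog is kept in a machine-readable form, the structure above maps naturally onto a record type. The dataclass below is one possible sketch (Python 3.9+); the field names mirror the catalog elements, and the problems method is a hypothetical completeness check, not a standard.

```python
from dataclasses import dataclass, field

CATEGORIES = {"Domain", "Key", "Referential", "Semantic", "Temporal", "Cardinality"}

@dataclass
class ConstraintEntry:
    constraint_id: str
    name: str
    category: str
    description: str
    formal_expression: str
    entities: list[str]
    enforcement_level: str
    violation_response: str
    source: str
    priority: str
    exceptions: list[str] = field(default_factory=list)
    test_cases: list[str] = field(default_factory=list)

    def problems(self):
        """List gaps that would make this entry hard to trace or verify."""
        issues = []
        if self.category not in CATEGORIES:
            issues.append(f"unknown category: {self.category}")
        if not self.source.strip():
            issues.append("missing source")
        if not self.test_cases:
            issues.append("no test cases")
        return issues
```

Running problems over every entry before design sign-off gives a quick list of constraints that lack a source or a way to verify them.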

Example Constraint Documentation:

CONSTRAINT: BR-ORD-003

Name: Order Cannot Exceed Credit Limit

Category: Semantic Constraint

Description:
  The total value of a customer's open orders (not yet paid) 
  cannot exceed their assigned credit limit.

Formal Expression:
  FOR ALL Customer c:
    SUM(Order.Total WHERE Order.CustomerID = c.CustomerID 
                    AND Order.Status IN ('Pending', 'Confirmed', 'Shipped'))
    ≤ c.CreditLimit

Entities Involved:
  - Customer (CreditLimit attribute)
  - Order (Total, Status, CustomerID attributes)

Enforcement Level:
  - Database trigger on Order INSERT and UPDATE
  - Application pre-check before order submission

Violation Response:
  - Reject order creation
  - Return error message: "Order exceeds available credit. 
    Available credit: $X. Order total: $Y."

Exceptions:
  - Customers whose CreditLimit is NULL have no limit (premium customers)
  - Manager override with documented approval (log override)

Source:
  - Finance Policy FP-2023-04, Section 3.2
  - Interview: CFO, 2024-01-15

Priority: Critical

Test Cases:
  - Order within limit → Accept
  - Order exactly at limit → Accept
  - Order $0.01 over limit → Reject
  - Customer with no limit → Always accept
  - Manager override → Accept with logging
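
The formal expression and test cases above can be turned into an executable check. The sketch below covers the application pre-check only; the function name, the tuple layout of existing orders, and the 'Paid' status used in the usage example are assumptions. Decimal is used so that the "$0.01 over limit" case is exact.

```python
from decimal import Decimal

OPEN_STATUSES = {"Pending", "Confirmed", "Shipped"}

def order_allowed(credit_limit, existing_orders, new_total, manager_override=False):
    """Apply BR-ORD-003 to a proposed order.

    credit_limit: Decimal, or None for customers with no limit.
    existing_orders: iterable of (total, status) pairs for the customer.
    """
    if credit_limit is None or manager_override:
        return True  # exceptions listed in the constraint documentation
    open_total = sum(
        (total for total, status in existing_orders if status in OPEN_STATUSES),
        Decimal("0"),
    )
    return open_total + new_total <= credit_limit


# Usage: $1000 limit, $400 already open, one paid order that no longer counts.
limit = Decimal("1000")
orders = [(Decimal("400"), "Pending"), (Decimal("300"), "Paid")]
print(order_allowed(limit, orders, Decimal("600")))     # True: exactly at limit
print(order_allowed(limit, orders, Decimal("600.01")))  # False: $0.01 over
```

A manager override should still be logged by the caller, as the exception clause requires; this function only decides whether the order may proceed.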

Constraint Traceability

Every constraint should trace back to a source (regulation, policy, interview) and forward to implementation (database constraint, trigger, application code). This traceability ensures constraints aren't lost during design and enables impact analysis when business rules change.

Summary: Constraints Complete the Requirements Picture

Constraint identification completes the requirements gathering phase. With data requirements, functional requirements, and constraints, you have a comprehensive specification of what the database must store, what it must do, and what rules it must enforce.

Key Takeaways

•Constraints ensure data integrity — They prevent invalid data from entering the database and maintain consistency.
•Domain constraints define valid values — Data types, ranges, enumerations, patterns, and nullability.
•Key constraints ensure uniqueness — Primary keys, unique constraints, and complex uniqueness rules.
•Referential integrity maintains relationships — Foreign keys with appropriate cascade/restrict actions.
•Semantic constraints encode business logic — Cross-attribute, cross-entity, and aggregate rules.
•Temporal constraints govern time — Sequences, durations, overlaps, and validity periods.
•Cardinality constraints limit participation — Minimum and maximum counts in relationships.
•Systematic extraction finds all constraints — From interviews, documents, forms, policies, and regulations.
•Formal documentation enables implementation — Structured constraint catalogs with all relevant details.

Module Conclusion:

With this page, you have completed the Requirements Gathering module. You now understand how to:

Conduct user interviews — Engaging stakeholders to capture tacit knowledge and business context
Analyze documents — Extracting requirements from forms, reports, policies, and existing systems
Specify data requirements — Defining entities, attributes, domains, and relationships
Specify functional requirements — Documenting operations, queries, reports, and integrations
Identify constraints — Capturing business rules, validation requirements, and integrity constraints

These skills form the foundation for all subsequent database design activities. A well-gathered set of requirements prevents costly errors, reduces rework, and ensures the final database serves its intended purpose.

Module Complete

Congratulations! You have completed the Requirements Gathering module. You now possess the skills to systematically gather, document, and validate database requirements. The next modules in the Database Design Process will show you how to transform these requirements into conceptual, logical, and physical database designs.