Constraints In Ddl - Learning Module

Loading content...

0/252

NOT NULL Constraint

The Simplest Yet Most Profound Constraint

Among all SQL constraints, NOT NULL appears the simplest: it merely ensures a column cannot contain NULL values. Yet this deceptively simple constraint addresses one of the most philosophically complex aspects of database design—the meaning and handling of missing, unknown, or inapplicable information.

NULL is not a value in the traditional sense. It represents the absence of a value—a distinction that propagates through every query, comparison, and aggregation in subtle ways that trip up even experienced developers.

Understanding when to require values (NOT NULL) versus when to permit their absence (nullable) is fundamental to creating schemas that accurately model reality and avoid the many pitfalls of NULL-handling in SQL.

What You Will Master

By the end of this page, you will deeply understand what NULL represents, why NOT NULL constraints prevent entire categories of bugs, how SQL's three-valued logic works, and how to make thoughtful decisions about which columns should permit NULL values and which should not.

Understanding NULL

Before understanding NOT NULL, we must understand NULL itself. NULL is one of the most misunderstood concepts in database systems.

What NULL Represents:

NULL in SQL can represent several situations:

Unknown: The value exists but we don't know it (e.g., a customer's birth date we haven't collected)
Inapplicable: The value doesn't make sense in context (e.g., spouse name for single persons)
Missing: The value should exist but wasn't provided (e.g., optional fields left blank)
Not Yet Defined: The value will be determined later (e.g., end date for ongoing events)

The ambiguity of NULL—it conflates all these meanings into one marker—is both a feature and a source of complexity.

NULL Is Not a Value:

This is crucial: NULL is not equal to zero, not equal to an empty string, and not even equal to itself. NULL is a marker indicating the absence of a value.

null_behavior.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- NULL is not equal to anything, including itself
 
SELECT NULL = NULL;        -- Returns NULL, not TRUE!
SELECT NULL <> NULL;       -- Returns NULL, not TRUE!
SELECT NULL = 0;           -- Returns NULL
SELECT NULL = '';          -- Returns NULL
 
-- The only way to test for NULL
SELECT * FROM employees WHERE middle_name IS NULL;      -- Correct
SELECT * FROM employees WHERE middle_name = NULL;       -- WRONG: Never returns rows!
 
-- IS NOT NULL to test for presence of value
SELECT * FROM employees WHERE phone_number IS NOT NULL;
 
 
-- NULL in comparisons
SELECT * FROM products WHERE price > 100;
-- If price is NULL for some rows, those rows are NOT included
-- NULL is neither > 100 nor <= 100
 
-- NULL in expressions propagates
SELECT 5 + NULL;           -- Returns NULL
SELECT CONCAT('Hello ', NULL);  -- Returns NULL (in standard SQL; varies by DBMS)
SELECT NULL AND TRUE;      -- Returns NULL
SELECT NULL OR TRUE;       -- Returns TRUE (special case)

The Cardinal Sin: Using = NULL

WHERE column = NULL never matches any rows, because NULL = NULL evaluates to NULL (which is not TRUE). This is one of the most common SQL bugs. Always use IS NULL and IS NOT NULL for NULL testing.

Three-Valued Logic:

Because of NULL, SQL uses three-valued logic instead of traditional Boolean logic. Every Boolean expression evaluates to one of: TRUE, FALSE, or UNKNOWN (NULL).

A	B	A AND B	A OR B	NOT A
TRUE	TRUE	TRUE	TRUE	FALSE
TRUE	FALSE	FALSE	TRUE	FALSE
TRUE	NULL	NULL	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE
FALSE	NULL	FALSE	NULL	TRUE
NULL	NULL	NULL	NULL	NULL

WHERE clauses include only rows where the condition evaluates to TRUE—not FALSE, and not NULL. This is why NULLs are often silently excluded from query results.

The NOT NULL Constraint

The NOT NULL constraint is straightforward: it prohibits NULL values in a column. Any INSERT or UPDATE that would set the column to NULL is rejected with an error.

Syntax:

NOT NULL is specified as part of the column definition. Unlike PRIMARY KEY, UNIQUE, FOREIGN KEY, and CHECK constraints, NOT NULL cannot be specified at the table level—it's always inline with the column.

not_null_syntax.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- NOT NULL in column definition
CREATE TABLE customers (
    customer_id     INT NOT NULL,
    first_name      VARCHAR(50) NOT NULL,      -- Required
    last_name       VARCHAR(50) NOT NULL,      -- Required
    email           VARCHAR(100) NOT NULL,     -- Required
    phone_number    VARCHAR(20),               -- Optional (NULL allowed)
    middle_name     VARCHAR(50),               -- Optional (NULL allowed)
    date_of_birth   DATE,                      -- Optional (NULL allowed)
    
    CONSTRAINT pk_customers PRIMARY KEY (customer_id)
);
 
-- This insert succeeds
INSERT INTO customers (customer_id, first_name, last_name, email)
VALUES (1, 'Alice', 'Smith', 'alice@example.com');
 
-- This insert fails: first_name cannot be NULL
INSERT INTO customers (customer_id, first_name, last_name, email)
VALUES (2, NULL, 'Jones', 'bob@example.com');
-- Error: null value in column "first_name" violates not-null constraint
 
-- This insert fails: email is required
INSERT INTO customers (customer_id, first_name, last_name)
VALUES (3, 'Carol', 'Williams');
-- Error: null value in column "email" violates not-null constraint
 
 
-- Update that violates NOT NULL also fails
UPDATE customers SET last_name = NULL WHERE customer_id = 1;
-- Error: null value in column "last_name" violates not-null constraint

NOT NULL and DEFAULT Values:

NOT NULL columns can have DEFAULT values, which are used when an INSERT doesn't specify a value. This combination ensures a column is never NULL while providing a sensible fallback.

not_null_defaults.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- NOT NULL with DEFAULT provides guaranteed values
CREATE TABLE orders (
    order_id        SERIAL PRIMARY KEY,
    customer_id     INT NOT NULL,
    order_date      TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    status          VARCHAR(20) NOT NULL DEFAULT 'pending',
    is_expedited    BOOLEAN NOT NULL DEFAULT FALSE,
    total_amount    DECIMAL(12,2) NOT NULL DEFAULT 0.00,
    
    CONSTRAINT fk_orders_customer 
        FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
 
-- Insert without specifying columns that have defaults
INSERT INTO orders (customer_id) VALUES (1);
-- Result: order_date = now, status = 'pending', is_expedited = FALSE, total_amount = 0.00
 
-- Explicit values override defaults
INSERT INTO orders (customer_id, status, total_amount)
VALUES (1, 'processing', 150.00);
 
 
-- Common anti-pattern: Using '' or 0 instead of NULL for optional fields
-- DON'T DO THIS:
CREATE TABLE bad_customers (
    customer_id     INT PRIMARY KEY,
    email           VARCHAR(100) NOT NULL,
    phone_number    VARCHAR(20) NOT NULL DEFAULT ''  -- Empty string is not the same as "unknown"
);
 
-- Better: Allow NULL for genuinely optional fields
CREATE TABLE better_customers (
    customer_id     INT PRIMARY KEY,
    email           VARCHAR(100) NOT NULL,
    phone_number    VARCHAR(20)  -- NULL means "unknown/not provided"
);

NULL vs Empty String

NULL means 'no value / unknown'. Empty string ('') means 'value is present, and it's an empty string'. These are different concepts. A phone number of NULL means 'we don't have it'. An empty phone number means 'the field was filled with nothing' (usually an application bug). Choose which semantic you need.

Adding and Removing NOT NULL Constraints

Modifying NOT NULL constraints on existing tables requires careful handling of existing data.

Adding NOT NULL to Existing Column:

alter_not_null.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
-- Adding NOT NULL to an existing column
 
-- Step 1: Find rows with NULL values
SELECT * FROM customers WHERE phone_number IS NULL;
 
-- Step 2: Decide how to handle existing NULLs:
 
-- Option A: Update NULLs to a default value
UPDATE customers 
SET phone_number = 'NOT PROVIDED' 
WHERE phone_number IS NULL;
 
-- Option B: Update NULLs to a computed/fetched value
UPDATE customers c
SET phone_number = (
    SELECT phone FROM contact_import WHERE email = c.email
)
WHERE phone_number IS NULL;
 
-- Option C: Delete rows with NULLs (if appropriate)
DELETE FROM customers WHERE phone_number IS NULL;
 
-- Step 3: Add the NOT NULL constraint
-- PostgreSQL syntax:
ALTER TABLE customers 
ALTER COLUMN phone_number SET NOT NULL;
 
-- SQL Server syntax:
ALTER TABLE customers 
ALTER COLUMN phone_number VARCHAR(20) NOT NULL;
 
-- MySQL syntax:
ALTER TABLE customers 
MODIFY COLUMN phone_number VARCHAR(20) NOT NULL;
 
-- Oracle syntax:
ALTER TABLE customers 
MODIFY phone_number NOT NULL;
 
 
-- If NULLs still exist, ALTER fails:
-- Error: column "phone_number" contains null values
 
 
-- Removing NOT NULL (making column nullable)
-- PostgreSQL:
ALTER TABLE customers ALTER COLUMN phone_number DROP NOT NULL;
 
-- SQL Server:
ALTER TABLE customers ALTER COLUMN phone_number VARCHAR(20) NULL;
 
-- MySQL:
ALTER TABLE customers MODIFY COLUMN phone_number VARCHAR(20) NULL;
 
-- Oracle:
ALTER TABLE customers MODIFY phone_number NULL;

Adding NOT NULL Is a Data Migration

Adding NOT NULL to a column with existing NULLs will fail. You must first update or delete the NULL values. For large tables, this update can be expensive—plan for maintenance windows and test on staging first.

PostgreSQL: Adding NOT NULL with Concurrent Operations:

PostgreSQL 11+ supports a pattern that's less disruptive for production:

pg_not_null_concurrency.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- PostgreSQL: Adding NOT NULL constraint more safely
 
-- Step 1: Add a CHECK constraint as NOT VALID (no existing row scan)
ALTER TABLE customers 
ADD CONSTRAINT check_phone_not_null 
CHECK (phone_number IS NOT NULL) NOT VALID;
 
-- Step 2: Validate the constraint (scans table, but allows other operations)
ALTER TABLE customers 
VALIDATE CONSTRAINT check_phone_not_null;
 
-- Step 3: Optionally, convert to actual NOT NULL 
-- (PostgreSQL 12+ recognizes equivalent CHECK as NOT NULL)
ALTER TABLE customers 
ALTER COLUMN phone_number SET NOT NULL;
 
-- Step 4: Drop the redundant CHECK constraint
ALTER TABLE customers 
DROP CONSTRAINT check_phone_not_null;
 
 
-- Why this matters:
-- Plain ALTER ... SET NOT NULL locks the table and scans all rows
-- The NOT VALID / VALIDATE pattern allows concurrent reads/writes
-- Better for large production tables

Designing for Nullability

Deciding which columns should be nullable versus NOT NULL is a key design decision. Here's a framework for thinking about it:

The Fundamental Question:

For each column, ask: "Is it ever valid for this row to exist without a value here?"

If never: Use NOT NULL
If sometimes: Allow NULL
If it depends: Consider table normalization or CHECK constraints

Guidelines for NOT NULL:

When to Use NOT NULL

•Primary Keys: Always NOT NULL (implicit in most databases)
•Foreign Keys for Mandatory Relationships: If the relationship must exist (order must have customer), the FK should be NOT NULL
•Essential Entity Attributes: Core identifying/descriptive attributes (product name, user email, order date)
•Status/Type Columns: Every row should have a defined state
•Audit Columns: created_at, created_by should always be populated
•Computed/Derived Values with Defaults: If a sensible default exists, use NOT NULL with DEFAULT

When to Allow NULL

•Optional Information: Middle name, secondary phone, secondary email
•Inapplicable Attributes: Spouse name (for single people), retirement date (for current employees)
•Optional Relationships: Employee may or may not have a manager
•Future/Pending Values: End date for ongoing subscriptions, resolution for open tickets
•External Data Not Always Available: Third-party IDs, optional API responses

nullability_design.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
-- Well-designed nullability
 
CREATE TABLE employees (
    employee_id     SERIAL PRIMARY KEY,                        -- PK: implicit NOT NULL
    
    -- Essential identifying information: NOT NULL
    first_name      VARCHAR(50) NOT NULL,
    last_name       VARCHAR(50) NOT NULL,
    email           VARCHAR(100) NOT NULL,
    hire_date       DATE NOT NULL,
    department_id   INT NOT NULL,                              -- Mandatory relationship
    
    -- Optional personal information: NULL allowed
    middle_name     VARCHAR(50),                               -- Some people don't have one
    personal_phone  VARCHAR(20),                               -- Not required
    preferred_name  VARCHAR(50),                               -- Optional nickname
    
    -- Optional organizational info: NULL allowed
    manager_id      INT,                                       -- CEO has no manager
    desk_location   VARCHAR(50),                               -- Remote workers have none
    
    -- Status with sensible default: NOT NULL
    is_active       BOOLEAN NOT NULL DEFAULT TRUE,
    
    -- Future/pending values: NULL allowed
    termination_date DATE,                                     -- NULL while employed
    
    -- Audit info: NOT NULL with defaults
    created_at      TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at      TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    
    CONSTRAINT fk_employees_dept 
        FOREIGN KEY (department_id) REFERENCES departments(department_id),
    CONSTRAINT fk_employees_manager 
        FOREIGN KEY (manager_id) REFERENCES employees(employee_id)
);
 
 
-- Table for modeling "pending" values explicitly (alternative to NULL)
CREATE TABLE employee_future_assignments (
    assignment_id   SERIAL PRIMARY KEY,
    employee_id     INT NOT NULL REFERENCES employees(employee_id),
    new_dept_id     INT NOT NULL REFERENCES departments(department_id),
    effective_date  DATE NOT NULL,
    status          VARCHAR(20) NOT NULL DEFAULT 'pending'
);
-- Avoids NULL in main table; pending states tracked explicitly

The Conservative Default

Default to NOT NULL and only allow NULL when you have a clear reason. It's easier to relax a NOT NULL constraint later than to add one after NULLs have polluted your data. Strict constraints catch application bugs early.

NULL Handling in Queries

When columns can contain NULL, queries must handle them carefully to avoid unexpected results.

NULL in Aggregations:

null_in_aggregations.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- NULLs in aggregate functions
 
-- Sample data
CREATE TABLE sales (
    sale_id     INT PRIMARY KEY,
    amount      DECIMAL(10,2),
    discount    DECIMAL(10,2)
);
 
INSERT INTO sales VALUES 
    (1, 100.00, 10.00),
    (2, 200.00, NULL),   -- No discount
    (3, 150.00, 20.00),
    (4, NULL, 5.00);     -- Amount unknown
 
-- COUNT behavior
SELECT COUNT(*) FROM sales;           -- Returns 4 (counts all rows)
SELECT COUNT(amount) FROM sales;      -- Returns 3 (excludes NULL amounts)
SELECT COUNT(discount) FROM sales;    -- Returns 3 (excludes NULL discounts)
 
-- SUM/AVG ignore NULLs
SELECT SUM(amount) FROM sales;        -- Returns 450.00 (NULLs ignored)
SELECT AVG(amount) FROM sales;        -- Returns 150.00 (450/3, not 450/4)
 
-- MIN/MAX ignore NULLs
SELECT MIN(discount), MAX(discount) FROM sales;  -- 5.00 and 20.00
 
-- NULL propagates in expressions
SELECT sale_id, amount - discount AS net_amount
FROM sales;
-- Row 2: 200 - NULL = NULL (not 200!)
-- Row 4: NULL - 5 = NULL
 
-- Use COALESCE to provide default for NULLs
SELECT sale_id, 
       COALESCE(amount, 0) - COALESCE(discount, 0) AS net_amount
FROM sales;
-- Row 2: 200 - 0 = 200
-- Row 4: 0 - 5 = -5

COALESCE, NULLIF, and CASE:

SQL provides functions to handle NULL values in expressions:

null_handling_functions.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
-- COALESCE: Return first non-NULL value
 
-- Basic usage: provide default
SELECT COALESCE(phone_number, 'No phone on file') FROM customers;
 
-- Multiple fallbacks
SELECT COALESCE(mobile_phone, work_phone, home_phone, 'No phone') 
FROM contacts;
 
-- In calculations
SELECT order_id, 
       subtotal + COALESCE(tax, 0) + COALESCE(shipping, 0) AS total
FROM orders;
 
 
-- NULLIF: Return NULL if two values are equal
-- Useful for converting sentinel values to NULL
 
-- Convert empty strings to NULL
SELECT NULLIF(middle_name, '') FROM customers;
-- If middle_name = '', returns NULL; otherwise returns middle_name
 
-- Avoid division by zero (return NULL instead)
SELECT total_sales / NULLIF(num_transactions, 0) AS avg_sale
FROM daily_reports;
-- If num_transactions = 0, NULLIF returns NULL, making division result NULL
 
 
-- CASE for explicit NULL handling
SELECT 
    customer_name,
    CASE 
        WHEN phone IS NULL THEN 'No phone provided'
        WHEN phone = '' THEN 'Empty phone (data issue)'
        ELSE phone
    END AS phone_display
FROM customers;
 
 
-- NVL (Oracle) / ISNULL (SQL Server) - database-specific alternatives
-- Oracle:
SELECT NVL(phone, 'Unknown') FROM customers;
 
-- SQL Server:
SELECT ISNULL(phone, 'Unknown') FROM customers;
 
-- PostgreSQL-specific: Treating NULL as empty array/object in JSONB
SELECT COALESCE(preferences, '{}') FROM users;

NULL in WHERE Clauses and NOT IN:

null_where_gotchas.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
-- The NOT IN trap with NULL values
 
CREATE TABLE assignments (
    id      INT PRIMARY KEY,
    user_id INT  -- NULL means unassigned
);
 
INSERT INTO assignments VALUES (1, 100), (2, 200), (3, NULL);
 
-- Find users NOT in assignments table
CREATE TABLE all_users AS 
SELECT * FROM (VALUES (100), (200), (300)) AS t(user_id);
 
-- WRONG: This returns nothing if any user_id is NULL!
SELECT * FROM all_users 
WHERE user_id NOT IN (SELECT user_id FROM assignments);
-- Returns: NOTHING (even though 300 should match)
-- Why: NOT IN (100, 200, NULL) = (NOT 100) AND (NOT 200) AND (NOT NULL)
--      = TRUE AND TRUE AND NULL = NULL (not TRUE, so not included)
 
-- CORRECT: Exclude NULLs from subquery
SELECT * FROM all_users 
WHERE user_id NOT IN (
    SELECT user_id FROM assignments WHERE user_id IS NOT NULL
);
-- Returns: 300
 
-- BETTER: Use NOT EXISTS (handles NULL safely)
SELECT * FROM all_users u
WHERE NOT EXISTS (
    SELECT 1 FROM assignments a WHERE a.user_id = u.user_id
);
-- Returns: 300 (NULL in assignments doesn't cause issues)
 
-- ALSO GOOD: LEFT JOIN with NULL check
SELECT u.*
FROM all_users u
LEFT JOIN assignments a ON u.user_id = a.user_id
WHERE a.user_id IS NULL;
-- Returns: 300

NOT IN + NULL = Silent Failure

If a subquery in NOT IN can contain NULL, the entire NOT IN predicate returns NULL (no rows). This is one of the most insidious SQL bugs. Prefer NOT EXISTS or LEFT JOIN WHERE IS NULL patterns that handle NULL safely.

NOT NULL and Index Efficiency

NULL values affect index storage, scan efficiency, and query optimization. Understanding these effects helps make informed nullability decisions.

NULL Storage in Indexes:

Different databases handle NULL in indexes differently:

Database	NULL Values in Index	Notes
PostgreSQL	Included (since 8.3)	Partial indexes can exclude NULLs
MySQL/InnoDB	Included	Single NULL allowed per UNIQUE index
SQL Server	Included	Filtered indexes can exclude NULLs
Oracle	NOT Included (B-tree)	Major implication: cannot index 'IS NULL'

Oracle's Special NULL Handling:

Oracle does NOT store NULL values in regular B-tree indexes. This has significant implications:

oracle_null_index.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- Oracle: Index on nullable column
 
CREATE TABLE orders (
    order_id        NUMBER PRIMARY KEY,
    customer_id     NUMBER NOT NULL,
    fulfilled_date  DATE       -- NULL until order is fulfilled
);
 
CREATE INDEX idx_orders_fulfilled ON orders(fulfilled_date);
 
-- This query CANNOT use the index (searching for something not in index)
SELECT * FROM orders WHERE fulfilled_date IS NULL;
-- Result: Full table scan (slow for large tables)
 
-- This query CAN use the index
SELECT * FROM orders WHERE fulfilled_date = DATE '2024-01-15';
-- Result: Index range scan (fast)
 
 
-- Workaround 1: Include constant to force NULL rows into index
CREATE INDEX idx_orders_fulfilled_fix ON orders(fulfilled_date, 0);
-- Now NULLs are stored (with the 0 constant)
 
SELECT * FROM orders WHERE fulfilled_date IS NULL;
-- Can use index now (index on fulfilled_date, 0)
 
 
-- Workaround 2: Function-based index with NVL
CREATE INDEX idx_orders_fulfilled_nvl ON orders(NVL(fulfilled_date, DATE '1900-01-01'));
 
SELECT * FROM orders WHERE NVL(fulfilled_date, DATE '1900-01-01') = DATE '1900-01-01';
-- Must use the same NVL expression in queries
 
 
-- PostgreSQL: Partial index excluding NULLs (more efficient)
CREATE INDEX idx_orders_fulfilled_partial 
ON orders(fulfilled_date) 
WHERE fulfilled_date IS NOT NULL;
-- Smaller index; only stores rows where value exists

Query Optimizer Considerations:

NOT NULL constraints provide information to the query optimizer:

Simpler Predicates: Optimizer knows it doesn't need to handle NULL cases
Join Optimization: NOT NULL FKs guarantee parent existence, enabling certain optimizations
Index Selection: Statistics on non-nullable columns are more predictable
Aggregation Planning: No need for NULL-handling in aggregates

not_null_optimizer.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- NOT NULL helps optimizer make better decisions
 
-- With NOT NULL, optimizer knows:
-- - COUNT(column) = COUNT(*) (no NULLs to skip)
-- - Joins won't produce NULL-extended rows
-- - IN predicates work correctly
 
-- Example: Table with vs without NOT NULL
CREATE TABLE orders_strict (
    order_id        INT PRIMARY KEY,
    customer_id     INT NOT NULL,  -- Always has customer
    order_date      DATE NOT NULL  -- Always has date
);
 
CREATE TABLE orders_loose (
    order_id        INT PRIMARY KEY,
    customer_id     INT,           -- May be NULL
    order_date      DATE           -- May be NULL
);
 
-- Query on strict table: optimizer knows all rows have values
SELECT COUNT(customer_id) FROM orders_strict;
-- Optimizer can use COUNT(*) equivalent, potentially faster
 
-- Query on loose table: optimizer must check for NULLs
SELECT COUNT(customer_id) FROM orders_loose;
-- Optimizer must skip NULL customer_id values
 
 
-- Foreign key implication
-- If FK column is NOT NULL, outer join is equivalent to inner join
SELECT o.*, c.* 
FROM orders_strict o 
LEFT JOIN customers c ON o.customer_id = c.customer_id;
-- Since o.customer_id is NEVER NULL and references customers,
-- this is effectively an INNER JOIN (no unmatched rows from orders)

NOT NULL = Optimizer Hints

Think of NOT NULL constraints as information for the query optimizer, not just data validation. Stricter schemas produce more predictable performance because the optimizer has more information to work with. This is another reason to default to NOT NULL.

NOT NULL in Application Design

Database NOT NULL constraints interact with application code in important ways. Well-designed nullability at the database level simplifies application logic.

Benefits of NOT NULL at Application Level:

How NOT NULL Simplifies Code

•Eliminates Null Checks: Code doesn't need if (user.email != null) — the database guarantees it exists
•Prevents NPE/NullReferenceException: Fields mapped from NOT NULL columns are always safe to use
•Simplifies Type Mappings: ORM can use non-nullable types (String vs String?)
•Enables Compiler Safety: Languages with nullable types (Kotlin, TypeScript strict) can trust the guarantee
•Catches Bugs Early: Missing required data fails at INSERT, not in business logic later

application_nullability.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// TypeScript: Database nullability reflected in types
 
// Schema with clear nullability
interface User {
    id: number;              // NOT NULL → number
    email: string;           // NOT NULL → string
    username: string;        // NOT NULL → string
    middleName: string | null;  // NULL allowed → string | null
    phone: string | null;       // NULL allowed → string | null
}
 
// Application code benefits from NOT NULL guarantees
function sendWelcomeEmail(user: User) {
    // No null check needed for email — database guarantees it exists
    sendEmail(user.email, "Welcome!");  // Safe: user.email is always string
    
    // Optional fields still need handling
    if (user.phone) {
        sendSMS(user.phone, "Welcome!");
    }
}
 
// Kotlin example: Nullable vs non-nullable types
// fun processUser(user: User) {
//     user.email.uppercase()  // Safe: email is String, not String?
//     user.phone?.uppercase() // Must handle null: phone is String?
// }
 
 
// Java with @Nullable/@NotNull annotations from JPA/Hibernate
// @Column(nullable = false)  // Generates NOT NULL constraint
// private String email;      // Can be marked @NonNull
 
// @Column(nullable = true)
// private String phone;      // Should be marked @Nullable

Handling Required Fields in APIs:

When database columns are NOT NULL, your API validation should match:

api_validation.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// API request validation matching database constraints
 
import { z } from 'zod';
 
// User creation schema matches database NOT NULL constraints
const createUserSchema = z.object({
    email: z.string().email(),         // Required: matches NOT NULL
    username: z.string().min(3),       // Required: matches NOT NULL
    password: z.string().min(8),       // Required: matches NOT NULL
    middleName: z.string().optional(), // Optional: matches nullable column
    phone: z.string().optional(),      // Optional: matches nullable column
});
 
// Update schema: required fields can't be set to null
const updateUserSchema = z.object({
    email: z.string().email().optional(),
    username: z.string().min(3).optional(),
    // NOT .nullable() — can omit field but can't set to null
    
    middleName: z.string().nullable().optional(),
    // .nullable() — can explicitly set to null
});
 
// API endpoint
async function createUser(req: Request) {
    const data = createUserSchema.parse(req.body);
    
    // If validation passes, we're guaranteed:
    // - email, username, password are present and valid
    // - middleName, phone may be undefined
    
    await db.users.create({
        data: {
            email: data.email,
            username: data.username,
            password: hashPassword(data.password),
            middleName: data.middleName ?? null,  // undefined → null
            phone: data.phone ?? null,
        }
    });
}

Database as Final Validator

Application validation is user-friendly (early, descriptive errors), but database constraints are authoritative. Even with perfect application validation, bugs can occur. NOT NULL constraints ensure data integrity regardless of application behavior.

Summary: NOT NULL Mastery

We've explored NOT NULL constraints comprehensively—from the philosophy of NULL through practical applications to optimizer effects. Let's consolidate the essential knowledge:

Key Takeaways

•NULL Is Not a Value: NULL represents absence/unknown, not zero or empty string. NULL ≠ NULL.
•Three-Valued Logic: SQL expressions evaluate to TRUE, FALSE, or NULL. WHERE clauses include only TRUE rows.
•Default to NOT NULL: Start strict; relax only when genuinely optional. Catches bugs early, simplifies code.
•Use IS NULL/IS NOT NULL: Never use = NULL or <> NULL for NULL testing.
•NOT NULL + DEFAULT = Guaranteed Value: Combine for columns that should never be NULL and have sensible defaults.
•NOT IN + NULL = Trap: If subquery contains NULL, entire NOT IN returns NULL. Use NOT EXISTS instead.
•Optimizer Benefits: NOT NULL provides information for better query plans. Stricter schemas = more predictable performance.
•Application Alignment: NOT NULL in database should match non-nullable types in application code.

What's Next:

With NOT NULL understood, we move to the most flexible constraint: CHECK. The next page covers how to define arbitrary conditions that column values must satisfy, enabling sophisticated business rule enforcement directly in the database schema.

Page Complete

You now have a comprehensive understanding of NOT NULL constraints and the deeper concept of NULL in databases. This knowledge enables you to make thoughtful decisions about nullability, avoid common NULL-related bugs, and create schemas that accurately model which data is mandatory versus optional.