Data Manipulation Language - Learning Module

DELETE Statement

The Point of No Return

The DELETE statement removes data from existence. Unlike UPDATE, which leaves a modified record behind, DELETE makes rows vanish, and in many cases it takes related data with them through cascading relationships.

This permanence makes DELETE the most consequential DML operation. A single DELETE statement executed without proper conditions can erase millions of records in seconds, potentially destroying years of accumulated data. Yet DELETE is essential—databases grow without bound if data is never removed, and many business processes require data removal (user account deletion, expired record cleanup, GDPR compliance).

This page teaches you to wield DELETE with surgical precision, understanding its mechanics, safeguards, and alternatives.

What You Will Learn

By the end of this page, you will master DELETE syntax and row identification, understand referential integrity and cascading deletes, implement soft delete patterns as an alternative, use DELETE with subqueries and joins, and apply rigorous safety practices for production deletions.

DELETE Fundamentals

The DELETE statement removes rows from a table based on a condition. Its simplicity belies its power—and its danger.

The Basic DELETE Syntax:

delete_basic_syntax.sql
-- Complete DELETE syntax pattern
DELETE FROM table_name
WHERE condition;
 
-- Key components:
-- 1. Target table: The table from which to remove rows
-- 2. WHERE clause: Predicate identifying rows to delete
--    (CRITICAL: Omitting WHERE deletes ALL rows!)
 
-- Example: Delete a specific order
DELETE FROM orders
WHERE order_id = 5001;
 
-- This removes exactly the row where order_id = 5001
 
-- Delete multiple rows matching a condition
DELETE FROM order_items
WHERE order_id = 5001;
 
-- Removes all line items for order 5001

DELETE Without WHERE = Data Destruction

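DELETE FROM table_name; (without WHERE) removes EVERY row in the table. The table structure remains, but all data is gone. The database asks for no confirmation, and once the statement is committed (which happens immediately in autocommit mode) there is no undo button: recovery means restoring from a backup. This is one of the most destructive single statements in SQL.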

DELETE vs TRUNCATE:

SQL provides two ways to remove all rows from a table. Understanding the difference is crucial:

DELETE vs TRUNCATE Comparison
Aspect	DELETE FROM table	TRUNCATE TABLE table
Row Removal	One row at a time	All rows at once
WHERE Clause	Supported (selective)	Not supported (all rows)
Transaction Log	Logs each row deletion	Minimal logging (page deallocation)
Performance	Slower for large tables	Very fast, even for huge tables
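Identity/Auto-Increment	Does not reset	Resets in MySQL and SQL Server; PostgreSQL only with RESTART IDENTITY
Triggers	Fires DELETE triggers	Does not fire DELETE triggers
Rollback	Can be rolled back	Can be rolled back in PostgreSQL and SQL Server; implicit commit in MySQL and Oracle
Foreign Keys	Respects constraints	Usually rejected if the table is referenced by a foreign key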
Use Case	Selective deletion	Clear entire table quickly

delete_vs_truncate.sql
-- DELETE all rows (slow but fully logged, triggers fire)
DELETE FROM log_entries;
-- Can take minutes for millions of rows
-- Each row deletion is written to the transaction log
-- DELETE triggers fire (per row or per statement, depending on the database)

-- TRUNCATE all rows (fast, minimal logging)
TRUNCATE TABLE log_entries;
-- Typically fast even for very large tables
-- Deallocates data pages instead of deleting row by row
-- No DELETE triggers fire
-- Auto-increment resets in MySQL and SQL Server;
-- PostgreSQL resets sequences only with TRUNCATE ... RESTART IDENTITY

-- Common pattern: Use TRUNCATE for test data cleanup
-- Reset tables to clean state before running tests
TRUNCATE TABLE test_orders;
TRUNCATE TABLE test_customers;

-- Note: TRUNCATE is normally rejected if the table is referenced by a foreign key,
-- even when no child rows exist. SQL Server requires dropping the FK first;
-- PostgreSQL accepts TRUNCATE ... CASCADE or truncating both tables together.
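
Because TRUNCATE behaves differently across systems, it helps to see one dialect concretely. The following is a minimal PostgreSQL-only sketch that reuses the example tables above; it assumes test_orders has a foreign key to test_customers, which the original listing does not state.

```sql
-- PostgreSQL: TRUNCATE is transactional, so it can be rolled back
BEGIN;
TRUNCATE TABLE log_entries RESTART IDENTITY;  -- also restarts sequences owned by the table
SELECT COUNT(*) FROM log_entries;             -- 0 inside the transaction
ROLLBACK;                                     -- rows and sequence values are restored

-- PostgreSQL: truncating a table referenced by a foreign key
-- Option 1: name parent and child in the same statement
TRUNCATE TABLE test_customers, test_orders;

-- Option 2: let PostgreSQL truncate every referencing table as well
-- (check which tables reference the parent before relying on this)
-- TRUNCATE TABLE test_customers CASCADE;
```

In MySQL and Oracle the same TRUNCATE commits implicitly, so the ROLLBACK above would not bring the rows back.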

Conditional Deletion

Effective DELETE statements use precise WHERE clauses to target exactly the rows that should be removed. The same conditions available in SELECT work in DELETE.

Simple Conditions:

delete_conditions.sql
-- Delete by primary key (most precise)
DELETE FROM customers
WHERE customer_id = 42;
 
-- Delete by equality condition
DELETE FROM products
WHERE discontinued = TRUE;
 
-- Delete by comparison
DELETE FROM sessions
WHERE last_activity < CURRENT_TIMESTAMP - INTERVAL '24 hours';
 
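-- Delete by range (BETWEEN includes both endpoints)
-- If log_date is a timestamp, '2023-12-31' means midnight, so use
-- log_date >= '2023-01-01' AND log_date < '2024-01-01' instead
DELETE FROM audit_logs
WHERE log_date BETWEEN '2023-01-01' AND '2023-12-31';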
 
-- Delete by pattern matching (LIKE)
DELETE FROM users
WHERE email LIKE '%@tempmail.%';
 
-- Delete by list membership (IN)
DELETE FROM orders
WHERE status IN ('cancelled', 'refunded', 'failed');
 
-- Delete by NULL check
DELETE FROM products
WHERE category_id IS NULL;

Compound Conditions:

delete_compound_conditions.sql
-- Delete with AND (all conditions must be true)
DELETE FROM orders
WHERE status = 'pending'
  AND order_date < CURRENT_DATE - INTERVAL '30 days'
  AND total_amount = 0;
 
-- Delete with OR (any condition sufficient)
DELETE FROM notifications
WHERE is_read = TRUE
   OR created_at < CURRENT_DATE - INTERVAL '90 days';
 
-- Delete with complex logic
DELETE FROM sessions
WHERE (
    -- Expired sessions
    (expires_at < CURRENT_TIMESTAMP)
    OR
    -- Inactive anonymous sessions older than 1 hour
    (user_id IS NULL AND last_activity < CURRENT_TIMESTAMP - INTERVAL '1 hour')
    OR
    -- Flagged as suspicious
    (is_suspicious = TRUE AND created_at < CURRENT_DATE - INTERVAL '1 day')
);
 
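-- Delete with NOT
-- Caution: if the subquery returns any NULL, NOT IN matches no rows at all.
-- Rows whose own category_id is NULL are never matched either.
DELETE FROM products
WHERE category_id NOT IN (
    SELECT category_id FROM active_categories
);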

Test Your WHERE Clause First

Before every DELETE, replace DELETE FROM with SELECT * FROM using the same WHERE clause. Review the results. Are these exactly the rows you want to remove? Only after confirming should you execute the DELETE.
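
The preview habit described above can be sketched as follows. This reuses the sessions example from earlier with PostgreSQL-style interval syntax; the column names session_id and user_id are assumed from other listings on this page.

```sql
-- Step 1: preview the rows using the exact predicate you plan to delete with
SELECT session_id, user_id, last_activity
FROM sessions
WHERE last_activity < CURRENT_TIMESTAMP - INTERVAL '24 hours';

-- Step 2: note the row count so you can compare it with the DELETE's result
SELECT COUNT(*) AS rows_to_delete
FROM sessions
WHERE last_activity < CURRENT_TIMESTAMP - INTERVAL '24 hours';

-- Step 3: only now swap SELECT for DELETE, leaving the WHERE clause untouched
DELETE FROM sessions
WHERE last_activity < CURRENT_TIMESTAMP - INTERVAL '24 hours';
```

Because this predicate depends on the current time, the count can drift slightly between the preview and the DELETE. Running all three steps inside one transaction keeps the option to roll back if the numbers look wrong.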

DELETE with Subqueries

Subqueries in DELETE enable deletion based on data in other tables. This is essential for maintaining referential consistency and implementing complex business rules.

Subquery in WHERE Clause:

delete_subquery.sql
-- Delete orders for inactive customers
DELETE FROM orders
WHERE customer_id IN (
    SELECT customer_id
    FROM customers
    WHERE is_active = FALSE
);
 
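-- Delete products with no sales
-- (NOT IN is safe here only if order_items.product_id is never NULL;
--  otherwise use NOT EXISTS as shown further down)
DELETE FROM products
WHERE product_id NOT IN (
    SELECT product_id
    FROM order_items
);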
 
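-- Delete using EXISTS (NULL-safe; most optimizers plan it much like IN)
-- The alias on the target table works in PostgreSQL, Oracle, and MySQL 8.0.16+;
-- SQL Server needs the DELETE oi FROM order_items oi ... form
DELETE FROM order_items oi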
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.order_id = oi.order_id
      AND o.status = 'cancelled'
);
 
-- Delete using NOT EXISTS
DELETE FROM categories c
WHERE NOT EXISTS (
    SELECT 1
    FROM products p
    WHERE p.category_id = c.category_id
);
-- Removes categories with no products

Correlated Subqueries:

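Correlated subqueries reference the row being deleted, so the condition is evaluated against each candidate row. The examples below read from the same table they delete from; MySQL rejects this (error 1093) unless the subquery is wrapped in a derived table: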

delete_correlated_subquery.sql
-- Delete duplicate rows, keeping the first occurrence
DELETE FROM contacts c1
WHERE EXISTS (
    SELECT 1
    FROM contacts c2
    WHERE c2.email = c1.email
      AND c2.contact_id < c1.contact_id  -- Keep the lower ID (first inserted)
);
 
-- Delete rows where a threshold is exceeded
DELETE FROM order_items oi
WHERE (
    SELECT SUM(quantity * unit_price)
    FROM order_items sub
    WHERE sub.order_id = oi.order_id
) > 10000  -- Delete items from orders over $10,000
AND oi.is_bonus_item = TRUE;  -- Only remove bonus items
 
-- Delete old records keeping N most recent per group
DELETE FROM audit_logs al
WHERE (
    SELECT COUNT(*)
    FROM audit_logs newer
    WHERE newer.user_id = al.user_id
      AND newer.log_id > al.log_id
) >= 100  -- Keep only 100 most recent per user
;

Subquery Performance

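DELETE with correlated subqueries can be slow on large tables when the optimizer evaluates the subquery once per candidate row. Many optimizers rewrite IN and EXISTS into the same semi-join, so check the execution plan rather than assuming one form is faster. For bulk deletions, consider staging the keys to delete in an indexed temporary table, or deleting in batches.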
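
One way to apply the temporary-table advice is to compute the keys once and then delete by joining against them. This is a PostgreSQL sketch; the temp table name order_ids_to_purge is hypothetical.

```sql
-- Stage the keys once instead of re-evaluating a subquery per row
CREATE TEMP TABLE order_ids_to_purge AS
SELECT order_id
FROM orders
WHERE status = 'cancelled';

-- Index and analyze the staging table so the planner can join efficiently
CREATE INDEX ON order_ids_to_purge (order_id);
ANALYZE order_ids_to_purge;

-- Inspect the plan first; EXPLAIN without ANALYZE does not execute the DELETE
EXPLAIN
DELETE FROM order_items oi
USING order_ids_to_purge p
WHERE oi.order_id = p.order_id;

-- Run the delete against the staged keys
DELETE FROM order_items oi
USING order_ids_to_purge p
WHERE oi.order_id = p.order_id;
```

The staged key list also doubles as a record of exactly which orders were affected, which is useful when verifying the result.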

DELETE with JOIN

Some databases support DELETE with JOIN syntax, making it easier to delete rows based on joined table conditions. Like UPDATE with JOIN, the syntax varies by database system.

DELETE with JOIN Syntax Variations:

delete_join_postgres.sql
-- PostgreSQL: DELETE ... USING
 
-- Delete order items for cancelled orders
DELETE FROM order_items oi
USING orders o
WHERE oi.order_id = o.order_id
  AND o.status = 'cancelled';
 
-- Delete with multiple tables
DELETE FROM products p
USING categories c, suppliers s
WHERE p.category_id = c.category_id
  AND p.supplier_id = s.supplier_id
  AND c.is_active = FALSE
  AND s.is_active = FALSE;
 
-- Delete based on aggregate from joined table
DELETE FROM customers c
USING (
    SELECT customer_id, SUM(total_amount) as total_spent
    FROM orders
    GROUP BY customer_id
) o
WHERE c.customer_id = o.customer_id
  AND o.total_spent = 0
  AND c.created_at < CURRENT_DATE - INTERVAL '2 years';

Multi-Table DELETE

MySQL and MariaDB support deleting from multiple tables in a single statement. This can be useful for cleaning up related data but must be used with extreme caution. List the tables (or aliases) to delete from between DELETE and FROM; tables that appear only in the join are used for matching and are not modified.
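
For comparison with the PostgreSQL USING form, here is a sketch of the same "delete items for cancelled orders" task in MySQL and SQL Server, reusing the orders and order_items tables from the earlier examples.

```sql
-- MySQL / MariaDB: delete only from order_items; orders is used for matching
DELETE oi
FROM order_items AS oi
JOIN orders AS o ON o.order_id = oi.order_id
WHERE o.status = 'cancelled';

-- MySQL / MariaDB: delete matching rows from BOTH tables in one statement
-- (the inner join means cancelled orders with no items are not touched)
DELETE oi, o
FROM order_items AS oi
JOIN orders AS o ON o.order_id = oi.order_id
WHERE o.status = 'cancelled';

-- SQL Server: name the alias after DELETE, then join in the FROM clause
DELETE oi
FROM order_items AS oi
INNER JOIN orders AS o ON o.order_id = oi.order_id
WHERE o.status = 'cancelled';
```

When the tables are linked by InnoDB foreign keys, MySQL may process a multi-table DELETE in an order that violates the parent/child relationship and fail. In that case, deleting from one table and relying on ON DELETE actions is the safer choice.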

Referential Integrity and Cascading Deletes

Foreign key constraints protect referential integrity—ensuring that child records don't become orphans. When you attempt to DELETE a parent row, the database checks for referencing child rows and responds based on the constraint definition.

Foreign Key ON DELETE Options:

Foreign Key ON DELETE Actions
Action	Behavior	Use Case
NO ACTION (default)	Raises error if child rows still exist when the constraint is checked	When children must be manually handled first
RESTRICT	Like NO ACTION, but checked immediately and cannot be deferred; not available in SQL Server	Strict integrity enforcement
CASCADE	Automatically deletes all child rows	When deleting parent should remove all children
SET NULL	Sets foreign key column to NULL in child rows (column must be nullable)	When children can exist without parent reference
SET DEFAULT	Sets foreign key to its default value; not supported by MySQL's InnoDB	When a fallback parent exists

cascade_delete.sql
-- Table structure with different ON DELETE actions
 
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);
 
-- Orders with CASCADE: deleting customer deletes their orders
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id) ON DELETE CASCADE,
    order_date DATE NOT NULL
);
 
-- Order items with CASCADE: deleting order deletes items
CREATE TABLE order_items (
    item_id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(order_id) ON DELETE CASCADE,
    product_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL
);
 
-- Now, deleting a customer cascades through the hierarchy:
DELETE FROM customers WHERE customer_id = 42;
-- This single DELETE removes:
-- 1. The customer row
-- 2. All orders for that customer
-- 3. All order_items for those orders
 
-- ================================================
 
-- Alternative: SET NULL for optional relationships
-- (assumes a categories table with primary key category_id already exists)
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    category_id INTEGER REFERENCES categories(category_id) ON DELETE SET NULL,
    product_name VARCHAR(100) NOT NULL
);
 
-- Deleting a category doesn't delete products—just sets category_id to NULL
DELETE FROM categories WHERE category_id = 5;
-- Products in category 5 now have category_id = NULL (uncategorized)

CASCADE Can Be Dangerous

CASCADE propagates through the entire foreign key graph. Deleting one row can trigger deletion of thousands of related rows across many tables. Always trace through CASCADE chains before defining them, and document the deletion behavior clearly.
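
Tracing a cascade chain can be partly automated. The sketch below assumes the customers, orders, and order_items schema used on this page. It lists cascading constraints from the standard information_schema view and then counts the rows a specific parent delete would remove. It should run as written on PostgreSQL, MySQL, and SQL Server.

```sql
-- Which foreign keys in this database are defined with ON DELETE CASCADE?
SELECT constraint_name, delete_rule
FROM information_schema.referential_constraints
WHERE delete_rule = 'CASCADE';

-- How many rows would deleting customer 42 take with it?
SELECT
    (SELECT COUNT(*)
     FROM orders
     WHERE customer_id = 42) AS orders_to_cascade,
    (SELECT COUNT(*)
     FROM order_items oi
     JOIN orders o ON o.order_id = oi.order_id
     WHERE o.customer_id = 42) AS order_items_to_cascade;
```

If either count is far larger than expected, that is the moment to stop and reconsider, before the parent row is deleted.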

manual_cascade.sql
-- When you need to delete with RESTRICT/NO ACTION constraints,
-- delete children first, then parents (reverse dependency order)
 
-- Cannot delete customer if orders exist (RESTRICT/NO ACTION)
-- Solution: Delete in correct order
 
BEGIN TRANSACTION;
 
-- Step 1: Delete order items first (deepest level)
DELETE FROM order_items
WHERE order_id IN (
    SELECT order_id FROM orders WHERE customer_id = 42
);
 
-- Step 2: Delete orders (middle level)
DELETE FROM orders
WHERE customer_id = 42;
 
-- Step 3: Delete customer (parent level)
DELETE FROM customers
WHERE customer_id = 42;
 
COMMIT;

Soft Delete Pattern

Many applications avoid permanent deletion by using soft delete—marking records as deleted without actually removing them. This provides recoverability, audit trails, and historical accuracy at the cost of query complexity.

Implementing Soft Delete:

soft_delete_implementation.sql
-- Table structure for soft delete
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email VARCHAR(100) NOT NULL UNIQUE,
    display_name VARCHAR(100),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    -- Soft delete columns
    is_deleted BOOLEAN NOT NULL DEFAULT FALSE,  -- NOT NULL so "= FALSE" filters never miss rows
    deleted_at TIMESTAMP NULL,
    deleted_by INTEGER REFERENCES users(user_id)
);
 
-- "Delete" a user (actually just marks as deleted)
UPDATE users
SET 
    is_deleted = TRUE,
    deleted_at = CURRENT_TIMESTAMP,
    deleted_by = @current_admin_id  -- placeholder for a bound parameter; syntax varies by driver
WHERE user_id = 42;
 
-- Queries must filter out deleted records
-- BAD: Returns deleted users too
SELECT * FROM users;
 
-- GOOD: Excludes deleted users
SELECT * FROM users WHERE is_deleted = FALSE;
 
-- View for easier querying
CREATE VIEW active_users AS
SELECT * FROM users WHERE is_deleted = FALSE;
 
-- Now this returns only active users
SELECT * FROM active_users;

Soft Delete With Unique Constraints:

Soft delete complicates unique constraints. A deleted user with email 'john@example.com' still occupies that unique value:

soft_delete_unique.sql
-- Problem: Unique constraint blocks reusing deleted email
-- User 42 with 'john@example.com' is soft-deleted
-- New user tries to register with 'john@example.com'
-- ERROR: Unique constraint violation!
 
-- Solution 1: Partial unique index (PostgreSQL)
-- Replaces the column-level UNIQUE constraint on email, which must be dropped first
CREATE UNIQUE INDEX users_email_active_unique 
ON users (email) 
WHERE is_deleted = FALSE;
-- Only enforces uniqueness among active users

-- Solution 2: Rewrite the email as part of the soft delete
-- (|| is string concatenation in PostgreSQL and Oracle; use CONCAT() in MySQL and SQL Server)
UPDATE users
SET email = email || '_deleted_' || user_id,
    is_deleted = TRUE,
    deleted_at = CURRENT_TIMESTAMP
WHERE user_id = 42 AND is_deleted = FALSE;
-- Original email 'john@example.com' now 'john@example.com_deleted_42'
-- New registrations can use 'john@example.com'
-- The suffixed value must still fit in VARCHAR(100)

-- Solution 3: Filtered unique constraint (SQL Server)
CREATE UNIQUE NONCLUSTERED INDEX users_email_active_unique
ON users (email)
WHERE is_deleted = 0;

When to Use Soft Delete

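•Regulatory Retention — HIPAA, tax, or financial regulations may require data retention even after user-visible deletion. GDPR erasure requests usually point the other way, because a soft-deleted row still holds the personal data.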
•Undo Functionality — Allow users or admins to recover accidentally deleted records.
•Audit Trail — Maintain complete history for forensic analysis or debugging.
•Referential Integrity — Avoid cascading deletes that could destroy historical relationships.
•Reporting Accuracy — Historical reports remain accurate (e.g., 'sales by deleted products').

Consider the Trade-offs

Soft delete adds complexity: every query needs the is_deleted filter, indexes grow with deleted data, storage increases over time, and UNIQUE constraints become harder. Evaluate whether you truly need recoverability before implementing soft delete.
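
Two routine operations follow from these trade-offs: restoring a soft-deleted row and eventually purging old ones so storage does not grow forever. The sketch below uses the users table defined earlier with PostgreSQL-style interval syntax. The 90-day retention window is an assumption for illustration, not a recommendation.

```sql
-- Restore: undo a soft delete by clearing the marker columns
UPDATE users
SET is_deleted = FALSE,
    deleted_at = NULL,
    deleted_by = NULL
WHERE user_id = 42
  AND is_deleted = TRUE;

-- Purge: hard-delete rows that have been soft-deleted longer than the retention window
-- (fails for a user still referenced by another row's deleted_by, since that
--  foreign key uses the default NO ACTION)
DELETE FROM users
WHERE is_deleted = TRUE
  AND deleted_at < CURRENT_TIMESTAMP - INTERVAL '90 days';
```

If emails were rewritten during soft delete, the restore step would also need to strip the suffix and confirm that the original address has not been taken in the meantime.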

DELETE Safety Practices

DELETE statements are permanent. Rigorous safety practices prevent catastrophic data loss.

Essential DELETE Safety Rules

•Always start with SELECT — Write your WHERE clause in a SELECT first to see exactly which rows will be deleted.
•Use transactions — Wrap DELETE in BEGIN TRANSACTION so you can ROLLBACK if results are wrong.
•Verify row count — After DELETE, check the affected row count. Does it match your expectation?
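•Use a row limit during development — Cap a test run at a few rows before running the full statement: DELETE ... LIMIT 1 in MySQL and MariaDB, DELETE TOP (1) in SQL Server. PostgreSQL has no LIMIT on DELETE, so limit a key subquery instead.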
•Backup before bulk deletes — For large deletions, create a backup table or database snapshot first.
•Never DELETE without WHERE in production — No exceptions. If you need all rows removed, use TRUNCATE or a specific condition.
•Document deletion criteria — Comment your DELETE explaining what and why you're deleting.
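
Row limits are dialect-specific, so here is a sketch of a capped delete in three systems. It reuses the audit_logs table from earlier; the batch size of 1000 and the cutoff date are arbitrary illustrative values.

```sql
-- MySQL / MariaDB: LIMIT (optionally with ORDER BY) on a single-table DELETE
DELETE FROM audit_logs
WHERE log_date < '2023-01-01'
ORDER BY log_id
LIMIT 1000;

-- SQL Server: TOP (n); which matching rows are chosen is not guaranteed
DELETE TOP (1000) FROM audit_logs
WHERE log_date < '2023-01-01';

-- PostgreSQL: no LIMIT on DELETE, so limit a primary-key subquery instead
DELETE FROM audit_logs
WHERE log_id IN (
    SELECT log_id
    FROM audit_logs
    WHERE log_date < '2023-01-01'
    ORDER BY log_id
    LIMIT 1000
);
```

Repeating the statement until it reports zero affected rows turns the same pattern into a batched bulk delete, which keeps each transaction and its locks small.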

delete_safety_workflow.sql
-- SAFE DELETE WORKFLOW
 
-- 1. Start a transaction
BEGIN TRANSACTION;
 
-- 2. Preview rows to be deleted
SELECT order_id, customer_id, order_date, status, total_amount
FROM orders
WHERE status = 'cancelled' AND order_date < '2023-01-01';
-- Result: 847 rows
 
-- 3. Backup to temporary table (optional but recommended)
CREATE TEMP TABLE orders_to_delete AS
SELECT * FROM orders
WHERE status = 'cancelled' AND order_date < '2023-01-01';
 
-- 4. Execute the DELETE
DELETE FROM orders
WHERE status = 'cancelled' AND order_date < '2023-01-01';
-- Output: "847 rows affected" — matches preview ✓
 
-- 5. Verify deletion
SELECT COUNT(*) FROM orders
WHERE status = 'cancelled' AND order_date < '2023-01-01';
-- Result: 0 ✓
 
-- 6. Commit or rollback
COMMIT;  -- If everything is correct
-- ROLLBACK; -- If something is wrong
 
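-- 7. If you need to recover after COMMIT (same session only; the temp table
--    disappears when the session ends, and cascaded child rows are not in it):
-- INSERT INTO orders SELECT * FROM orders_to_delete;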

Production DELETE Checklist

Before executing DELETE in production: (1) Do you have a complete backup? (2) Have you tested the exact query in a non-production environment? (3) Have you previewed affected rows with SELECT? (4) Does the row count match expectations? (5) Are you using a transaction? (6) Have you informed stakeholders? Never skip these checks.

Returning Deleted Data

Like INSERT and UPDATE, DELETE can return the rows it removes in databases that support it: PostgreSQL uses RETURNING and SQL Server uses OUTPUT. This is invaluable for audit logging, confirmation messages, and undo functionality.

RETURNING/OUTPUT Clause:

delete_returning.sql
-- PostgreSQL: RETURNING clause
DELETE FROM sessions
WHERE expires_at < CURRENT_TIMESTAMP
RETURNING session_id, user_id, expires_at;
 
-- Returns all columns of deleted rows
DELETE FROM products
WHERE discontinued = TRUE AND units_in_stock = 0
RETURNING *;
 
-- Use RETURNING with archive pattern
WITH deleted_orders AS (
    DELETE FROM orders
    WHERE order_date < '2022-01-01'
    RETURNING *
)
INSERT INTO orders_archive
SELECT *, CURRENT_TIMESTAMP AS archived_at
FROM deleted_orders;
-- Deletes old orders AND archives them in one statement!
 
-- SQL Server: OUTPUT clause
DELETE FROM sessions
OUTPUT DELETED.*
WHERE expires_at < GETDATE();
 
-- Output to a table for audit
DECLARE @DeletedOrders TABLE (
    order_id INT,
    customer_id INT,
    total_amount DECIMAL(10,2),
    deleted_at DATETIME2
);
 
DELETE FROM orders
OUTPUT 
    DELETED.order_id,
    DELETED.customer_id,
    DELETED.total_amount,
    GETDATE()
INTO @DeletedOrders
WHERE status = 'cancelled';
 
-- View what was deleted
SELECT * FROM @DeletedOrders;

Delete-Archive Pattern

Combining DELETE ... RETURNING with INSERT in one statement makes archive-and-delete atomic: the rows are archived and removed together or not at all. It is a strong default for data lifecycle management in PostgreSQL; in SQL Server, OUTPUT ... INTO a permanent table can play a similar role.
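
The archive statement shown earlier only works if orders_archive has the same columns as orders, in the same order, followed by archived_at. A minimal PostgreSQL sketch of one way to create such a table follows; the layout is an assumption, since the original listing does not define orders_archive.

```sql
-- Copy column names, types, and NOT NULL constraints from orders
-- (keys, indexes, and defaults are not copied unless INCLUDING options are added)
CREATE TABLE orders_archive (LIKE orders);

-- Append the extra column that the archive INSERT fills from CURRENT_TIMESTAMP
ALTER TABLE orders_archive
    ADD COLUMN archived_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP;

-- After running the archive-and-delete statement, spot-check the result
SELECT COUNT(*) AS archived_rows
FROM orders_archive
WHERE order_date < '2022-01-01';
```

Because the archive INSERT relies on SELECT *, adding a column to orders later without also adding it to orders_archive makes the statement fail. Listing columns explicitly is the sturdier variant.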

Summary: Mastering DELETE

DELETE is the most consequential DML operation—data removed is data gone. Mastery means knowing not just how to delete, but when to delete, what alternatives exist, and how to delete safely.

Key Takeaways

•Treat the WHERE clause as mandatory — SQL does not require it, and DELETE without WHERE removes all rows. Never execute DELETE without a condition.
•DELETE vs TRUNCATE — DELETE is selective, logged row by row, and fires triggers; TRUNCATE is fast but all-or-nothing.
•Subqueries and JOINs extend power — Delete based on related table data when needed.
•Understand CASCADE implications — Foreign key cascades can propagate deletions widely; trace the chain before defining.
•Consider soft delete — For recoverability and audit needs, marking as deleted may be better than actual deletion.
•Safety practices are critical — Preview, backup, transaction, verify. Every time.
•RETURNING captures deleted data — Archive, audit, or confirm deletions atomically.

What's Next:

Having covered INSERT, UPDATE, and DELETE, the next page introduces MERGE (UPSERT)—the powerful operation that combines insertion and update logic into a single atomic statement.

Page Complete

You now have comprehensive knowledge of the DELETE statement. From simple deletions to complex cascading scenarios, from hard delete to soft delete patterns, you can remove data confidently and safely. Remember: delete with caution, backup always, and verify before committing.