Data Manipulation Language - Learning Module

Loading content...

0/241

MERGE (UPSERT) Statement

The Best of Both Worlds

Imagine a common scenario: you receive a batch of data—some records are new, some are updates to existing records. The traditional approach requires checking each record, then executing INSERT or UPDATE accordingly. This creates multiple round-trips, complex application logic, and potential race conditions.

The MERGE statement (also known as UPSERT—a portmanteau of UPDATE and INSERT) solves this elegantly. In a single atomic statement, you can insert new records, update existing ones, and optionally delete records that no longer appear in the source—all based on a matching condition.

This capability is essential for data synchronization, ETL processes, inventory updates, and any scenario where external data must be reconciled with database state.

What You Will Learn

By the end of this page, you will master the MERGE statement syntax across major database systems, understand WHEN MATCHED and WHEN NOT MATCHED clauses, implement upsert patterns for data synchronization, handle complex matching conditions and multiple actions, and recognize performance considerations and alternatives.

Understanding MERGE

The MERGE statement compares a source dataset against a target table, then performs different actions based on whether rows match or not. It's declarative: you describe the conditions and actions, and the database handles the logic atomically.

The Core Concept:

merge_concept.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
-- Conceptual MERGE structure
MERGE INTO target_table AS target
USING source_table AS source
ON target.key_column = source.key_column
 
WHEN MATCHED THEN
    -- Row exists in both: UPDATE target with source values
    UPDATE SET target.column1 = source.column1, ...
 
WHEN NOT MATCHED THEN
    -- Row in source but not target: INSERT new row
    INSERT (columns...) VALUES (source.columns...);
 
-- The MERGE:
-- 1. Joins source and target on the ON condition
-- 2. For rows that match: executes the MATCHED action (UPDATE)
-- 3. For source rows without match: executes NOT MATCHED action (INSERT)
-- 4. All actions are atomic—committed together or not at all

Why MERGE Matters

•Atomicity — All operations (INSERT, UPDATE, DELETE) happen as one transaction. No partial updates.
•Performance — Single statement, single scan of source data. No multiple round-trips.
•Concurrency Safety — No race conditions between checking existence and acting on it.
•Reduced Code — Replaces complex IF EXISTS/INSERT/UPDATE logic in applications.
•Data Synchronization — Perfect for syncing tables, handling imports, and reconciling data.

The Problem MERGE Solves:

Without MERGE, the upsert pattern requires multiple statements:

without_merge.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- WITHOUT MERGE: Traditional upsert pattern (problematic)
 
-- Approach 1: Check, then insert or update
-- PROBLEM: Race condition between SELECT and INSERT/UPDATE
IF EXISTS (SELECT 1 FROM products WHERE product_id = @id)
BEGIN
    UPDATE products 
    SET product_name = @name, unit_price = @price
    WHERE product_id = @id;
END
ELSE
BEGIN
    INSERT INTO products (product_id, product_name, unit_price)
    VALUES (@id, @name, @price);
END
 
-- Approach 2: Try insert, catch duplicate, then update
-- PROBLEM: Relies on error handling, performance overhead
BEGIN TRY
    INSERT INTO products (product_id, product_name, unit_price)
    VALUES (@id, @name, @price);
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() = 2627  -- Duplicate key error
        UPDATE products 
        SET product_name = @name, unit_price = @price
        WHERE product_id = @id;
END CATCH
 
-- Both approaches have issues:
-- 1. Multiple statements = multiple round-trips
-- 2. Race conditions in concurrent environments
-- 3. Complex logic that's hard to maintain

Standard MERGE Syntax (SQL:2003)

SQL:2003 standardized MERGE syntax, though implementations vary. The standard syntax is supported by Oracle, SQL Server, and DB2 with minor variations.

Complete MERGE Syntax:

merge_complete_syntax.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Standard MERGE syntax (SQL Server / Oracle style)
 
MERGE INTO target_table AS t
USING source_table AS s
ON t.key_column = s.key_column
 
-- When row exists in both source and target
WHEN MATCHED THEN
    UPDATE SET 
        t.column1 = s.column1,
        t.column2 = s.column2,
        t.updated_at = CURRENT_TIMESTAMP
 
-- When row exists only in source (new data)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (key_column, column1, column2, created_at)
    VALUES (s.key_column, s.column1, s.column2, CURRENT_TIMESTAMP)
 
-- When row exists only in target (orphaned data) - SQL Server only
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;  -- or UPDATE to mark as inactive
 
-- IMPORTANT: SQL Server requires semicolon after MERGE!

Practical Example: Product Inventory Sync:

merge_inventory_example.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
-- Scenario: Sync inventory from supplier feed into products table
 
-- Source: Daily supplier feed (temporary table or table variable)
CREATE TABLE #supplier_feed (
    sku             VARCHAR(50) PRIMARY KEY,
    product_name    VARCHAR(200),
    unit_price      DECIMAL(10, 2),
    quantity        INTEGER,
    last_updated    DATETIME
);
 
-- Insert sample supplier data
INSERT INTO #supplier_feed VALUES
    ('SKU001', 'Wireless Mouse', 29.99, 150, '2024-01-15'),
    ('SKU002', 'USB Keyboard', 49.99, 75, '2024-01-15'),
    ('SKU003', 'Monitor Stand', 34.99, 200, '2024-01-15'),  -- New product
    ('SKU004', 'Desk Lamp', 24.99, 50, '2024-01-15');        -- New product
 
-- MERGE: Sync supplier feed with products table
MERGE INTO products AS target
USING #supplier_feed AS source
ON target.sku = source.sku
 
WHEN MATCHED THEN
    -- Update existing products with new prices and quantities
    UPDATE SET 
        target.product_name = source.product_name,
        target.unit_price = source.unit_price,
        target.units_in_stock = source.quantity,
        target.last_sync = CURRENT_TIMESTAMP
 
WHEN NOT MATCHED BY TARGET THEN
    -- Insert new products from supplier
    INSERT (sku, product_name, unit_price, units_in_stock, created_at, last_sync)
    VALUES (source.sku, source.product_name, source.unit_price, 
            source.quantity, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP);
 
-- Result: 
-- Existing products (SKU001, SKU002) are updated
-- New products (SKU003, SKU004) are inserted
-- All in one atomic operation!

Semicolon Required in SQL Server

SQL Server requires MERGE statements to end with a semicolon. This is unlike other T-SQL statements where the semicolon is optional. Forgetting it causes syntax errors.

Database-Specific Implementations

While the MERGE concept is universal, syntax varies significantly between database systems. Some databases use alternative syntax rather than the standard MERGE statement.

PostgreSQL uses INSERT ... ON CONFLICT (available since 9.5):

postgresql_upsert.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
-- PostgreSQL: INSERT ... ON CONFLICT (Upsert)
 
-- Basic upsert: Insert or update on primary key conflict
INSERT INTO products (product_id, product_name, unit_price, units_in_stock)
VALUES (101, 'Wireless Mouse', 29.99, 150)
ON CONFLICT (product_id) 
DO UPDATE SET 
    product_name = EXCLUDED.product_name,
    unit_price = EXCLUDED.unit_price,
    units_in_stock = EXCLUDED.units_in_stock,
    updated_at = CURRENT_TIMESTAMP;
 
-- EXCLUDED refers to the row that was attempted to be inserted
 
-- Upsert on unique constraint (by name)
INSERT INTO products (product_id, product_name, unit_price)
VALUES (102, 'Mechanical Keyboard', 149.99)
ON CONFLICT ON CONSTRAINT products_name_unique
DO UPDATE SET unit_price = EXCLUDED.unit_price;
 
-- Upsert with conditional update
INSERT INTO products (product_id, product_name, unit_price)
VALUES (103, 'Monitor', 299.99)
ON CONFLICT (product_id) 
DO UPDATE SET 
    unit_price = EXCLUDED.unit_price
WHERE products.unit_price IS DISTINCT FROM EXCLUDED.unit_price;
-- Only updates if price actually changed
 
-- Upsert: Do nothing on conflict (insert if not exists)
INSERT INTO products (product_id, product_name, unit_price)
VALUES (104, 'Desk Lamp', 24.99)
ON CONFLICT (product_id) DO NOTHING;
 
-- PostgreSQL 15+ also supports standard MERGE syntax
MERGE INTO products AS t
USING (VALUES (101, 'Wireless Mouse', 29.99)) AS s(product_id, product_name, unit_price)
ON t.product_id = s.product_id
WHEN MATCHED THEN UPDATE SET product_name = s.product_name
WHEN NOT MATCHED THEN INSERT VALUES (s.product_id, s.product_name, s.unit_price);

Advanced MERGE Patterns

MERGE becomes even more powerful with conditional actions, multiple WHEN clauses, and complex source queries.

Conditional Actions:

merge_conditional.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
-- MERGE with conditional WHEN MATCHED clauses (SQL Server)
 
MERGE INTO products AS target
USING staging_products AS source
ON target.product_id = source.product_id
 
-- Update only if data actually changed (avoid unnecessary updates)
WHEN MATCHED AND (
    target.product_name != source.product_name
    OR target.unit_price != source.unit_price
    OR target.category_id != source.category_id
) THEN
    UPDATE SET 
        target.product_name = source.product_name,
        target.unit_price = source.unit_price,
        target.category_id = source.category_id,
        target.updated_at = CURRENT_TIMESTAMP
 
-- When matched but no changes, do nothing (implicit)
 
WHEN NOT MATCHED BY TARGET THEN
    INSERT (product_id, product_name, unit_price, category_id, created_at)
    VALUES (source.product_id, source.product_name, source.unit_price, 
            source.category_id, CURRENT_TIMESTAMP);
 
-- ================================================
 
-- Multiple WHEN MATCHED clauses with different conditions
MERGE INTO inventory AS target
USING warehouse_shipment AS source
ON target.sku = source.sku
 
-- If quantity drops to zero, mark as out of stock
WHEN MATCHED AND target.quantity + source.quantity <= 0 THEN
    UPDATE SET 
        target.quantity = 0,
        target.status = 'out_of_stock',
        target.updated_at = CURRENT_TIMESTAMP
 
-- If quantity is positive, update normally
WHEN MATCHED AND target.quantity + source.quantity > 0 THEN
    UPDATE SET 
        target.quantity = target.quantity + source.quantity,
        target.status = CASE 
            WHEN target.quantity + source.quantity < 10 THEN 'low_stock'
            ELSE 'in_stock'
        END,
        target.updated_at = CURRENT_TIMESTAMP
 
WHEN NOT MATCHED THEN
    INSERT (sku, quantity, status, created_at)
    VALUES (source.sku, source.quantity, 'in_stock', CURRENT_TIMESTAMP);

MERGE with Aggregated Source:

merge_aggregated_source.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- MERGE using aggregated/transformed source data
 
-- Update category statistics from orders
MERGE INTO category_stats AS target
USING (
    SELECT 
        p.category_id,
        COUNT(DISTINCT o.order_id) AS total_orders,
        SUM(oi.quantity) AS total_units_sold,
        SUM(oi.quantity * oi.unit_price) AS total_revenue,
        AVG(oi.unit_price) AS avg_unit_price
    FROM order_items oi
    JOIN products p ON oi.product_id = p.product_id
    JOIN orders o ON oi.order_id = o.order_id
    WHERE o.order_date >= DATEADD(MONTH, -1, GETDATE())
    GROUP BY p.category_id
) AS source
ON target.category_id = source.category_id
 
WHEN MATCHED THEN
    UPDATE SET 
        target.monthly_orders = source.total_orders,
        target.monthly_units = source.total_units_sold,
        target.monthly_revenue = source.total_revenue,
        target.avg_price = source.avg_unit_price,
        target.last_calculated = CURRENT_TIMESTAMP
 
WHEN NOT MATCHED BY TARGET THEN
    INSERT (category_id, monthly_orders, monthly_units, monthly_revenue, avg_price, last_calculated)
    VALUES (source.category_id, source.total_orders, source.total_units_sold, 
            source.total_revenue, source.avg_unit_price, CURRENT_TIMESTAMP)
 
-- Clean up categories with no recent sales
WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET 
        target.monthly_orders = 0,
        target.monthly_units = 0,
        target.monthly_revenue = 0,
        target.last_calculated = CURRENT_TIMESTAMP;

Order of WHEN Clauses

When using multiple WHEN MATCHED clauses with conditions, the first matching condition wins. Order them from most specific to least specific, similar to IF/ELSE IF chains in programming.

MERGE with OUTPUT for Auditing

SQL Server's MERGE statement supports the OUTPUT clause, which captures what actions were taken. This is invaluable for auditing, logging, and debugging synchronization operations.

OUTPUT Clause with MERGE:

merge_with_output.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
-- MERGE with OUTPUT for complete audit trail
 
-- Create audit table
CREATE TABLE product_sync_audit (
    audit_id        INT IDENTITY PRIMARY KEY,
    action          VARCHAR(10),
    product_id      INT,
    old_name        VARCHAR(200),
    new_name        VARCHAR(200),
    old_price       DECIMAL(10,2),
    new_price       DECIMAL(10,2),
    sync_timestamp  DATETIME2 DEFAULT CURRENT_TIMESTAMP
);
 
-- MERGE with OUTPUT into audit table
MERGE INTO products AS target
USING staging_products AS source
ON target.product_id = source.product_id
 
WHEN MATCHED THEN
    UPDATE SET 
        target.product_name = source.product_name,
        target.unit_price = source.unit_price
 
WHEN NOT MATCHED BY TARGET THEN
    INSERT (product_id, product_name, unit_price)
    VALUES (source.product_id, source.product_name, source.unit_price)
 
WHEN NOT MATCHED BY SOURCE AND target.is_synced = 1 THEN
    DELETE
 
-- OUTPUT captures before and after states
OUTPUT 
    $action AS action,                  -- 'INSERT', 'UPDATE', or 'DELETE'
    COALESCE(INSERTED.product_id, DELETED.product_id) AS product_id,
    DELETED.product_name AS old_name,   -- NULL for INSERT
    INSERTED.product_name AS new_name,  -- NULL for DELETE
    DELETED.unit_price AS old_price,
    INSERTED.unit_price AS new_price
INTO product_sync_audit (action, product_id, old_name, new_name, old_price, new_price);
 
-- Query the audit to see what happened
SELECT * FROM product_sync_audit ORDER BY audit_id;
 
-- Results might show:
-- action  | product_id | old_name        | new_name           | old_price | new_price
-- UPDATE  | 101        | Wireless Mouse  | Wireless Mouse v2  | 29.99     | 34.99
-- UPDATE  | 102        | Keyboard        | Mechanical Keyboard| 49.99     | 149.99
-- INSERT  | 103        | NULL            | New Monitor        | NULL      | 299.99
-- DELETE  | 999        | Discontinued    | NULL               | 19.99     | NULL

Summary Statistics from OUTPUT:

merge_output_summary.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
-- Capture MERGE results into table variable for summary
 
DECLARE @MergeResults TABLE (action VARCHAR(10));
 
MERGE INTO products AS target
USING staging_products AS source
ON target.product_id = source.product_id
 
WHEN MATCHED THEN UPDATE SET target.product_name = source.product_name
WHEN NOT MATCHED BY TARGET THEN INSERT (product_id, product_name) VALUES (source.product_id, source.product_name)
WHEN NOT MATCHED BY SOURCE THEN DELETE
 
OUTPUT $action INTO @MergeResults;
 
-- Generate summary
SELECT 
    action AS Operation,
    COUNT(*) AS RowCount
FROM @MergeResults
GROUP BY action;
 
-- Results:
-- Operation | RowCount
-- INSERT    | 15
-- UPDATE    | 42
-- DELETE    | 3

PostgreSQL RETURNING Alternative

PostgreSQL's INSERT ON CONFLICT also supports RETURNING, though with different semantics. For complex MERGE-like operations with full audit trails, consider using CTEs with RETURNING or separate INSERT/UPDATE statements with RETURNING captured.

Common UPSERT Patterns

Certain upsert patterns appear repeatedly in real applications. Mastering these common patterns accelerates development.

Pattern 1: Insert-or-Update-All:

upsert_pattern_full.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- Pattern 1: Full replacement - update all columns on match
 
-- PostgreSQL
INSERT INTO products (product_id, name, price, stock, category, updated_at)
VALUES (101, 'Widget', 9.99, 100, 'Gadgets', NOW())
ON CONFLICT (product_id) 
DO UPDATE SET
    name = EXCLUDED.name,
    price = EXCLUDED.price,
    stock = EXCLUDED.stock,
    category = EXCLUDED.category,
    updated_at = EXCLUDED.updated_at;
 
-- MySQL
INSERT INTO products (product_id, name, price, stock, category, updated_at)
VALUES (101, 'Widget', 9.99, 100, 'Gadgets', NOW())
ON DUPLICATE KEY UPDATE
    name = VALUES(name),
    price = VALUES(price),
    stock = VALUES(stock),
    category = VALUES(category),
    updated_at = VALUES(updated_at);

Pattern 2: Insert-or-Increment:

upsert_pattern_increment.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Pattern 2: Insert new, increment existing (counters, aggregates)
 
-- PostgreSQL: Page view counter
INSERT INTO page_views (page_url, view_count, first_view, last_view)
VALUES ('/products/widget', 1, NOW(), NOW())
ON CONFLICT (page_url) 
DO UPDATE SET
    view_count = page_views.view_count + 1,
    last_view = NOW();
 
-- MySQL: Inventory adjustment
INSERT INTO inventory (sku, quantity, last_updated)
VALUES ('SKU-001', 50, NOW())
ON DUPLICATE KEY UPDATE
    quantity = quantity + VALUES(quantity),
    last_updated = VALUES(last_updated);
 
-- SQL Server: Running totals
MERGE INTO daily_sales AS target
USING (SELECT @date AS sale_date, @amount AS amount) AS source
ON target.sale_date = source.sale_date
 
WHEN MATCHED THEN
    UPDATE SET 
        target.total_amount = target.total_amount + source.amount,
        target.transaction_count = target.transaction_count + 1
 
WHEN NOT MATCHED THEN
    INSERT (sale_date, total_amount, transaction_count)
    VALUES (source.sale_date, source.amount, 1);

Pattern 3: Insert-or-Ignore:

upsert_pattern_ignore.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Pattern 3: Insert only if not exists (idempotent inserts)
 
-- PostgreSQL
INSERT INTO users (email, name, created_at)
VALUES ('user@example.com', 'John Doe', NOW())
ON CONFLICT (email) DO NOTHING;
-- Silently ignores if email already exists
 
-- MySQL
INSERT IGNORE INTO users (email, name, created_at)
VALUES ('user@example.com', 'John Doe', NOW());
 
-- SQLite
INSERT OR IGNORE INTO users (email, name, created_at)
VALUES ('user@example.com', 'John Doe', datetime('now'));
 
-- SQL Server: Conditional MERGE
MERGE INTO users AS target
USING (SELECT 'user@example.com' AS email, 'John Doe' AS name) AS source
ON target.email = source.email
WHEN NOT MATCHED THEN
    INSERT (email, name, created_at)
    VALUES (source.email, source.name, GETDATE());

Pattern 4: Conditional Update (update only if certain conditions):

upsert_pattern_conditional.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Pattern 4: Update only under certain conditions
 
-- Only update if new price is lower (price protection)
INSERT INTO products (product_id, name, price)
VALUES (101, 'Widget', 7.99)
ON CONFLICT (product_id) 
DO UPDATE SET price = EXCLUDED.price
WHERE products.price > EXCLUDED.price;
-- Only reduces price, never increases
 
-- Only update if source is newer (timestamp-based dedup)
INSERT INTO events (event_id, event_data, event_time)
VALUES ('evt-123', '{"action": "click"}', '2024-01-15 10:30:00')
ON CONFLICT (event_id) 
DO UPDATE SET
    event_data = EXCLUDED.event_data,
    event_time = EXCLUDED.event_time
WHERE events.event_time < EXCLUDED.event_time;
-- Only updates if incoming event is newer

Performance Considerations

MERGE operations can be resource-intensive. Understanding performance factors helps optimize synchronization operations.

MERGE Performance Factors

•Index on Join Key — The ON clause condition must be efficiently evaluated. Index the matching column(s) in both source and target.
•Source Size — Large source datasets mean more rows to compare. Consider batching very large syncs.
•Trigger Overhead — INSERT/UPDATE/DELETE triggers fire for each affected row during MERGE, potentially slowing operations.
•Lock Duration — MERGE locks target rows for the duration. Large operations may cause blocking.
•Transaction Log — All operations are logged. Very large MERGE may fill transaction log.

merge_performance_tips.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
-- Performance optimization techniques
 
-- 1. Ensure index exists on join column
CREATE INDEX idx_products_sku ON products (sku);
 
-- 2. Batch large operations
-- Instead of one MERGE with 1 million rows, do 10 batches of 100K
DECLARE @BatchSize INT = 100000;
DECLARE @Offset INT = 0;
 
WHILE EXISTS (
    SELECT 1 FROM staging_products 
    WHERE row_num > @Offset AND row_num <= @Offset + @BatchSize
)
BEGIN
    MERGE INTO products AS target
    USING (
        SELECT * FROM staging_products 
        WHERE row_num > @Offset AND row_num <= @Offset + @BatchSize
    ) AS source
    ON target.product_id = source.product_id
    WHEN MATCHED THEN UPDATE SET target.name = source.name
    WHEN NOT MATCHED THEN INSERT (product_id, name) VALUES (source.product_id, source.name);
    
    SET @Offset = @Offset + @BatchSize;
END
 
-- 3. Filter source to only changed rows
MERGE INTO products AS target
USING (
    SELECT * FROM staging_products s
    WHERE NOT EXISTS (
        SELECT 1 FROM products p
        WHERE p.product_id = s.product_id
          AND p.name = s.name
          AND p.price = s.price
    )
) AS source
ON target.product_id = source.product_id
-- Only processes rows that actually differ
 
-- 4. Consider simple INSERT + UPDATE for massive data
-- Sometimes separate statements outperform MERGE
BEGIN TRANSACTION;
    UPDATE products SET name = s.name, price = s.price
    FROM products p
    JOIN staging_products s ON p.product_id = s.product_id;
    
    INSERT INTO products (product_id, name, price)
    SELECT s.product_id, s.name, s.price
    FROM staging_products s
    LEFT JOIN products p ON s.product_id = p.product_id
    WHERE p.product_id IS NULL;
COMMIT;

MERGE Not Always Optimal

Despite its elegance, MERGE isn't always the fastest approach. For extremely large datasets or when you know most rows will be updates (or most will be inserts), separate statements may perform better. Benchmark both approaches for your specific workload.

Summary: Mastering MERGE

The MERGE statement unifies INSERT, UPDATE, and DELETE into a single atomic operation, solving the classic upsert problem elegantly and efficiently.

Key Takeaways

•MERGE is atomic — All matching and non-matching actions execute in one transaction, eliminating race conditions.
•Syntax varies by database — PostgreSQL uses ON CONFLICT, MySQL uses ON DUPLICATE KEY, SQL Server/Oracle use standard MERGE.
•Multiple WHEN clauses enable complex logic — Different conditions can trigger different actions for matched rows.
•OUTPUT/RETURNING captures results — Track what actions were taken for auditing and verification.
•Common patterns cover most use cases — Insert-or-update-all, increment counters, ignore duplicates, conditional updates.
•Performance requires attention — Index join columns, batch large operations, consider separate statements for extreme scale.

What's Next:

Having covered individual DML statements, the next page examines bulk operations—efficient techniques for handling large data volumes through batch processing, bulk inserts, and ETL optimization.

Page Complete

You now have comprehensive knowledge of the MERGE/UPSERT pattern across major database systems. From basic upserts to complex conditional synchronization with audit trails, you can efficiently reconcile external data with your database tables.