MERGE INTO

Performs an atomic upsert by combining INSERT, UPDATE, and DELETE actions in a single transaction, joining a source dataset against a target Delta table on a specified condition.

Category: dml

Syntax

MERGE [WITH SCHEMA EVOLUTION] INTO <target> [AS <alias>]
USING <source> [AS <alias>] ON <condition>
[WHEN MATCHED [AND <condition>] THEN UPDATE SET ... | DELETE]
[WHEN NOT MATCHED [AND <condition>] THEN INSERT ...]
[WHEN NOT MATCHED BY SOURCE [AND <condition>] THEN UPDATE SET ... | DELETE]

Description

## Overview MERGE INTO performs an atomic upsert operation that combines INSERT, UPDATE, and DELETE actions in a single transaction. It joins a source dataset against a target Delta table using the specified ON condition and applies WHEN clauses to determine the action for each row. ## Behavior ### Clause Evaluation WHEN clauses are evaluated top to bottom for each row. The first clause whose condition matches determines the action. A row is processed by at most one clause. Three types of WHEN clauses are supported: - **WHEN MATCHED:** Applies to rows where a source row matches a target row on the ON condition. The action can be UPDATE SET (with explicit assignments or `*` for all columns) or DELETE. - **WHEN NOT MATCHED:** Applies to source rows that have no match in the target. The action is INSERT with explicit column/value pairs or `*` for all columns. - **WHEN NOT MATCHED BY SOURCE:** Applies to target rows that have no match in the source. The action can be UPDATE SET or DELETE. This is useful for detecting and removing stale rows. Each clause may include an additional AND condition that further filters which rows the clause applies to. Multiple WHEN MATCHED clauses with different AND conditions enable routing rows to different actions based on data values. ### Execution 1. The source data is evaluated (table scan, subquery execution, or VALUES materialization). 2. The engine validates column references, type compatibility, and assignment correctness. 3. A join is computed between the source and target on the ON condition. 4. For each joined pair (or unmatched row), the appropriate WHEN clause action is applied. 5. The transaction log records the resulting file additions and removals as a single atomic commit. ### Validation Before execution, MERGE performs several validation checks: - Column prefixes in the ON condition and SET clauses are verified against target and source aliases. - Column names referenced in assignments must exist in their respective schemas. - Type compatibility is checked for join columns. - Partition columns cannot appear in UPDATE SET assignments (they are immutable). - UPDATE SET * is rejected if the source schema contains partition columns, since updating partition values is not allowed. ### Schema Evolution When MERGE WITH SCHEMA EVOLUTION is specified: - Columns present in the source but absent from the target are added automatically as nullable fields. - Existing columns may be widened to a broader compatible type (e.g., INT to BIGINT, FLOAT to DOUBLE). - Partition columns cannot be evolved or widened. - The evolved schema is applied atomically as part of the same transaction. ### UPDATE SET * and INSERT * The `*` shorthand maps all source columns to target columns by name. For UPDATE SET *, every non-partition column in the source is assigned to the corresponding target column. For INSERT *, all source columns are inserted into matching target columns. ## Access Control | Privilege | Object | Notes | |-----------|--------|-------| | write | Target table | Required. Permits INSERT, UPDATE, and DELETE on the target. | | read | Source table | Required when the source is a table (not needed for VALUES). | ## Compatibility MERGE INTO follows the SQL:2003 standard syntax. DeltaForge extends it with WHEN NOT MATCHED BY SOURCE, UPDATE SET *, INSERT *, the WITH SCHEMA EVOLUTION modifier, and subquery/VALUES sources. Column prefix validation and partition immutability checks are DeltaForge-specific safeguards.

Parameters

NameTypeDescription
target_tableSpecifies the name or path of the target Delta table to merge into. The table must already be registered via CREATE DELTA TABLE or OPEN DELTA TABLE. Supports fully qualified names (zone.schema.table).
target_aliasSpecifies an alias for the target table, used to qualify column references in ON conditions and SET/INSERT clauses. Assigned with or without the AS keyword.
sourceSpecifies the source data. Accepts a table name, a subquery in parentheses, or a VALUES clause in parentheses. When a table name is provided, the engine generates SELECT * FROM that table internally. When a subquery or VALUES clause is provided, it is wrapped in a SELECT * FROM (...) expression.
source_aliasSpecifies an alias for the source data, used to qualify column references. When using a subquery source without an explicit alias, the alias defaults to 'source'.
conditionSpecifies the join condition in the ON clause. Determines how source rows are matched to target rows. Typically an equality predicate on key columns (e.g., target.id = source.id). Supports compound conditions with AND/OR and qualified column references.
schema_evolutionSpecifies whether schema evolution is enabled. When set (MERGE WITH SCHEMA EVOLUTION), columns present in the source but absent from the target are automatically added as nullable fields. Existing columns may be widened to a compatible broader type (e.g., INT to BIGINT). Partition columns cannot be evolved.

Examples

-- Basic upsert: update existing rows, insert new ones
MERGE INTO warehouse.sales.customers AS target
USING warehouse.staging.new_customers AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET name = source.name,
               email = source.email,
               updated_at = source.updated_at
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, created_at, updated_at)
    VALUES (source.customer_id, source.name, source.email,
            source.created_at, source.updated_at);
-- Conditional MERGE: update active, delete discontinued, insert new
MERGE INTO warehouse.inventory.products AS target
USING warehouse.staging.product_feed AS source
ON target.sku = source.sku
WHEN MATCHED AND source.qty > 0 THEN
    UPDATE SET price = source.price,
               qty = source.qty,
               last_updated = source.last_updated
WHEN MATCHED AND source.qty = 0 THEN
    DELETE
WHEN NOT MATCHED THEN
    INSERT (id, sku, name, category, price, qty, supplier, last_updated)
    VALUES (source.id, source.sku, source.name, source.category,
            source.price, source.qty, source.supplier, source.last_updated);
-- MERGE with UPDATE SET * and INSERT * for matching schemas
MERGE INTO warehouse.sales.orders AS target
USING warehouse.staging.order_updates AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN
    UPDATE SET *
WHEN NOT MATCHED THEN
    INSERT *;
-- MERGE using a subquery as source
MERGE INTO warehouse.sales.daily_totals AS target
USING (
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM warehouse.sales.orders
    WHERE order_date = CURRENT_DATE
    GROUP BY region
) AS source
ON target.region = source.region
WHEN MATCHED THEN
    UPDATE SET total_amount = source.total_amount,
               order_count = source.order_count
WHEN NOT MATCHED THEN
    INSERT (region, total_amount, order_count)
    VALUES (source.region, source.total_amount, source.order_count);
-- MERGE using VALUES as source (single-row upsert)
MERGE INTO warehouse.config.settings AS target
USING (VALUES ('max_retries', '5', CURRENT_TIMESTAMP)) AS source(key, value, updated_at)
ON target.key = source.key
WHEN MATCHED THEN
    UPDATE SET value = source.value,
               updated_at = source.updated_at
WHEN NOT MATCHED THEN
    INSERT (key, value, updated_at)
    VALUES (source.key, source.value, source.updated_at);
-- MERGE with WHEN NOT MATCHED BY SOURCE to clean stale rows
MERGE INTO warehouse.inventory.products AS target
USING warehouse.staging.full_catalog AS source
ON target.sku = source.sku
WHEN MATCHED THEN
    UPDATE SET price = source.price, qty = source.qty
WHEN NOT MATCHED THEN
    INSERT (sku, name, price, qty)
    VALUES (source.sku, source.name, source.price, source.qty)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →