UPDATED

Returns the post-image of every row whose key exists at both versions but whose non-key column values differ, by diffing two time-travel snapshots. One of the DELETED / UPDATED / INSERTED change-extraction trio.

Category: time-travelReturns: SETOF row-of(table)Dialect: DeltaForgePrivilege: readDeltaForge extension

Syntax

UPDATED(table_name TEXT [, from_version BIGINT, to_version BIGINT,] [key_array TEXT[]])

Description

## Overview Returns the post-image of every row whose key exists at both `from_version` and `to_version` but whose non-key column values differ. Implemented as a snapshot diff: an INNER JOIN between two time-travel snapshots of the same table on the named key columns, followed by a NULL-aware `IS DISTINCT FROM` filter across the non-key columns. The result schema matches the source table's row schema; callers receive full source-table rows reflecting the *current* (post-image) values. UPDATED is one of the **DELETED / UPDATED / INSERTED** change-extraction trio. Together they let you partition any commit (or version range) into deletes, updates, and inserts using the same KEY columns; each operator surfaces exactly one slice and the three slices are non-overlapping. This is the standard primitive for incremental ETL on Delta tables. Unlike CDF-based change extraction, UPDATED works on every Delta table without requiring `delta.enableChangeDataFeed = true` to have been set at write time. ## Behavior - The 1-arg form `UPDATED('schema.table')` returns updates from the most recent commit. Requires a declared PRIMARY KEY constraint on the source table. - The 2-arg form `UPDATED('schema.table', ARRAY['key'])` returns updates from the most recent commit using an explicit key. - The 3-arg form `UPDATED('schema.table', from, to)` returns updates between versions [from, to]. Requires a declared PRIMARY KEY. - The 4-arg form `UPDATED('schema.table', from, to, ARRAY['key'])` returns updates between versions [from, to] using an explicit key. - Identity resolution: explicit KEY argument > PRIMARY KEY constraint. There is no fallback when neither is supplied; the function errors at planning time. - Returns the V_to (post-image) row only. The pre-image is not exposed through UPDATED; if you need both images, run a parallel query at `VERSION AS OF :from` joined on the same key. - Inserts and deletes are excluded. Use `DELETED` for deletes; new inserts can be obtained from `INSERTED`. ## NULL-aware comparison The non-key column comparison uses SQL's `IS DISTINCT FROM` operator, which treats `(NULL, NULL)` as equal and `(NULL, value)` as different. This avoids the common pitfall of `<>` returning UNKNOWN on NULLs and silently dropping rows whose only change was NULL → value or value → NULL. ## Versioning semantics Version bounds are inclusive on both ends. `UPDATED('t', 5, 5, ARRAY['id'])` returns post-images of rows updated by the single commit that produced version 5 (i.e. present at version 4 and version 5 with the same key but different non-key columns). ## Composition with the trio DELETED, UPDATED, and INSERTED share an identical surface and identical identity-resolution rules. Used together on the same version range and the same KEY they cleanly partition every change in the commit window: - A row whose key exists at `from` and not at `to` → DELETED. - A row whose key exists at both versions, with at least one non-key column differing → UPDATED. - A row whose key exists at `to` and not at `from` → INSERTED. - A row whose key exists at both versions with identical non-key columns → no operator returns it. This property is exploited by the canonical incremental-ETL template: run all three against `(:last_seen, :current)`, route each result set to the corresponding sink action (DELETE, MERGE/UPSERT, INSERT), advance `:last_seen` to `:current`, repeat. ## Composition The TVF is a regular relation: it pushes projection (column subset), filter (predicate), and limit through to both child snapshot scans, so it composes naturally inside a wider query: `SELECT region, count(*) FROM UPDATED('t', 5, 10, ARRAY['id']) WHERE region = 'EU' GROUP BY region` reads only EU files at both versions. ## Compatibility DeltaForge-native. The semantic equivalent in standard SQL is `SELECT v_to.* FROM t VERSION AS OF :from v_from JOIN t VERSION AS OF :to v_to ON v_from.id = v_to.id WHERE (v_from.col1 IS DISTINCT FROM v_to.col1) OR (v_from.col2 IS DISTINCT FROM v_to.col2) OR ...`. UPDATED is the readable shorthand. ## Access Control | Privilege | Object | Notes | |-----------|--------|-------| | read | Table | Required on the source table. Two snapshots of the same table are read; a single read grant suffices. |

Parameters

NameTypeDescription
table_nameSpecifies the source Delta table as a string literal. Accepts a one-, two-, or three-part name (table, schema.table, or zone.schema.table). The table must be a Delta table; external tables and in-memory tables are rejected because they have no version history.
from_versionInclusive lower-bound version of the range to diff. The function compares the snapshot at this version against the snapshot at to_version. When omitted (1- or 2-arg forms), defaults to latest - 1.
to_versionInclusive upper-bound version of the range to diff. Must be greater than or equal to from_version. When omitted (1- or 2-arg forms), defaults to the table's current latest version.
key_arrayIdentity columns used to match rows across the two versions. Must be supplied either as an explicit ARRAY[...] argument or via a PRIMARY KEY constraint on the source table. Without identity there is no notion of 'the same row across versions' and the function errors at planning time.

Examples

-- Audit which customers had attribute changes since the last checkpoint
SELECT * FROM UPDATED('silver.customer', :last_seen, :current, ARRAY['customer_id']);
-- Latest-commit shorthand on a table with a PRIMARY KEY
SELECT * FROM UPDATED('silver.customer');
-- Same shorthand but with an explicit key (when no PRIMARY KEY is declared)
SELECT * FROM UPDATED('silver.customer', ARRAY['customer_id']);
-- ETL pattern: load only changed rows into a downstream stage
INSERT INTO warehouse.stg_customer_changes
SELECT * FROM UPDATED('silver.customer', :last_seen, :current, ARRAY['customer_id']);
-- Composite key spanning two columns
SELECT * FROM UPDATED('silver.order_line', 50, 60, ARRAY['order_id', 'line_no']);
-- Compose with JOIN to attach context that is not in the source row
SELECT u.customer_id, u.email_new, r.region
  FROM UPDATED('silver.customer', 100, 110, ARRAY['customer_id']) u
  JOIN silver.region r ON u.region_id = r.id;

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →