Inserts new rows into a Delta table from a VALUES clause or SELECT query, with support for append, overwrite, and partition-level overwrite modes.
INSERT [INTO | OVERWRITE] <table> [(<columns>)] VALUES (...) | SELECT ...
[RETURNING <columns>]
## Overview Inserts new rows into a Delta table. Supports three write modes: append (default), full overwrite, and partition-level overwrite. The source data can come from a VALUES clause with literal row expressions or from any valid SELECT query, including joins, aggregations, and subqueries. ## Behavior ### Write Modes - **INSERT INTO (Append):** Writes new Parquet data files and records `add` actions in the transaction log. Existing data files are not modified. This is the default mode. - **INSERT OVERWRITE:** Atomically replaces all existing data. The transaction log records `remove` actions for all previous data files and `add` actions for the new files. The table schema and properties are preserved. - **INSERT OVERWRITE PARTITIONS:** Replaces only the partitions that appear in the new data. Partitions not present in the source data are left untouched. This is useful for incremental partition refreshes. ### Column Mapping When an explicit column list is provided, values are mapped by position to the listed columns. Columns not in the list receive NULL values, unless they have a DEFAULT value or are GENERATED ALWAYS columns. When no column list is provided, values are mapped by position to the full table schema. ### Identity Columns If the target table contains IDENTITY columns (GENERATED ALWAYS AS IDENTITY or GENERATED BY DEFAULT AS IDENTITY), the engine detects them automatically. For ALWAYS mode, the engine auto-generates sequential values regardless of whether the column appears in the source. For BY DEFAULT mode, the engine generates values only when the source does not supply them. ### Generated Columns Generated columns (GENERATED ALWAYS AS expressions) are computed automatically after type coercion. The source data should omit generated columns; if the source has fewer columns than the table schema and the missing columns are all generated, the engine fills them in automatically. ### Type Coercion Source values are coerced to match the target table schema. Supported coercions include integer widening (INT to BIGINT), string-to-date/timestamp parsing, and numeric precision adjustments. Incompatible types produce an error. ### Auto-Compaction If the table property `delta.autoOptimize.autoCompact` is set to `true`, the engine runs a compaction pass after the INSERT completes, combining small output files into larger, more efficient files. ## Access Control | Privilege | Object | Notes | |-----------|--------|-------| | write | Table | Required. Permits INSERT, UPDATE, and DELETE on the table. | ## Compatibility INSERT INTO and INSERT OVERWRITE follow standard SQL semantics. The OVERWRITE PARTITIONS mode and automatic identity/generated column handling are DeltaForge extensions. The RETURNING clause is parsed for forward compatibility but is not yet evaluated.
| Name | Type | Description |
|---|---|---|
table | Specifies the name or path of the target Delta table. The table must already be registered via CREATE DELTA TABLE or OPEN DELTA TABLE. Supports fully qualified names (zone.schema.table). | |
mode | Specifies the write mode. INSERT INTO (the default) appends rows without modifying existing data. INSERT OVERWRITE replaces all existing data in the table with the new rows. INSERT OVERWRITE with a PARTITIONED hint replaces only the partitions present in the new data, leaving other partitions intact. | |
columns | Specifies an explicit list of target columns in parentheses after the table name. When provided, the source values are mapped by position to the listed columns. Columns not listed receive NULL values (or their default/generated values if defined). When omitted, source values map by position to all columns in the table schema. | |
source | Specifies the data source. Accepts a VALUES clause with one or more row expressions, a SELECT query, or a combination such as SELECT * FROM (VALUES ...) AS t(col1, col2). The source column count must match either the explicit column list or the full table schema (excluding generated columns). | |
columns | Specifies an explicit list of target columns in parentheses after the table name. When provided, the source values are mapped by position to the listed columns. Columns not listed receive NULL values (or their default/generated values if defined). When omitted, source values map by position to all columns in the table schema. |
-- Insert a single row with explicit column list
INSERT INTO warehouse.sales.orders (id, product, amount)
VALUES (1, 'Widget', 9.99);
-- Insert multiple rows using VALUES
INSERT INTO warehouse.inventory.products (id, name, category, price)
VALUES
(101, 'Keyboard', 'Electronics', 49.99),
(102, 'Mouse', 'Electronics', 29.99),
(103, 'Monitor', 'Electronics', 299.99);
-- Insert from a SELECT query (cross-table copy)
INSERT INTO gold.reporting.monthly_summary
SELECT region, SUM(amount) AS total, COUNT(*) AS order_count
FROM warehouse.sales.orders
WHERE order_date >= '2024-01-01'
GROUP BY region;
-- Insert using SELECT * FROM VALUES with column aliases
INSERT INTO warehouse.sales.orders
SELECT * FROM (VALUES
(21, 'Webcam', 'Electronics', 69.99, 80, true),
(22, 'Footrest', 'Furniture', 49.99, 55, true)
) AS t(id, name, category, price, stock, is_active);
-- Overwrite all data in a table (full replacement)
INSERT OVERWRITE warehouse.staging.daily_snapshot
SELECT * FROM warehouse.raw.events
WHERE event_date = CURRENT_DATE;
-- Insert into a table with identity columns (auto-generated)
-- The id column is omitted; Delta auto-generates identity values
INSERT INTO warehouse.sales.orders (product, amount, status)
VALUES ('Cable', 12.99, 'pending');