PARALLEL INSERT INTO

Inserts SELECT results into an existing Delta table by splitting the source on a numeric key column and running multiple writers concurrently.

Category: dmlDeltaForge extension

Syntax

PARALLEL INSERT INTO <table>
  <select_query>
  SPLIT BY <column>
  [BATCH SIZE <n>]
  [THREADS <n>]

Description

## Overview PARALLEL INSERT INTO appends rows to an existing Delta table using the same range-partitioning strategy as PARALLEL CREATE TABLE AS. The source is split into contiguous value ranges on the SPLIT BY column and inserted concurrently. ## Behavior - The runtime computes MIN/MAX of the split column on the source, divides the value space into ranges, and emits one INSERT subquery per range plus a NULL-batch INSERT for rows with NULL in the split column. - Batches execute concurrently up to THREADS (default 8). Each batch is its own commit on the target Delta table; the operation is not atomic across batches. - All batches contribute to the same target table. Schema enforcement and constraints apply per batch. - Empty source: returns 'source is empty: no rows inserted' with no commits. - All-NULL split column: only the NULL batch executes. ## Restrictions - SPLIT BY is mandatory. The column must be numeric. - The target table must already exist; PARALLEL INSERT does not auto-create. - Each batch produces an independent commit on the target. There is no two-phase commit or rollback across batches: a partial failure leaves rows from successful batches in the table. ## Result format Returns a concurrent-execution summary table with one row per batch. Columns: `index`, `status`, `sql`, `message`, `rows_affected`, `elapsed_ms`, `error`. ## Access Control No specific privilege is required beyond what a normal INSERT needs. ## Compatibility DeltaForge extension.

Parameters

NameTypeDescription
tableSpecifies the target Delta table. Must already exist with a compatible schema.
sourceSpecifies the source SELECT query.
split_bySpecifies the column used to range-partition the source. Must be numeric.
batch_sizeApproximate rows per range batch.
threadsMaximum concurrent writers.

Examples

-- Parallel append from a staging table
PARALLEL INSERT INTO gold.facts
  SELECT * FROM silver.staging
  SPLIT BY id
  THREADS 8;
-- Restrict to a specific date range and split by day_id
PARALLEL INSERT INTO gold.facts
  SELECT *
  FROM silver.staging
  WHERE day_id BETWEEN 20260101 AND 20260131
  SPLIT BY day_id
  BATCH SIZE 25000
  THREADS 4;

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →