CREATE DELTA TABLE

Creates a new Delta table with an explicit schema, optional partitioning, liquid clustering, table properties, and an optional LOCATION. Supports CREATE TABLE AS SELECT (CTAS).

Category: ddlDeltaForge extension

Syntax

CREATE DELTA TABLE [IF NOT EXISTS] <table> (
  <col1> <type1> [NOT NULL] [DEFAULT <val>] [COMMENT '<desc>'],
  ...
)
[PARTITIONED BY (<columns>)]
[CLUSTER BY (<columns>)]
[LOCATION '<path>']
[TBLPROPERTIES (<key>=<val>, ...)]
[COMMENT '<description>']
[AS SELECT ...]

Description

## Overview Creates a new Delta table at the specified LOCATION. The table schema, partitioning, clustering, and properties are committed as the initial version (version 0) of the Delta log. The table is immediately registered with the DataFusion session for use in subsequent SQL statements. ## Behavior - LOCATION is required. The path determines where the `_delta_log/` directory and Parquet data files are stored. - If IF NOT EXISTS is specified and a Delta log already exists at the LOCATION, the table is registered for the current session without creating a new table or modifying existing data. - All column names are normalized to lowercase to prevent case-sensitivity issues between SQL identifiers and Parquet column names. - For CTAS, the schema is inferred from the query plan. Column names from the query are normalized to lowercase. Data is written using a streaming pipeline: local paths use DirectFileWriter for maximum throughput; cloud paths use StreamingWriter via the FileSystemClient. - CTAS defaults to file sizes of 128-153 MB to avoid the small-file problem. Override with WITH FILE SIZE or WITH BATCH SIZE. - Partition expressions (e.g., CAST(timestamp AS DATE)) create hidden generated partition columns with `delta.generationExpression` metadata. - Clustering columns are normalized to lowercase. Liquid clustering and PARTITIONED BY are mutually exclusive. - If the table name is 3-part qualified (zone.schema.table), it is automatically persisted to the SQLite catalog. - After creation, the schema observer is notified so column metadata appears in the catalog immediately. ## Compatibility CREATE DELTA TABLE follows the open Delta Lake protocol. Tables created by this command are readable by any Delta-compatible engine. Advanced features (identity columns, generated columns, deletion vectors) require the appropriate protocol versions.

Parameters

NameTypeDescription
tableTable name, optionally fully qualified as zone.schema.table. Three-part names are persisted to the catalog automatically.
columnsColumn definitions with types and constraints.
partition_columnsPartition columns.
clustering_columnsClustering columns (liquid clustering).
locationStorage location path.
propertiesTable properties.
commentTable description.
if_not_existsDon't error if table exists.
as_queryCreate Table As Select query.
batch_sizeRows per parquet file in CTAS.
file_size_rangeFile size range in MB for CTAS.

Examples

-- Create a simple table
CREATE DELTA TABLE warehouse.sales.products (
  product_id BIGINT NOT NULL,
  name STRING NOT NULL,
  price DECIMAL(10,2) CHECK (price >= 0),
  category STRING
)
LOCATION '/data/warehouse/sales/products';
-- Create a table with identity and generated columns
CREATE DELTA TABLE warehouse.sales.orders (
  id BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1) NOT NULL,
  order_date DATE NOT NULL,
  total DECIMAL(12,2),
  year INT GENERATED ALWAYS AS (YEAR(order_date))
)
PARTITIONED BY (year)
LOCATION '/data/warehouse/sales/orders';
-- Create a table with liquid clustering
CREATE DELTA TABLE warehouse.sales.events (
  event_id BIGINT NOT NULL,
  event_type STRING,
  event_time TIMESTAMP NOT NULL,
  payload STRING
)
CLUSTER BY (event_type, event_time)
LOCATION '/data/warehouse/sales/events'
COMMENT 'Raw event stream';
-- CTAS: create and populate from a query
CREATE DELTA TABLE warehouse.sales.monthly_summary
LOCATION '/data/warehouse/sales/monthly_summary'
AS SELECT
  DATE_TRUNC('month', order_date) AS month,
  COUNT(*) AS order_count,
  SUM(total) AS revenue
FROM warehouse.sales.orders
GROUP BY 1;
-- Create with table properties and file size control
CREATE DELTA TABLE IF NOT EXISTS warehouse.logs.access_log (
  ts TIMESTAMP NOT NULL,
  method STRING,
  path STRING,
  status INT
)
LOCATION 's3://my-bucket/logs/access'
TBLPROPERTIES (
  'delta.enableDeletionVectors' = 'true',
  'delta.autoOptimize.optimizeWrite' = 'true'
);

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →