Read a flat JSON array file into an external table with include_paths, type inference, and file_metadata.
-- ============================================================================
-- Setup: zone, schema, and external table over customers.json
-- ============================================================================
CREATE ZONE IF NOT EXISTS {{zone_name}} TYPE EXTERNAL
COMMENT 'External tables — demo datasets and file-backed data';
CREATE SCHEMA IF NOT EXISTS {{zone_name}}.json_demos
COMMENT 'JSON-backed external tables';
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.json_demos.customers
USING JSON
LOCATION '{{data_path}}'
OPTIONS (
json_flatten_config = '{
"root_path": "$",
"include_paths": [
"$.id",
"$.email",
"$.first",
"$.last",
"$.company",
"$.created_at",
"$.country"
],
"max_depth": 1,
"separator": "_",
"default_array_handling": "to_json",
"infer_types": true
}',
file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);
GRANT ADMIN ON TABLE {{zone_name}}.json_demos.customers TO USER {{current_user}};
-- ============================================================================
-- Queries: verify row count, column mapping, type inference, metadata
-- ============================================================================
-- Total row count (array length)
ASSERT ROW_COUNT = 200
SELECT * FROM {{zone_name}}.json_demos.customers;
-- Browse with auto-detected column names
ASSERT ROW_COUNT = 10
ASSERT VALUE first = 'Torrey' WHERE id = '1'
ASSERT VALUE country = 'Switzerland' WHERE id = '1'
SELECT id, email, first, last, company, created_at, country
FROM {{zone_name}}.json_demos.customers
ORDER BY id
LIMIT 10;
-- Type inference: created_at is a real TIMESTAMP, so MIN/MAX work
ASSERT ROW_COUNT = 1
ASSERT VALUE earliest_signup >= '2014-04-02'
ASSERT VALUE latest_signup < '2015-03-22'
SELECT MIN(created_at) AS earliest_signup,
MAX(created_at) AS latest_signup
FROM {{zone_name}}.json_demos.customers;
-- Country distribution (top 10)
ASSERT ROW_COUNT = 10
ASSERT VALUE customer_count = 4 WHERE country = 'Congo'
SELECT country, COUNT(*) AS customer_count
FROM {{zone_name}}.json_demos.customers
GROUP BY country
ORDER BY customer_count DESC
LIMIT 10;
-- file_metadata: df_file_name + row numbering 1..200
ASSERT ROW_COUNT = 1
ASSERT VALUE df_file_name LIKE '%customers%'
ASSERT VALUE first_row = 1
ASSERT VALUE last_row = 200
SELECT df_file_name,
MIN(df_row_number) AS first_row,
MAX(df_row_number) AS last_row
FROM {{zone_name}}.json_demos.customers
GROUP BY df_file_name;
-- Required-field integrity
ASSERT VALUE null_row_count = 0
SELECT COUNT(*) AS null_row_count
FROM {{zone_name}}.json_demos.customers
WHERE id IS NULL OR email IS NULL
OR first IS NULL OR last IS NULL
OR country IS NULL;
-- ============================================================================
-- Cleanup
-- ============================================================================
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.json_demos.customers WITH FILES;
DROP SCHEMA IF EXISTS {{zone_name}}.json_demos;
DROP ZONE IF EXISTS {{zone_name}};
## When to Use Use this demo when you have a single JSON file containing a top-level array of flat objects (a classic CRM / API export shape) and you want a queryable external table with proper types and source-file provenance. It is the canonical starting point for JSON in DeltaForge: no nesting, no explode, just `json_flatten_config` with `include_paths`, `infer_types`, and `file_metadata` on top of `CREATE EXTERNAL TABLE ... USING JSON`. ## What You Will Learn 1. How to register a JSON array file (`[{...}, {...}]`) as an external table. 2. How `json_flatten_config.root_path = "$"` tells DeltaForge to emit one row per array element. 3. How `include_paths` restricts the projection to just the fields you need. 4. How `max_depth: 1` skips any (absent) nested structures and keeps the table flat. 5. How `infer_types: true` promotes ISO-8601 strings (e.g. `2014-12-25T04:06:27.981Z`) to real `TIMESTAMP` so `MIN`/`MAX` and range filters work. 6. How auto-detected column names drop the `$.` prefix and lowercase the leaf (so `$.first` becomes `first`). 7. How `file_metadata` injects `df_file_name` and `df_row_number` for per-row provenance. 8. How `ASSERT ROW_COUNT` and `ASSERT VALUE ... WHERE ...` build a self-verifying demo. ## Prerequisites A running DeltaForge catalog, permission to create zones and schemas, and the demo data file `customers.json` (200 records) in the directory passed as `{{data_path}}`. No Delta table, Iceberg catalog, or object-store credential is needed — the file is read directly from the local path.