Excel Options Testbench

Validate each Excel connector option (sheet_name, has_header, skip_rows, max_rows, file_filter) with adversarially-shaped workbooks so misparses fail loudly.

Category: excel

Syntax

-- Zone + schema
CREATE ZONE IF NOT EXISTS {{zone_name}} TYPE EXTERNAL;
CREATE SCHEMA IF NOT EXISTS {{zone_name}}.excel_opts;

-- sheet_name, target a named sheet in a multi-sheet workbook
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.excel_opts.opt_sheet_name
USING EXCEL LOCATION '{{data_path}}'
OPTIONS (file_filter = '01_multi_sheet*', sheet_name = 'Details');

-- has_header, read a sheet that has no header row
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.excel_opts.opt_no_header
USING EXCEL LOCATION '{{data_path}}'
OPTIONS (file_filter = '02_no_header*', has_header = 'false');

-- skip_rows, skip N data rows after the header
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.excel_opts.opt_skip_rows
USING EXCEL LOCATION '{{data_path}}'
OPTIONS (file_filter = '03_skip_rows*', skip_rows = '3', has_header = 'true');

-- max_rows, cap the per-file row count
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.excel_opts.opt_max_rows
USING EXCEL LOCATION '{{data_path}}'
OPTIONS (file_filter = '04_max_rows*', sheet_name = 'Inventory', max_rows = '5');

-- file_filter, glob-match filenames inside LOCATION
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.excel_opts.opt_file_filter
USING EXCEL LOCATION '{{data_path}}'
OPTIONS (file_filter = '05_target*', sheet_name = 'Data');

-- Verify each option took effect
ASSERT ROW_COUNT = 5
SELECT id, name, score FROM {{zone_name}}.excel_opts.opt_sheet_name ORDER BY id;

ASSERT ROW_COUNT = 5
SELECT column_0, column_1, column_2, column_3 FROM {{zone_name}}.excel_opts.opt_no_header ORDER BY column_0;

ASSERT ROW_COUNT = 5
ASSERT VALUE project = 'Delta' WHERE id = 4
SELECT id, project, hours, status FROM {{zone_name}}.excel_opts.opt_skip_rows ORDER BY id;

ASSERT ROW_COUNT = 5
SELECT * FROM {{zone_name}}.excel_opts.opt_max_rows;

ASSERT ROW_COUNT = 3
ASSERT VALUE value = 'CORRECT' WHERE id = 1
SELECT id, value FROM {{zone_name}}.excel_opts.opt_file_filter ORDER BY id;

-- Cleanup
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.excel_opts.opt_sheet_name WITH FILES;
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.excel_opts.opt_no_header WITH FILES;
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.excel_opts.opt_skip_rows WITH FILES;
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.excel_opts.opt_max_rows WITH FILES;
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.excel_opts.opt_file_filter WITH FILES;

Description

## When to Use Reach for this demo when you need a ground-truth check that the Excel reader is honouring every option you pass in OPTIONS(...). Each of the five external tables is paired with a workbook whose contents are deliberately shaped so that an ignored option produces an obviously wrong row count or column set, making silent misconfiguration impossible. ## What You Will Learn 1. How `sheet_name` selects a specific sheet from a multi-sheet workbook and why omitting it falls back to the first sheet. 2. How `has_header = 'false'` produces `column_0`..`column_N` identifiers instead of consuming row 1 as a header. 3. How `skip_rows = '3'` drops the first three data rows after the header, not three rows starting from the top. 4. How `max_rows = '5'` caps per-file reads (handy for sampling large workbooks). 5. How `file_filter` glob-matches filenames inside LOCATION directories so decoy files are excluded. 6. How to write PASS/FAIL verification queries with `UNION ALL` + `CASE` expressions backed by ASSERT. ## Prerequisites - A zone (the setup creates an EXTERNAL zone if missing). - Read access to a directory containing the six adversarial XLSX fixtures shipped with this demo. - Familiarity with `CREATE EXTERNAL TABLE ... USING EXCEL` and the ASSERT framework.

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →