Repeating Segments & Loops: Indexed / Concatenate / ToJson Modes

Three external tables over the same 14 X12 files show how DeltaForge materializes multi-occurrence segments (N1 party loops, PO1 line items): indexed mode fans each occurrence into numbered columns, concatenate mode pipe-delimits values in the default columns, to_json mode encodes them as JSON arrays.

Category: edi

Syntax

-- Mode 1: INDEXED, one column per occurrence
CREATE EXTERNAL TABLE IF NOT EXISTS external.edi_demos.repeating_indexed
USING EDI
LOCATION '/path/to/edi/*.edi'
OPTIONS (
    edi_config = '{
        "ediFormat": "x12",
        "materialized_paths": [
            "n1_1", "n1_2",
            "po1_1", "po1_2", "po1_3", "po1_4"
        ],
        "max_repeating_segments": 6,
        "repeating_segment_mode": "indexed"
    }',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);

-- Mode 2: CONCATENATE, pipe-delimited single column
CREATE EXTERNAL TABLE IF NOT EXISTS external.edi_demos.repeating_concat
USING EDI
LOCATION '/path/to/edi/*.edi'
OPTIONS (
    edi_config = '{
        "ediFormat": "x12",
        "materialized_paths": [
            "n1_1", "n1_2",
            "po1_1", "po1_2", "po1_3", "po1_4"
        ],
        "max_repeating_segments": 6,
        "repeating_segment_mode": "concatenate"
    }',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);

-- Mode 3: TO_JSON: JSON array per column
CREATE EXTERNAL TABLE IF NOT EXISTS external.edi_demos.repeating_json
USING EDI
LOCATION '/path/to/edi/*.edi'
OPTIONS (
    edi_config = '{
        "ediFormat": "x12",
        "materialized_paths": [
            "n1_1", "n1_2",
            "po1_1", "po1_2", "po1_3", "po1_4"
        ],
        "max_repeating_segments": 6,
        "repeating_segment_mode": "to_json"
    }',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);

-- Query A, first two N1 parties via INDEXED columns
SELECT df_file_name, n1_1_1, n1_1_2, n1_2_1, n1_2_2
FROM external.edi_demos.repeating_indexed
ORDER BY df_file_name;

-- Query B, all party names concatenated in one cell
SELECT df_file_name, n1_2 AS party_names_piped
FROM external.edi_demos.repeating_concat
ORDER BY df_file_name;

-- Query C, party count per transaction via JSON
SELECT
    df_file_name,
    json_array_length(n1_2::json) AS party_count
FROM external.edi_demos.repeating_json
ORDER BY party_count DESC;

Description

## When to Use Use this demo when your X12 or EDIFACT segments repeat: N1 party loops, PO1/IT1 line items, HL hierarchical levels, SAC allowances, TXI tax rows, LIN lines. A single materialized path like `n1_2` only captures the first occurrence by default; this demo shows the three `repeating_segment_mode` strategies and when to pick each. - **indexed**, each occurrence becomes its own column (`n1_1_1`, `n1_1_2`, ..., `n1_6_2`). Best when the max occurrence count is small and stable and you want columnar aggregation per occurrence. - **concatenate**, all occurrences pipe-delimited into the default column (`n1_2 = 'Aaron|XYZ Bank|...'`). Best for display / full-text search / string functions. - **to_json**, occurrences as a JSON array (`n1_2 = '["Aaron","XYZ Bank"]'`). Best for analytic queries with `json_array_length`, `json_extract_path_text`, or downstream unnest. ## What You Will Learn 1. How `max_repeating_segments` caps the number of occurrences the parser tracks (default is low, raise it for line-item-heavy files) 2. How `repeating_segment_mode = 'indexed' | 'concatenate' | 'to_json'` changes the on-disk column layout without re-parsing source files 3. How indexed mode generates `{segment}_{occurrence}_{element}` column names (all 1-based) 4. How concatenate uses a literal pipe `|` as the delimiter, escape pipes in source data before ingest if they conflict 5. How to_json integrates with `json_array_length` and `json_extract_path_text` for dynamic line-item analysis 6. How to pick between modes for a given downstream consumer (BI tool columns vs. SQL aggregation vs. API payload) ## Prerequisites - X12 .edi files with at least one repeating loop (N1 parties or PO1 lines) - Familiarity with the basic `materialized_paths` pattern (see demo_edi_x12_supply_chain_basics)

Pitfalls

`max_repeating_segments` is a hard cap enforced BEFORE the `repeating_segment_mode` dispatch, occurrences beyond it never reach the mode arm. With the default value of 1, every mode (last/indexed/concatenate/to_json) silently collapses to first-occurrence-only output. Always raise it explicitly to the 99th-percentile expected count when you use a non-first mode.
Default `repeating_segment_mode` is `last`. Demos that expect the first-occurrence value (e.g. the payer NM1, the buyer NAD) must declare `"repeating_segment_mode": "first"` explicitly, otherwise the last NM1/NAD overwrites earlier ones once `max_repeating_segments` is raised.
Indexed mode column names are 1-based for BOTH occurrence and element: `n1_1_1` is (1st N1, element 1), `n1_2_1` is (2nd N1, element 1). Off-by-one bugs are the most common mistake, always read `{segment}_{occurrence}_{element}`.
Concatenate mode hard-codes `|` as the delimiter; a party name that legitimately contains a pipe will look like two parties. Inspect source data for literal pipes before using concatenate.
to_json emits STRING columns, cast with `n1_2::json` or `n1_2::jsonb` before applying json_* functions; raw JSON functions on a STRING return NULL.
Different repeating-segment modes cannot coexist in one table; you need a separate CREATE EXTERNAL TABLE per mode (as in this demo).
The three modes only affect materialized_paths, df_transaction_json always contains the full parsed transaction regardless of mode, so it is the fallback for occurrences beyond max_repeating_segments.
Implicit X12 loops (like the 850 N1 loop with N2/N3/N4/REF/PER children) collapse into multiple N1_x rows in indexed mode only if you materialize the child segments too, materializing `n1_2` alone gives party names but drops addresses unless you add `n3_1`, `n4_1`, etc.

Repeating Segments & Loops: Indexed / Concatenate / ToJson Modes

Syntax

Description

Pitfalls

See Also