UN/EDIFACT

Parse UN/EDIFACT (ISO 9735) international EDI messages for trade, transport, and customs

Category: edi

Description

## Overview UN/EDIFACT (United Nations Electronic Data Interchange for Administration, Commerce, and Transport) is the international EDI standard defined by ISO 9735. It serves as the primary electronic messaging format for cross-border trade, logistics, customs declarations, and financial transactions across Europe, Asia, and much of the global supply chain. DeltaForge reads EDIFACT files through the unified EDI format handler. The format identifier in SQL is `EDI`, and the `edi_config` option with `ediFormat` set to `edifact` selects the EDIFACT dialect. The parser recognizes the hierarchical envelope model: an optional UNA service string declares delimiter characters, followed by the interchange envelope (UNB/UNZ), optional functional groups (UNG/UNE), and individual messages (UNH/UNT). ## Usage EDIFACT files are registered as external tables using `CREATE EXTERNAL TABLE` with the `EDI` format and an `edi_config` option specifying the EDIFACT dialect: ```sql CREATE EXTERNAL TABLE IF NOT EXISTS zone.edi_demos.edifact_messages USING EDI LOCATION '{{data_path}}/*.edi' OPTIONS ( edi_config = '{"ediFormat": "edifact"}', file_metadata = '{"columns":["df_file_name","df_row_number"]}' ); ``` Once the external table is created, query it with standard SQL: ```sql SELECT UNB_2, UNB_3, UNH_2, df_transaction_json FROM zone.edi_demos.edifact_messages WHERE UNH_2 LIKE 'ORDERS%'; ``` Use JSON functions on the `df_transaction_json` column for deep segment-level access to message content beyond the default envelope columns. ## Output Schema Each EDIFACT message produces one row. The default output columns are: | Column | Description | |---|---| | UNB_1 | Syntax identifier (e.g., UNOC:3) | | UNB_2 | Interchange sender identification | | UNB_3 | Interchange recipient identification | | UNB_4 | Date and time of preparation | | UNB_5 | Interchange control reference | | UNH_1 | Message reference number | | UNH_2 | Message identifier (e.g., ORDERS:D:96A:UN) | | df_transaction_json | Full message content as a JSON string | | df_transaction_id | Unique deterministic hash identifying the transaction | When `file_metadata` is configured, additional columns such as `df_file_name` and `df_row_number` are appended. ## Key Options - **edi_config**: JSON string with `ediFormat` set to `edifact`. This is the primary option that selects the EDIFACT dialect within the unified EDI reader. - **file_metadata**: JSON string specifying which system columns to inject, e.g., `{"columns":["df_file_name","df_row_number"]}`. - **strict**: Validates that UNZ/UNE/UNT trailer counts match their corresponding headers. Disable for lenient parsing of malformed files. ## Supported Standards DeltaForge supports multiple UN/EDIFACT directory versions including D96A, D01B, D03B, and D95B. EANCOM messages (a GS1 subset of EDIFACT used in retail supply chains) are also supported, including DESADV, IFTSTA, INVOIC, ORDRSP, PRICAT, and IFTMIN. Common EDIFACT message types include ORDERS (purchase orders), INVOIC (invoices), DESADV (despatch advice), CUSCAR (customs cargo), and IFTMIN (transport instructions).

See Also

Open in interactive docs →   DeltaForge home →