FHIR XML Clinical Resources

Ingest native HL7 FHIR R4 XML resources (Patient, Observation), handling the default FHIR namespace, @value-attribute pattern, repeating elements, deep XPath nesting, and exclude_paths for narrative/meta.

Category: fhir

Syntax

-- 1. Zone + schema
CREATE ZONE IF NOT EXISTS {{zone_name}} TYPE EXTERNAL;
CREATE SCHEMA IF NOT EXISTS {{zone_name}}.fhir_xml
    COMMENT 'FHIR XML external tables: HL7 FHIR R4 resources in native XML';

-- 2. Patient XML, namespace stripped, @value extracted, narrative excluded
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.fhir_xml.patients_xml
USING XML
LOCATION '{{data_path}}'
OPTIONS (
    file_filter = 'patient-example*.xml',
    xml_flatten_config = '{
        "row_xpath": "//Patient",
        "include_paths": [
            "/Patient/id/@value", "/Patient/gender/@value",
            "/Patient/birthDate/@value",
            "/Patient/name/family/@value", "/Patient/name/given/@value",
            "/Patient/telecom/system/@value", "/Patient/telecom/value/@value",
            "/Patient/address/city/@value", "/Patient/address/country/@value",
            "/Patient/maritalStatus/coding/code/@value",
            "/Patient/maritalStatus/coding/display/@value",
            "/Patient/managingOrganization/display/@value",
            "/Patient/contact", "/Patient/communication"
        ],
        "exclude_paths": ["/Patient/text", "/Patient/meta"],
        "xml_paths": ["/Patient/contact", "/Patient/communication"],
        "default_repeat_handling": "join_comma",
        "column_mappings": {
            "/Patient/id/@value": "patient_id",
            "/Patient/gender/@value": "gender",
            "/Patient/birthDate/@value": "birth_date",
            "/Patient/name/family/@value": "family_name",
            "/Patient/name/given/@value": "given_name",
            "/Patient/address/city/@value": "city",
            "/Patient/maritalStatus/coding/code/@value": "marital_code",
            "/Patient/maritalStatus/coding/display/@value": "marital_display",
            "/Patient/managingOrganization/display/@value": "org_display",
            "/Patient/contact": "contact",
            "/Patient/communication": "communication"
        },
        "include_attributes": true,
        "separator": "_", "max_depth": 10,
        "strip_namespace_prefixes": true,
        "nested_output_format": "json",
        "namespaces": {
            "fhir": "http://hl7.org/fhir",
            "xhtml": "http://www.w3.org/1999/xhtml"
        }
    }',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);

-- 3. Observation XML, deep coding + component/referenceRange as JSON
CREATE EXTERNAL TABLE IF NOT EXISTS {{zone_name}}.fhir_xml.observations_xml
USING XML
LOCATION '{{data_path}}'
OPTIONS (
    file_filter = 'observation-example*.xml',
    xml_flatten_config = '{
        "row_xpath": "//Observation",
        "include_paths": [
            "/Observation/id/@value", "/Observation/status/@value",
            "/Observation/code/coding/system/@value",
            "/Observation/code/coding/code/@value",
            "/Observation/code/coding/display/@value",
            "/Observation/subject/reference/@value",
            "/Observation/effectiveDateTime/@value",
            "/Observation/valueQuantity/value/@value",
            "/Observation/valueQuantity/unit/@value",
            "/Observation/referenceRange", "/Observation/component"
        ],
        "exclude_paths": ["/Observation/text", "/Observation/meta"],
        "xml_paths": ["/Observation/referenceRange", "/Observation/component"],
        "default_repeat_handling": "join_comma",
        "column_mappings": {
            "/Observation/id/@value": "observation_id",
            "/Observation/status/@value": "status",
            "/Observation/code/coding/system/@value": "code_system",
            "/Observation/code/coding/code/@value": "code_value",
            "/Observation/code/coding/display/@value": "code_display",
            "/Observation/subject/reference/@value": "patient_ref",
            "/Observation/effectiveDateTime/@value": "effective_date",
            "/Observation/valueQuantity/value/@value": "result_value",
            "/Observation/valueQuantity/unit/@value": "result_unit",
            "/Observation/referenceRange": "reference_range",
            "/Observation/component": "component"
        },
        "strip_namespace_prefixes": true,
        "nested_output_format": "json",
        "namespaces": {"fhir": "http://hl7.org/fhir"}
    }',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);

-- 4. Cross-table FHIR reference join (Patient/{id})
SELECT o.observation_id, o.code_display, o.result_value, o.result_unit,
       p.family_name, p.given_name, p.gender
FROM {{zone_name}}.fhir_xml.observations_xml o
LEFT JOIN {{zone_name}}.fhir_xml.patients_xml p
    ON o.patient_ref = 'Patient/' || p.patient_id
ORDER BY o.observation_id;

-- 5. Cleanup (zone is shared, do not drop)
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.fhir_xml.patients_xml WITH FILES;
DROP EXTERNAL TABLE IF EXISTS {{zone_name}}.fhir_xml.observations_xml WITH FILES;
DROP SCHEMA IF EXISTS {{zone_name}}.fhir_xml;

Description

## When to Use Use this demo when your FHIR source system serves XML rather than JSON (common for EU HL7 pipelines, older HAPI FHIR deployments, and CDA-to-FHIR gateways). FHIR XML has a defining quirk: every primitive value is an XML ATTRIBUTE (`<gender value="male"/>`), not element text. This, combined with the mandatory `xmlns="http://hl7.org/fhir"` namespace and verbose `<text>` XHTML narratives, makes FHIR XML quite different from typical tag-and-text XML. This is the advanced demo in the series because it exercises nearly every knob of DeltaForge's `xml_flatten_config`. ## What You Will Learn 1. How `strip_namespace_prefixes: true` plus a `namespaces` map handles the HL7 FHIR default namespace so XPaths stay readable (`/Patient/id/@value`, not `/fhir:Patient/fhir:id/@value`). 2. How to extract the FHIR `@value` attribute pattern for every primitive, the core idiom of FHIR XML. 3. How `exclude_paths` strips the hundreds-of-characters `<text>` XHTML narrative and `<meta>` security tags that would otherwise dominate every row. 4. How `default_repeat_handling: "join_comma"` merges repeated `<name>`, `<telecom>`, `<identifier>` blocks into comma-separated columns for analyst-friendly reading. 5. How `xml_paths` preserves deeply variable subtrees (`<contact>`, `<communication>`, `<component>`, `<referenceRange>`) as JSON blobs via `nested_output_format: "json"`. 6. How to navigate 4-level-deep CodeableConcept XPaths (`/Observation/code/coding/code/@value`) for LOINC / SNOMED extraction. 7. How to JOIN two FHIR XML tables on the FHIR reference pattern `Patient/{id}` to build a cross-resource cohort view. ## Prerequisites - Completion of `fhir-patient-demographics` (JSON baseline), this demo is the XML counterpart and assumes familiarity with FHIR resources. - Basic XPath knowledge (`/`, `//`, `@attr`, element navigation). - Understanding of the FHIR CodeableConcept shape (`code > coding > system|code|display`) and Reference shape (`reference > @value`).

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →