Parse FHIR resources from RDF/Turtle format for semantic web and linked data workflows
## Overview

FHIR RDF/Turtle is the RDF (Resource Description Framework) serialization of FHIR resources using Turtle (Terse RDF Triple Language) syntax. This format bridges the gap between clinical healthcare data and semantic web technologies, enabling FHIR resources to participate in linked data graphs, ontology-based reasoning, and SPARQL query ecosystems. FHIR RDF is particularly valuable for research institutions, terminology services, and knowledge management platforms that integrate clinical data with biomedical ontologies such as SNOMED CT, LOINC, and ICD.

DeltaForge reads FHIR RDF files using the `RDF` format handler. The parser processes Turtle prefix declarations, subject-predicate-object triples, and blank nodes to reconstruct FHIR resource instances from the RDF graph. Each resource identified by its subject URI and `fhir:nodeRole` predicate is extracted as an independent entity.

## Usage

FHIR RDF files are registered as external tables using `CREATE EXTERNAL TABLE` with the `RDF` format:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS zone.fhir_demos.terminology_rdf
USING RDF
LOCATION '{{data_path}}'
OPTIONS (
    file_filter = '*.ttl',
    resource_types = '["CodeSystem", "ValueSet"]',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);
```

For patient clinical data in RDF format:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS zone.fhir_demos.patients_rdf
USING RDF
LOCATION '{{data_path}}/patients/'
OPTIONS (
    file_filter = '*.ttl',
    file_metadata = '{"columns":["df_file_name","df_row_number"]}'
);
```

Once the external table is created, query it with standard SQL:

```sql
SELECT *
FROM zone.fhir_demos.patients_rdf
LIMIT 10;
```

## Output Schema

Each FHIR resource found in the RDF graph produces one row. RDF predicates are mapped to column names using the FHIR property paths derived from the predicate URIs. Literal values become string columns, while nested blank node structures are serialized as JSON strings.
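
As a rough illustration of this mapping, the Python sketch below flattens a small hand-written triple list into one row per resource root. It is not DeltaForge's implementation, and the data is a simplified toy rather than valid FHIR Turtle. The tuple-based triple representation, the `_:` blank-node convention, the prefix-stripping rule for column names, and the function names `local_name`, `node_to_value`, and `resource_to_row` are all assumptions made for this example.

```python
import json

# Hypothetical in-memory triples (subject, predicate, object) for one resource.
# Objects starting with "_:" are blank nodes; everything else is a literal.
TRIPLES = [
    ("ex:Patient/1", "fhir:nodeRole", "fhir:treeRoot"),
    ("ex:Patient/1", "fhir:gender", "female"),
    ("ex:Patient/1", "fhir:name", "_:b1"),
    ("_:b1", "fhir:family", "Chalmers"),
    ("_:b1", "fhir:given", "Jane"),
]


def local_name(predicate):
    """Assumed column naming: keep the part after the prefix colon."""
    return predicate.split(":", 1)[1]


def node_to_value(node, triples):
    """Recursively turn a blank node into a dict (assumes no cycles)."""
    out = {}
    for s, p, o in triples:
        if s == node:
            out[local_name(p)] = node_to_value(o, triples) if o.startswith("_:") else o
    return out


def resource_to_row(root, triples):
    """Collect the triples reachable from one resource root into a flat row."""
    row = {}
    for s, p, o in triples:
        if s != root or p == "fhir:nodeRole":
            continue  # skip other subjects and the root marker itself
        if o.startswith("_:"):
            # Nested blank-node structure: serialize as a JSON string.
            row[local_name(p)] = json.dumps(node_to_value(o, triples), sort_keys=True)
        else:
            # Literal value: keep as a string column.
            row[local_name(p)] = o
    return row


# Resource roots are the subjects marked with fhir:nodeRole fhir:treeRoot.
roots = [s for s, p, o in TRIPLES if p == "fhir:nodeRole" and o == "fhir:treeRoot"]
rows = [resource_to_row(r, TRIPLES) for r in roots]
print(rows)
```

In this sketch a repeated predicate overwrites the earlier value; a real reader would need a rule for repeating elements, which the toy data deliberately avoids.
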
Because RDF is inherently a graph model rather than a tree, the reader performs graph traversal from each resource root to collect all associated triples into a coherent row. When `file_metadata` is configured, additional columns such as `df_file_name` and `df_row_number` are appended.

## Key Options

- **resource_types**: JSON array string restricting output to specific FHIR resource types. Useful for filtering terminology resources (CodeSystem, ValueSet) from clinical resources (Patient, Observation).
- **file_filter**: Glob pattern to filter files within the LOCATION directory (e.g., `*.ttl`).
- **file_metadata**: JSON string specifying which system columns to inject.
- **error_handling**: Set to `'strict'` for RDF graph integrity validation, or `'lenient'` (default) to skip malformed triples.

## Use Cases

FHIR RDF is especially useful for:

- Terminology management: loading CodeSystem and ValueSet resources for ontology-based queries
- Research data integration: joining clinical FHIR data with external knowledge graphs
- Linked data pipelines: connecting healthcare resources with biomedical ontologies

FHIR RDF data is subject to the same HIPAA and data protection regulations as other FHIR serializations. DeltaForge preserves resource URIs, provenance triples, and all namespace-qualified content to maintain semantic fidelity.
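
Because `resource_types` and `file_metadata` take JSON embedded in SQL string literals, it can help to generate those values rather than type them by hand. The Python sketch below builds the option strings with `json.dumps` and renders an `OPTIONS (...)` clause. The helper names `rdf_table_options`, `sql_quote`, and `render_options` are hypothetical, and escaping single quotes by doubling them is an assumption based on standard SQL rather than documented DeltaForge behavior.

```python
import json


def rdf_table_options(file_filter, resource_types=None, metadata_columns=None):
    """Return option name -> string value pairs for an OPTIONS clause."""
    options = {"file_filter": file_filter}
    if resource_types:
        # JSON array string, e.g. '["CodeSystem", "ValueSet"]'
        options["resource_types"] = json.dumps(list(resource_types))
    if metadata_columns:
        # JSON object string with a "columns" array
        options["file_metadata"] = json.dumps({"columns": list(metadata_columns)})
    return options


def sql_quote(value):
    """Wrap a value in single quotes, doubling embedded quotes (assumed escaping)."""
    return "'" + value.replace("'", "''") + "'"


def render_options(options):
    """Render the options dict as an OPTIONS (...) clause."""
    lines = [f"    {name} = {sql_quote(value)}" for name, value in options.items()]
    return "OPTIONS (\n" + ",\n".join(lines) + "\n)"


opts = rdf_table_options(
    "*.ttl",
    resource_types=["CodeSystem", "ValueSet"],
    metadata_columns=["df_file_name", "df_row_number"],
)
clause = render_options(opts)
print(clause)
```

Generating the values this way ensures each JSON option parses cleanly before it reaches the `CREATE EXTERNAL TABLE` statement.
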