Declares a pipeline with metadata, scheduling, and governance attributes.

## Syntax

```sql
PIPELINE <name>
  [DESCRIPTION '<desc>']
  [SCHEDULE '<cron>']
  [TIMEZONE '<tz>']
  [TAGS '<tag1>', '<tag2>', ...]
  [SLA <hours>]
  [FAIL_FAST true|false]
  [APPROVAL REQUIRED]
  [STATUS '<status>']
  [LIFECYCLE '<stage>']
  [DEFAULTS ($param = '<value>', ...)]
```

## Overview

Declares a first-class pipeline object in the DeltaForge catalog. The PIPELINE command is placed at the top of a .sql file and defines metadata, scheduling, governance, and default parameters for all SQL statements that follow in the same file.

When DeltaForge scans a workspace, files containing a PIPELINE declaration are registered as pipelines in the catalog. Files without a PIPELINE declaration are registered as scripts.

## Behavior

- The PIPELINE command is a declarative header. It does not execute SQL; it configures the pipeline's metadata in the catalog.
- Clauses can appear in any order after the pipeline name. The parser processes them until it encounters a semicolon or the end of the clause block.
- When a SCHEDULE cron expression is provided inline, DeltaForge creates an implicit schedule object linked to the pipeline. For more control over scheduling attributes (retries, timeouts, notifications), use a separate SCHEDULE command.
- The STATUS clause synchronizes the catalog record with the declared value on every scan or save operation. Omitting STATUS leaves the catalog status under GUI control.
- FAIL_FAST without an explicit true/false value defaults to true.
- DEFAULTS parameters use dollar-sign prefixed names ($param) and are available for variable substitution throughout the pipeline's SQL statements.
- The pipeline name supports dotted notation for workspace-scoped naming (e.g., workspace.pipeline_name), where the first segment maps to the workspace context.

## Access Control

| Privilege | Object | Notes |
|-----------|--------|-------|
| Pipeline management | Workspace | The user must have access to the workspace containing the pipeline file. |

## Compatibility

PIPELINE is a DeltaForge extension with no equivalent in standard SQL. It is parsed as a first-class command by the DeltaForge SQL parser (not as a comment-based directive).

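As a rough sketch of the file layout described above, the following shows a complete .sql file: the PIPELINE header comes first and is terminated by a semicolon, and the statements after it form the pipeline body. The pipeline name, the table names (staging.orders, analytics.daily_orders), and the statements themselves are hypothetical and only illustrate placement.

```sql
-- Hypothetical file: daily_orders.sql
-- The header configures catalog metadata only; it executes no SQL itself.
PIPELINE daily_orders
  DESCRIPTION 'Rebuild the daily order summary'
  SCHEDULE '0 6 * * *'
  FAIL_FAST true;

-- Statements below the header make up the pipeline body.
-- Table names are placeholders, not DeltaForge objects.
DELETE FROM analytics.daily_orders
WHERE order_date = CURRENT_DATE;

INSERT INTO analytics.daily_orders (order_date, order_count)
SELECT order_date, COUNT(*)
FROM staging.orders
WHERE order_date = CURRENT_DATE
GROUP BY order_date;
```

A file containing the same two statements but no PIPELINE header would, per the Overview, be registered as a script rather than a pipeline.
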
## Parameters

| Name | Description |
|---|---|
| name | Specifies the pipeline name. This is the unique identifier used to reference the pipeline in the catalog. Supports dotted notation (e.g., workspace.pipeline_name) for workspace-scoped pipelines. |
| description | Specifies a human-readable description of the pipeline's purpose. Displayed in the catalog and workspace detail views. |
| schedule | Specifies a cron expression or a reference to a named SCHEDULE object. When a cron expression is provided, DeltaForge automatically creates a corresponding schedule object. Valid cron expressions follow the standard five-field format (minute, hour, day-of-month, month, day-of-week). |
| timezone | Specifies the timezone for schedule evaluation. Uses IANA timezone identifiers (e.g., 'America/New_York', 'Europe/London'). Default: UTC. |
| tags | Specifies one or more tags for organizational grouping. Tags are comma-separated quoted strings. Useful for filtering and categorizing pipelines in the workspace. |
| sla_hours | Specifies the Service Level Agreement target in hours. When a pipeline run exceeds this duration, it is flagged as an SLA breach. Accepts decimal values (e.g., 2.5 for two and a half hours). |
| fail_fast | Controls whether the pipeline stops execution on the first statement failure. When true, all remaining statements are skipped after a failure. When false, or when the FAIL_FAST clause is omitted entirely, the pipeline continues executing subsequent statements. The keyword alone (without true/false) is equivalent to true. |
| approval_required | Enables the approval gate for pipeline runs. When set, scheduled runs enter a pending state and require manual approval before execution begins. Specified as APPROVAL REQUIRED (two keywords). |
| status | Specifies the operational status of the pipeline. Valid values: ACTIVE, PAUSED, DISABLED, DRAFT, ARCHIVED. When set in SQL, the catalog record status is updated to match on every scan or save. When omitted, the status is managed through the GUI. |
| lifecycle | Specifies the lifecycle stage of the pipeline. Valid values: DEVELOPMENT (or DEV), TESTING (or TEST), STAGING, PRODUCTION (or PROD), DEPRECATED, ARCHIVED. Used for governance and environment-aware execution. |
| defaults | Specifies default parameter values for the pipeline. Each parameter is declared as $name = value inside parentheses. Values can be string literals, integers, floats, or booleans. These defaults are used when no override is provided at execution time. |

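To make the defaults parameter more concrete, the sketch below declares one default of each value type listed in the table and references two of them in a later statement. The pipeline name, the table staging.events, and its columns are hypothetical. The example also assumes that $name references are replaced with the parameter value before the statement runs; the exact substitution rules (for instance, how string values are quoted) are not specified here.

```sql
-- One default per documented value type: string, integer, float, boolean.
PIPELINE event_backfill
  DESCRIPTION 'Backfill recent events in batches'
  DEFAULTS ($source_region = 'emea', $batch_size = 5000, $sample_rate = 0.25, $dry_run = false);

-- Assumed behavior: $sample_rate and $batch_size resolve to 0.25 and 5000
-- unless an override is supplied at execution time.
SELECT event_id, payload
FROM staging.events
WHERE sampling_key < $sample_rate
LIMIT $batch_size;
```
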
## Examples

```sql
-- Minimal pipeline declaration
PIPELINE my_etl_pipeline;

-- Pipeline with scheduling and basic metadata
PIPELINE my_etl_pipeline
  DESCRIPTION 'Daily ETL for customer data'
  SCHEDULE '0 6 * * *'
  TIMEZONE 'America/New_York'
  TAGS 'etl', 'daily'
  SLA 2.0
  FAIL_FAST true;

-- Production pipeline with approval gate and lifecycle
PIPELINE billing_reconciliation
  DESCRIPTION 'Monthly billing reconciliation against ERP'
  SCHEDULE '0 2 1 * *'
  TIMEZONE 'Europe/London'
  LIFECYCLE PRODUCTION
  STATUS ACTIVE
  APPROVAL REQUIRED
  SLA 4.0
  TAGS 'billing', 'finance', 'monthly';

-- Pipeline with default parameters
PIPELINE incremental_load
  DESCRIPTION 'Incremental load with configurable lookback'
  SCHEDULE '0 */4 * * *'
  DEFAULTS ($lookback_days = '7', $batch_size = 10000, $dry_run = false)
  FAIL_FAST true
  TAGS 'incremental';

-- Draft pipeline in development
PIPELINE experimental_ml_features
  DESCRIPTION 'Feature engineering pipeline for ML model'
  LIFECYCLE DEVELOPMENT
  STATUS DRAFT
  TAGS 'ml', 'experimental';
```
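
One further sketch combines several points from the Behavior notes and the parameter descriptions: a workspace-scoped name in dotted notation, clauses in a different order from the syntax summary, FAIL_FAST written without a value (treated as true), and the PROD alias for PRODUCTION. The workspace and pipeline names are hypothetical.

```sql
-- Workspace-scoped pipeline: 'finance' is the workspace, 'ledger_sync' the pipeline.
-- FAIL_FAST has no explicit value here, so it is treated as true.
-- LIFECYCLE PROD is the documented alias for PRODUCTION.
PIPELINE finance.ledger_sync
  TAGS 'finance', 'ledger'
  FAIL_FAST
  LIFECYCLE PROD
  SCHEDULE '30 1 * * 1-5'
  DESCRIPTION 'Weekday ledger synchronization'
  SLA 1.5;
```

Because no TIMEZONE clause is given, the schedule would be evaluated in UTC, the documented default.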