Pipelines

Manage and monitor multi-step SQL data pipelines imported from the workspace repository.

Category: workflow

Description

## Overview

The Pipelines page is the primary interface for managing and monitoring multi-step SQL data pipelines that transform and load data across the catalog. Pipelines are registered exclusively by scanning the workspace's git repository: any .sql file carrying a PIPELINE declaration is imported as a pipeline. The workspace is managed by editing files in git and rescanning; DeltaForge owns execution and scheduling, not authoring, so there is no create-pipeline button. Data engineers use this page to manage pipeline lifecycle states, review run history, associate pipelines with cron schedules, and remove pipelines from the catalog.

Each pipeline belongs to a workspace and is identified by its file path. Pipelines progress through lifecycle states (draft, active, paused, archived) and support approval workflows that gate promotion from draft to active. The run history panel provides a direct link to the Executions page for detailed step-level diagnostics.

## Key Features

- **Repository-scanned registry.** Every pipeline corresponds to a .sql file with a PIPELINE declaration discovered by the workspace repository scan. The git file is the source of truth; a rescan refreshes the catalog entry from the file. Pipeline SQL must be idempotent so that re-execution produces consistent results without manual intervention.
- **Lifecycle management (activate, pause, resume, archive).** Transition pipelines through well-defined states. Activate a draft pipeline to make it eligible for scheduled runs. Pause a running pipeline to temporarily suspend execution. Archive pipelines that are no longer needed without deleting their history.
- **Run history.** View a chronological list of all runs for a specific pipeline, including status, duration, trigger type, and outcome. Click through to the Executions page for step-level logs and error details.
- **Approval workflows.** Require one or more approvals before a pipeline transitions from draft to active. Approvers review the pipeline SQL, dependencies, and target tables before granting activation.
- **Schedule association.** Link a pipeline to one or more named schedules defined on the Schedules page. Schedules are independent, reusable objects; a single schedule can drive multiple pipelines, and a pipeline can be associated with multiple schedules.
- **Deletion with persisted import exclusions.** Delete a single pipeline from its row actions, or select multiple rows with the checkboxes and delete them together. Deletion removes the catalog entry, its run history, and any pipeline-owned schedule, and records the source file in the repository's import exclusions so the next scan (manual, refresh, or the pull-and-rescan sync) does not resurrect it. The .sql file itself stays untouched in git; re-selecting it in the workspace scan wizard clears the exclusion and imports it again.

## Workflow

1. Author or change pipeline .sql files (each carrying a PIPELINE declaration) in the workspace's git repository, then run the repository scan from the workspace page to import them.
2. Navigate to the Pipelines page from the Workflow sidebar and review the imported pipelines, including detected dependencies and target tables.
3. Submit a pipeline for approval if approval workflows are enabled.
4. After approval, activate the pipeline and associate it with a schedule.
5. Monitor subsequent runs from the run history panel or the Executions page.
6. Pause or archive the pipeline when it is temporarily out of service, or delete it (single or multi-select) to remove it from the catalog while keeping the file in git, excluded from future scans.
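
The interaction between scanning, deletion, and import exclusions described above can be modeled in a few lines. The following is a minimal sketch, not DeltaForge's implementation: the `Catalog` class, its method names, and the rule that any .sql file containing the word `PIPELINE` counts as a pipeline are all assumptions made for illustration, since the declaration syntax and storage model are not specified here.

```python
import re
from dataclasses import dataclass, field

# Assumption: a file is treated as a pipeline if the word PIPELINE appears
# anywhere in it. The real declaration syntax is not specified in this page.
PIPELINE_MARKER = re.compile(r"\bPIPELINE\b")


@dataclass
class Catalog:
    """Hypothetical in-memory stand-in for the pipeline catalog."""

    pipelines: dict = field(default_factory=dict)  # file path -> SQL text
    exclusions: set = field(default_factory=set)   # paths skipped by scans

    def scan(self, repo_files):
        """Import or refresh pipelines; return the newly imported paths."""
        imported = []
        for path, text in sorted(repo_files.items()):
            if not path.endswith(".sql") or not PIPELINE_MARKER.search(text):
                continue  # not a pipeline file
            if path in self.exclusions:
                continue  # deleted earlier, so the scan must not resurrect it
            is_new = path not in self.pipelines
            self.pipelines[path] = text  # the git file is the source of truth
            if is_new:
                imported.append(path)
        return imported

    def delete(self, paths):
        """Remove catalog entries and record their files as excluded."""
        for path in paths:
            if self.pipelines.pop(path, None) is not None:
                self.exclusions.add(path)

    def reselect(self, path, repo_files):
        """Clear an exclusion (as the scan wizard would) and rescan."""
        self.exclusions.discard(path)
        return self.scan(repo_files)


# Placeholder file contents; the comment line stands in for a real declaration.
repo = {
    "etl/orders.sql": "-- PIPELINE declaration goes here\nSELECT 1;",
    "etl/customers.sql": "-- PIPELINE declaration goes here\nSELECT 2;",
    "scratch/adhoc.sql": "SELECT 3;",
}

catalog = Catalog()
first_scan = catalog.scan(repo)        # imports both etl/ files
catalog.delete(["etl/orders.sql"])     # the file stays in `repo`, i.e. in git
second_scan = catalog.scan(repo)       # nothing new: orders.sql is excluded
restored = catalog.reselect("etl/orders.sql", repo)
```

In this sketch the repository contents never change when a pipeline is deleted; only the exclusion set does, which is why a later scan skips the file until it is explicitly re-selected.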
