REGEXP_SUBSTR

Return the substring that matches a regular expression (scalar result, unlike REGEXP_MATCH's array).

Category: regexReturns: VARCHARDialect: PostgreSql

Syntax

REGEXP_SUBSTR(str, pattern [, start [, occurrence [, flags]]])

Description

## Overview Extracts and returns the substring that matches the regular expression. Unlike REGEXP_MATCH (which returns an array), REGEXP_SUBSTR returns a plain scalar string, which makes it the most natural form for extracting a single token from a string column. By default, the first match starting from position 1 is returned. The optional `start` and `occurrence` arguments let you skip leading matches without rewriting the pattern. ## Behavior - Returns NULL when either the input string or the pattern is NULL. - Returns NULL when the pattern does not match or when the requested occurrence does not exist. - Positions are 1-based and count Unicode code points. - Occurrence is 1-based relative to `start`; 1 is the first match after `start`. - If the pattern has capturing groups, REGEXP_SUBSTR returns the entire match (not a specific group). Use REGEXP_MATCH when you need captured groups. - Pattern compilation errors raise an error at query time. - Operates on Unicode code points. ## Regex flavor - Rust regex crate, Perl-compatible for most common features. - Supports anchors, character classes, alternation, quantifiers, capturing groups, Unicode classes, and escape sequences. - Does NOT support backreferences in the pattern. - Does NOT support lookahead or lookbehind. - Flags: 'i' case-insensitive, 'c' case-sensitive (default), 'n'/'s' dot-matches-newline, 'm' multi-line. ## Compatibility - Matches the common SQL REGEXP_SUBSTR signature (str, pattern, start, occurrence, flags).

Parameters

NameTypeDescription
strSpecifies the input string to search.
patternSpecifies the regular expression pattern to match.
startSpecifies the 1-based character position at which to begin searching. Defaults to 1.
occurrenceSpecifies which occurrence of the pattern to extract. Defaults to 1 (first match). Occurrences are counted starting from `start`.
flagsSpecifies match flags: 'i' (case-insensitive), 'c' (case-sensitive), 'n'/'s' (dot matches newline), 'm' (multi-line).

Examples

-- First digit run
SELECT REGEXP_SUBSTR('abc123def456', '[0-9]+');  -- '123'
-- Second digit run
SELECT REGEXP_SUBSTR('abc123def456', '[0-9]+', 1, 2);  -- '456'
-- Start search from position 5
SELECT REGEXP_SUBSTR('abc123def456', '[0-9]+', 5);  -- '23'
-- No match returns NULL
SELECT REGEXP_SUBSTR('hello', '[0-9]+');  -- NULL
-- Case-insensitive extraction
SELECT REGEXP_SUBSTR('Error: Not Found', '[a-z]+', 1, 1, 'i');  -- 'Error'
-- NULL propagates
SELECT REGEXP_SUBSTR(CAST(NULL AS VARCHAR), 'x');  -- NULL

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →