Return the captured groups (or the whole match) from the first regex match as an array.
REGEXP_MATCH(str, pattern [, flags])
## Overview Returns the capturing groups (or the whole match, if the pattern has no groups) from the first match of the pattern in the input string, as an array of strings. If the pattern does not match at all, REGEXP_MATCH returns NULL (not an empty array). Use REGEXP_MATCH to extract structured components from a string: date parts, version numbers, host/path pairs, key/value fragments. For all matches rather than just the first, use REGEXP_MATCHES with the 'g' flag; for a single-scalar extraction use REGEXP_SUBSTR. ## Behavior - Returns NULL when either argument is NULL. - Returns NULL when the pattern does not match. - When the pattern has capturing groups, the result array contains one element per group, in left-to-right order. - When a group did not participate in the match (for example because it is inside an untaken alternation branch), the corresponding array element is NULL. - When the pattern has no capturing groups, the result is a one-element array containing the whole matched substring. - REGEXP_MATCH finds only the first match. For all matches, use REGEXP_MATCHES with 'g'. - The 'g' flag is NOT supported here (see REGEXP_MATCHES). - Operates on Unicode code points. ## Regex flavor - Rust regex crate, Perl-compatible for most common features. - Supports anchors, character classes, alternation, quantifiers, capturing groups (numbered and named), Unicode classes, and escape sequences. - Does NOT support backreferences in the pattern. - Does NOT support lookahead or lookbehind. - Flags: 'i' case-insensitive, 'c' case-sensitive (default), 'n'/'s' dot-matches-newline, 'm' multi-line. ## Compatibility - Matches the common SQL REGEXP_MATCH semantics.
| Name | Type | Description |
|---|---|---|
str | Specifies the input string to search. | |
pattern | Specifies the regular expression pattern. When the pattern has capturing groups, the result contains one array element per group. Without capturing groups, the result contains the entire match. | |
flags | Specifies match flags: 'i' (case-insensitive), 'c' (case-sensitive), 'n'/'s' (dot matches newline), 'm' (multi-line). |
-- Three capturing groups for year/month/day
SELECT REGEXP_MATCH('2025-04-03', '(\d{4})-(\d{2})-(\d{2})'); -- ['2025', '04', '03']
-- No capturing groups: array contains the whole match
SELECT REGEXP_MATCH('hello world', '\w+'); -- ['hello']
-- No match returns NULL
SELECT REGEXP_MATCH('hello', '[0-9]+'); -- NULL
-- Case-insensitive match with two capture groups
SELECT REGEXP_MATCH('Error: 404', '(error):\s*(\d+)', 'i'); -- ['Error', '404']
-- NULL propagates
SELECT REGEXP_MATCH(CAST(NULL AS VARCHAR), 'x'); -- NULL
-- Extract host and path from a URL column
SELECT url, REGEXP_MATCH(url, '^https?://([^/]+)(/.*)?$') AS parts
FROM ops.etl.staging_links;