Return capturing groups from every regex match as a set of rows (set-returning function).
REGEXP_MATCHES(str, pattern [, flags])
## Overview A set-returning function that produces one row per non-overlapping match of the pattern in the input string. Each row is an array of the captured groups (or a one-element array containing the whole match when the pattern has no capturing groups). Without the 'g' flag, REGEXP_MATCHES emits at most one row. REGEXP_MATCHES is the right tool when you want to unnest structured tokens from free-text into rows: all SKU references in a notes field, all hashtags in a description, all key/value pairs in a packed string. Pair it with LATERAL to walk matches per source row. ## Behavior - Produces no rows when either argument is NULL. - Produces no rows when the pattern does not match anywhere. - Without 'g', produces at most one row. - With 'g', produces one row per non-overlapping match. - Each row is an ARRAY of STRING values: one element per capturing group, or one element containing the full match when there are no groups. - A non-participating optional group appears as NULL in the array. - Zero-width matches advance one character at a time to ensure termination. - Operates on Unicode code points. ## Regex flavor - Rust regex crate, Perl-compatible for most common features. - Supports anchors, character classes, alternation, quantifiers, capturing groups, Unicode classes. - Does NOT support backreferences in the pattern. - Does NOT support lookahead or lookbehind. - Flags: 'g' (global, multiple rows), 'i' case-insensitive, 'c' case-sensitive (default), 'n'/'s' dot-matches-newline, 'm' multi-line. ## Compatibility - Matches the common SQL REGEXP_MATCHES semantics including the 'g' flag behaviour.
| Name | Type | Description |
|---|---|---|
str | Specifies the input string to search. | |
pattern | Specifies the regular expression pattern. Capturing groups define the elements returned in each result array. | |
flags | Specifies match flags. Include 'g' to return all matches as multiple rows. Other flags: 'i' (case-insensitive), 'c' (case-sensitive), 'n'/'s' (dot matches newline), 'm' (multi-line). |
-- Without 'g' flag: at most one row from the first match
SELECT REGEXP_MATCHES('abc 123 def 456', '(\d+)'); -- one row: ['123']
-- With 'g' flag: one row per match
SELECT REGEXP_MATCHES('abc 123 def 456', '(\d+)', 'g'); -- two rows: ['123'], ['456']
-- Multiple capturing groups per match
SELECT REGEXP_MATCHES('John:30, Jane:25', '(\w+):(\d+)', 'g'); -- rows: ['John', '30'], ['Jane', '25']
-- No matches produces zero rows
SELECT REGEXP_MATCHES('hello', '(\d+)', 'g'); -- no rows
-- Use in a FROM clause with LATERAL
SELECT o.order_id, m.captures
FROM retail.sales.orders o,
LATERAL REGEXP_MATCHES(o.notes, '(SKU-\d+)', 'g') AS m(captures);
-- Flatten an array result to its first element for a compact display
SELECT (REGEXP_MATCHES('a=1;b=2;c=3', '([a-z])=(\d)', 'g'))[1] AS key_name;