REGEXP_COUNT

Return the number of non-overlapping matches of a regular expression in the input string.

Category: regexReturns: INTEGERDialect: PostgreSql

Syntax

REGEXP_COUNT(str, pattern [, start [, flags]])

Description

## Overview Counts the number of non-overlapping matches of the regular expression within the input string, starting from the specified position. After each match, scanning continues from the character immediately following the match, so overlapping occurrences are not double-counted. REGEXP_COUNT is the natural tool for quality-control summaries ('how many email addresses appear in this notes column?', 'how many digit runs are in this free-text field?') and for simple tokenisation counts. ## Behavior - Returns NULL when the input string or the pattern is NULL. - Returns 0 when no matches are found; does not return NULL for 'no match'. - The start argument is 1-based; values less than 1 are treated as 1. Values greater than the string length return 0. - Zero-width matches advance one character at a time so the function terminates. - Pattern compilation errors raise an error at query time, not silently. - Operates on Unicode code points; multi-byte UTF-8 characters match predictably. ## Regex flavor - Uses the Rust regex crate, which is Perl-compatible for most common features (character classes, alternation, quantifiers, anchors, capturing groups, backslash escapes). - Supports the usual anchors: `^` start of string (or line in multi-line mode), `$` end of string (or line). - Supports Unicode character classes via `\p{L}`, `\p{N}`, etc. - Does NOT support backreferences in the pattern (for example `\1` to re-match a captured group); the engine is guaranteed linear time. - Does NOT support lookahead (`(?=...)`) or lookbehind (`(?<=...)`). - Flags: 'i' case-insensitive, 'n' or 's' dot-matches-newline, 'm' multi-line, 'c' case-sensitive (default). ## Compatibility - Matches the common SQL REGEXP_COUNT semantics for its 2/3/4-argument forms. - Returns an INTEGER, not BIGINT, so very long pattern-dense inputs could theoretically overflow on extreme cases; in practice this is not a concern.

Parameters

NameTypeDescription
strSpecifies the input string to search. Returns NULL if this argument is NULL.
patternSpecifies the regular expression pattern. Returns NULL if this argument is NULL. Invalid patterns raise an error.
startSpecifies the 1-based character position at which to begin searching. Defaults to 1 (start of string). Values greater than the string length produce a result of 0.
flagsSpecifies match flags that modify the regex behaviour. Supported flags: 'i' (case-insensitive), 'c' (case-sensitive, the default), 'n' (dot matches newline), 's' (single-line mode, dot matches newline, synonym for 'n'), 'm' (multi-line mode: ^ and $ match at line boundaries).

Examples

-- Count occurrences of a literal character
SELECT REGEXP_COUNT('banana', 'a');  -- 3
-- Count digit sequences (non-overlapping)
SELECT REGEXP_COUNT('abc123def456', '[0-9]+');  -- 2
-- Case-insensitive matching
SELECT REGEXP_COUNT('Hello World HELLO', 'hello', 1, 'i');  -- 2
-- Start searching from position 3
SELECT REGEXP_COUNT('aaa', 'a', 3);  -- 1
-- No matches returns 0, not NULL
SELECT REGEXP_COUNT('hello', 'xyz');  -- 0
-- NULL in input propagates
SELECT REGEXP_COUNT(CAST(NULL AS VARCHAR), 'a');  -- NULL

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →