SUBSTRING_INDEX

Return the portion of a string before or after the nth occurrence of a delimiter.

Category: stringReturns: VARCHARDialect: PostgreSql

Syntax

SUBSTRING_INDEX(str, delimiter, count)

Description

## Overview Returns the portion of the input string determined by the nth occurrence of the delimiter. A positive `count` returns everything before the nth delimiter (counting from the left); a negative `count` returns everything after the nth delimiter (counting from the right). SUBSTRING_INDEX is convenient for extracting leading or trailing segments of dotted or slash-separated identifiers, URLs, and hostnames. When the delimiter does not appear `count` times, the function returns the entire string. This 'clamping' behaviour is usually what you want (for example, extracting the top-level domain of a single-label host falls back to the whole name). ## Behavior - Returns NULL when any argument is NULL. - Returns the entire input when the delimiter is not present or when |count| exceeds the number of occurrences. - Positive count returns everything before the nth delimiter (counting from left to right). - Negative count returns everything after the nth delimiter from the right (so -1 returns everything after the last delimiter, -2 after the second-to-last, etc.). - Zero count returns an empty string. - The delimiter is literal, not a pattern. - Operates on Unicode code points. ## Compatibility - Matches the conventional SUBSTRING_INDEX semantics (originated in MySQL, also common in DuckDB).

Parameters

NameTypeDescription
strSpecifies the input string to search.
delimiterSpecifies the literal delimiter to locate within the string.
countSpecifies which delimiter occurrence defines the cut. Positive values count from the left and return everything before the nth delimiter. Negative values count from the right and return everything after the nth delimiter from the end.

Examples

-- Everything before the second delimiter (counting left)
SELECT SUBSTRING_INDEX('a,b,c,d', ',', 2);  -- 'a,b'
-- Everything after the last delimiter (counting right)
SELECT SUBSTRING_INDEX('a,b,c,d', ',', -1);  -- 'd'
-- Get the two-label TLD from a host
SELECT SUBSTRING_INDEX('www.example.com', '.', -2);  -- 'example.com'
-- Delimiter not present: entire string is returned
SELECT SUBSTRING_INDEX('hello', ',', 1);  -- 'hello'
-- First part only
SELECT SUBSTRING_INDEX('a.b.c', '.', 1);  -- 'a'
-- NULL propagates
SELECT SUBSTRING_INDEX(CAST(NULL AS VARCHAR), ',', 1);  -- NULL

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →