ARRAY_DISTINCT

Return a new array with duplicate elements removed.

Category: collectionReturns: ARRAY<T>Dialect: Standard

Syntax

ARRAY_DISTINCT(array)

Description

## Overview Returns a new array containing only the distinct elements of the input array. ARRAY_DISTINCT is the array-level equivalent of the scalar DISTINCT clause and is the simplest way to turn a bag-valued array into a set-valued array while keeping the original first-occurrence ordering where possible. Typical uses include removing duplicate tags from a per-row tag list, deduplicating the output of ARRAY_AGG before unnesting, and normalising array payloads for downstream set-membership comparisons. ## Behavior - Produces an ARRAY<T> whose length is less than or equal to the input length, with each value appearing at most once. - Preserves the order of first occurrence for each distinct value. The second occurrence and later are dropped. - Equality follows SQL comparison rules, including structural equality for STRUCT, MAP, and nested ARRAY elements. - Multiple NULL elements collapse to a single NULL element in the result. A standalone NULL is considered equal to another NULL for the purposes of this function. - Works on any element type; no ORDER BY is implied. - Allocates a new array; the input is not modified. ## Null and empty handling - NULL input array returns NULL. - Empty input array returns an empty array of the same element type. - Duplicate NULL elements collapse to a single NULL element in the output. - Array of a single NULL returns an array with a single NULL. ## Compatibility - Matches the SQL array/map convention for ARRAY_DISTINCT. First-occurrence ordering is guaranteed by this engine; portable code that relies on ordering should document the assumption or wrap with ARRAY_SORT. - Equivalent in outcome to ARRAY_UNION(array, ARRAY()).

Parameters

NameTypeDescription
arraySpecifies the input array from which duplicate elements are removed. The result preserves the order of first occurrence for each distinct value.

Examples

-- Collapse duplicates in a numeric array
SELECT ARRAY_DISTINCT(ARRAY[1, 2, 2, 3, 3, 3]);  -- [1, 2, 3]
-- Deduplicate per-session tags
SELECT session_id, ARRAY_DISTINCT(tags) AS unique_tags
FROM analytics.events.user_sessions;
-- Multiple NULLs collapse to a single NULL
SELECT ARRAY_DISTINCT(ARRAY[1, CAST(NULL AS INT), 2, CAST(NULL AS INT)]);  -- [1, NULL, 2]
-- Empty array stays empty, NULL array stays NULL
SELECT ARRAY_DISTINCT(CAST(ARRAY() AS ARRAY<INT>)) AS empty_arr,
       ARRAY_DISTINCT(CAST(NULL AS ARRAY<INT>)) AS null_arr;
-- Combine with ARRAY_AGG to get a set view of a group
SELECT user_id, ARRAY_DISTINCT(ARRAY_AGG(event_type)) AS distinct_events
FROM analytics.events.user_sessions
GROUP BY user_id;
-- Compare counts to detect duplicate tagging
SELECT session_id
FROM analytics.events.user_sessions
WHERE ARRAY_SIZE(tags) <> ARRAY_SIZE(ARRAY_DISTINCT(tags));

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →