Aggregate a column of values into a single array.
ARRAY_AGG(expr)
## Overview Collects all values of an expression across the rows of a group into a single array value. ARRAY_AGG is the inverse of UNNEST and is the standard way to roll a child relation up into a parent row, letting downstream consumers materialise denormalised facts, build tag lists, capture ordered event trails, or ship nested payloads to JSON and Parquet sinks. Typical uses include collecting order line descriptions per order, building per-user event histories, merging child dimension attributes into a single row, and preparing nested arrays for downstream clients that expect embedded collections rather than a join result. ## Behavior - Aggregates values across the rows of the current group (or the entire result set when no GROUP BY is present) into an ARRAY whose element type matches the argument. - NULL inputs are preserved as NULL elements in the result array; the function does not silently drop them. - If the input group contains no rows, the function returns NULL rather than an empty array. LEFT JOIN callers should wrap the result in COALESCE(ARRAY_AGG(x), ARRAY()) when an empty array is desired. - Element ordering follows the order in which rows are delivered to the aggregate. That order is not guaranteed to match any source ordering, so apply an ORDER BY on an inner subquery when determinism is required. - The return type is ARRAY<T> where T is the type of the argument. The argument can itself be an ARRAY, STRUCT, or MAP, producing nested collections. - Works with DISTINCT (ARRAY_AGG(DISTINCT expr)) to deduplicate elements before collection. ## Null and empty handling - When the input column is NULL on a given row, the NULL is included as an element in the output array. - When the group is empty (no rows match the group), ARRAY_AGG returns NULL. Use COALESCE(ARRAY_AGG(expr), ARRAY()) to produce an empty array instead. - If every row in the group produces NULL, the result is an array of NULLs, not NULL itself. ## Compatibility - Matches the standard SQL ARRAY_AGG aggregate. COLLECT_LIST is an alias-style equivalent in many array/map SQL dialects. - Works anywhere an ordinary aggregate is allowed, including GROUP BY queries, window frames that support array-valued results, and HAVING clauses that operate on the aggregated array.
| Name | Type | Description |
|---|---|---|
expr | Specifies the expression whose values are collected across the grouped rows into the resulting array. Accepts any scalar, struct, or nested array expression. NULL values are retained in the output. |
-- Basic aggregation of literal rows
SELECT ARRAY_AGG(x) AS xs FROM (VALUES (1), (2), (3)) AS t(x); -- [1, 2, 3]
-- Collect all event types per user session
SELECT session_id, ARRAY_AGG(event_type) AS event_types
FROM analytics.events.user_sessions
GROUP BY session_id;
-- Aggregation over an empty group via LEFT JOIN
SELECT u.user_id, ARRAY_AGG(e.event_type) AS events
FROM analytics.events.users u
LEFT JOIN analytics.events.user_sessions e ON e.user_id = u.user_id
GROUP BY u.user_id;
-- NULL values are preserved in the output array
SELECT ARRAY_AGG(x) AS xs FROM (VALUES (1), (CAST(NULL AS INT)), (3)) AS t(x); -- [1, NULL, 3]
-- Force deterministic ordering by sorting the inner set first
SELECT ARRAY_AGG(x) AS xs
FROM (SELECT x FROM (VALUES (3), (1), (2)) AS t(x) ORDER BY x) sub; -- [1, 2, 3]
-- Build a materialised session fact row from detail events
SELECT session_id,
ARRAY_AGG(page_url) AS pages_visited,
COUNT(*) AS page_views
FROM analytics.events.user_sessions
GROUP BY session_id;