Count the number of triangles each node participates in
SELECT * FROM graph_triangle_count(table_name)
Triangle count computes the number of triangles that each node participates in. A triangle is a cycle of length three: three nodes, each connected to the other two. The per-node triangle count is a local measure of clustering density and is the building block for computing the clustering coefficient, which quantifies how tightly connected a node's neighborhood is. High triangle counts indicate nodes embedded in cohesive subgroups, which is useful for fraud detection, social cohesion analysis, and network robustness assessment. The function loads edge data from a registered Delta table as an undirected graph in Compressed Sparse Row (CSR) format. The algorithm iterates over each edge and counts the number of common neighbors between the two endpoints, summing contributions to each involved node. Column names for source and target are auto-detected from standard conventions (src/source/src_id, dst/target/dst_id). The time complexity is O(E * sqrt(E)) when using a degree-ordered intersection approach, or O(sum of min-degree products) in practice. For sparse graphs this is near-linear, but for dense graphs with high-degree nodes it can be quadratic. The total number of unique triangles in the graph equals the sum of all per-node triangle counts divided by 3. GPU acceleration is available for triangle counting. When a GPU accelerator is registered, the engine attempts parallel edge-intersection on the GPU and falls back to CPU if GPU resources are unavailable. GPU execution provides significant speedup for dense graphs with many high-degree nodes. The graph cache (256 MB default budget, LRU eviction, 10-minute idle timeout) retains the CSR topology for reuse across calls.
| Name | Type | Description |
|---|---|---|
table_name | Specify the name of the registered Delta table containing edge data. The table must include source and target columns (auto-detected as src/source/src_id and dst/target/dst_id). The graph is loaded as undirected for triangle enumeration. Edge weights are ignored. |
CREATE TABLE social_network AS
SELECT * FROM VALUES
(1, 2, 1.0),
(2, 3, 1.0),
(3, 1, 1.0),
(3, 4, 1.0),
(4, 5, 1.0),
(5, 3, 1.0)
AS t(src, dst, weight);
SELECT * FROM graph_triangle_count('social_network');
SELECT SUM(triangle_count) / 3 AS total_triangles
FROM graph_triangle_count('social_network');
SELECT
t.node_id,
t.triangle_count,
d.total_degree,
CASE
WHEN d.total_degree < 2 THEN 0.0
ELSE (2.0 * t.triangle_count) / (d.total_degree * (d.total_degree - 1))
END AS clustering_coefficient
FROM graph_triangle_count('social_network') t
JOIN graph_degree('social_network') d
ON t.node_id = d.node_id
ORDER BY clustering_coefficient DESC;
SELECT node_id, triangle_count
FROM graph_triangle_count('social_network')
WHERE triangle_count > 0
ORDER BY triangle_count DESC;