Enumerate the K shortest loopless paths between two nodes in increasing weight order
SELECT * FROM graph_yens(table_name, source_id, target_id [, k])
Yen's algorithm enumerates the K shortest loopless paths between a source and a target in non-decreasing order of total weight. The first row is always the globally shortest path; each subsequent row is the cheapest path that differs from every preceding path by at least one edge. Use this when a single shortest-path answer is not enough: route diversification, alternative-route planning, backup-link analysis, top-K explanation queries, and any scenario that asks for the second-best, third-best, or kth-best traversal. The function loads edge data from the registered Delta table into a Compressed Sparse Row (CSR) representation. Source, target, and (optional) weight columns are auto-detected from standard column-name conventions. A_1 is computed by a Dijkstra search from source to target. For each subsequent A_k, the algorithm walks every spur position along A_{k-1}, runs a Dijkstra subproblem from the spur node with the previously committed prefix excluded as visited nodes and with the next-edge of every previously committed path sharing the same prefix excluded as a forbidden transition, and pushes the resulting full path into a candidate heap. The cheapest non-duplicate candidate is then promoted to A_k. The process terminates when k paths have been emitted or the candidate heap is exhausted, whichever comes first. Time complexity is O(K * V * (E + V log V)) in the worst case: K outer iterations, V spur positions per iteration, and a full Dijkstra inside each spur. The implementation matches Neo4j GDS `gds.shortestPath.yens` semantics for ties: when multiple candidate paths share the same total cost, the one whose CSR-index sequence is lexicographically smaller at the first divergence wins, which makes the output deterministic across repeated runs on the same input. GPU acceleration is not available for the SQL surface of this function. Yen's algorithm is structurally serial along the outer K loop because A_k depends on the entire set {A_1, ..., A_{k-1}} to build per-spur exclusion masks. The spur subproblems themselves are parallelisable, but they are bounded by V and individually cheap; the candidate heap is small (at most k * (V - 1) entries) and is maintained on the host. The graph cache (256 MB default budget, LRU eviction, 10-minute idle timeout) retains the CSR topology so repeated queries with different source/target/k arguments do not reload the edge table.
| Name | Type | Description |
|---|---|---|
table_name | Specify the name of the registered Delta table containing edge data. The table must include source and target columns (auto-detected as src/source/src_id and dst/target/dst_id). If a weight column is present, it is used for path cost; otherwise every edge contributes 1.0 to the total cost. Edge direction is respected; symmetrise the input table if undirected semantics are required. | |
source_id | Specify the starting node ID for the path enumeration. The value must match an ID present in the edge table. If the node does not exist or the table is empty, an empty result set is returned. | |
target_id | Specify the destination node ID. The value must match an ID present in the edge table. If source == target the function returns a single rank-1 row with cost 0 and a trivial single-node path. If no path connects source to target the result is empty. | |
k | Set the maximum number of distinct shortest paths to return. Defaults to 3 when omitted. Must be a positive integer. The function returns at most k rows. When the graph contains fewer than k loopless paths between source and target, the result set is shorter than k without warning. Practical values are typically 1 to 16; the algorithmic cost grows linearly with k. |
CREATE TABLE grid_edges AS
SELECT * FROM VALUES
(1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0),
(1, 5, 1.0), (5, 6, 1.0), (6, 4, 1.0),
(1, 7, 2.0), (7, 8, 1.0), (8, 4, 2.0),
(2, 5, 3.0), (3, 6, 3.0)
AS t(src, dst, weight);
SELECT rank, total_cost, nodes
FROM graph_yens('grid_edges', 1, 4, 3)
ORDER BY rank;
SELECT * FROM graph_yens('grid_edges', 1, 4);
SELECT total_cost AS backup_cost, nodes AS backup_route
FROM graph_yens('grid_edges', 1, 4, 2)
WHERE rank = 2;
WITH paths AS (
SELECT rank, total_cost
FROM graph_yens('grid_edges', 1, 4, 5)
)
SELECT
MIN(total_cost) AS best,
MAX(total_cost) AS kth_worst,
MAX(total_cost) - MIN(total_cost) AS spread
FROM paths;
SELECT
CASE WHEN COUNT(*) < 2 THEN 'fragile' ELSE 'redundant' END AS resilience
FROM graph_yens('grid_edges', 1, 4, 2);