Compute ArticleRank centrality, a PageRank variant tuned for citation-style graphs
SELECT * FROM graph_articlerank(table_name [, damping] [, max_iterations] [, tolerance])
ArticleRank is a PageRank variant introduced by Li and Willett (2009) that mitigates the bias classic PageRank shows toward nodes with low out-degree. The classic formula divides each source contribution by its out-degree, so a node citing only one neighbour transfers its full rank to that neighbour. In citation graphs this artificially inflates the score of papers cited by sparse-bibliography authors. ArticleRank replaces the divisor with (out_degree[u] + avg_out_degree), where avg_out_degree is the global mean out-degree. The additive term levels the per-edge contribution so the metric tracks genuine importance rather than divisor-shrinkage artefacts. The CPU implementation in delta-forge-graph parallelises a pull-shaped power iteration. For each iteration the dangling mass is summed over zero-out-degree vertices, then every vertex pulls a weighted contribution from its in-neighbours through (out_degree + avg) and the new rank is base_rank + damping * dangling_contrib + damping * incoming. The base term is (1 - damping), not (1 - damping) / N, so the emitted scores are un-normalised, matching the gds.articleRank.stream contract. Convergence is tested with L-infinity on the per-node rank delta and the loop stops as soon as max_iterations is reached or the delta falls below tolerance. Results are returned sorted by score descending. The GPU implementation in delta-forge-graph-gpu is a wgpu port of NVIDIA cuGraph's pagerank_impl.cuh adapted with the ArticleRank additive-base term. Each iteration runs five compute passes: a two-stage dangling reduction, a pre-divide that rewrites ranks in place by (out_degree + avg_out), the pull-shaped iteration that consumes the divided ranks, a max-diff reduction using bitcast plus atomicMax for L-infinity convergence, and a buffer swap. The GPU fast path engages only when the graph has at least 1,000 nodes (thresholds::PAGERANK). Below that threshold the dispatcher returns Ok(None) and the engine runs the CPU implementation. If the graph exceeds GPU memory the buffer upload also returns None and the CPU path takes over silently. A successful GPU run reports execution timings through the algorithm result.
| Name | Type | Description |
|---|---|---|
table_name | Specify the name of the registered Delta table that supplies edges. The loader auto-detects column names src/source/src_id for the source endpoint and dst/target/dst_id for the destination endpoint. An optional weight column, if present, is consumed as an f64 and feeds the weighted out-strength divisor and per-edge contribution. | |
damping | Set the damping factor that scales the contribution from incoming neighbours. Valid range is 0.0 to 1.0. The base rank emitted at every iteration is (1 - damping), so lower values raise the floor for every node and dampen the influence of graph structure. Defaults to 0.85, matching Neo4j GDS gds.articleRank. | |
max_iterations | Cap the number of power-iteration rounds. The driver may exit earlier once L-infinity convergence is reached. Defaults to 20. | |
tolerance | Set the L-infinity convergence threshold on per-node rank delta between successive iterations. Smaller values force more iterations and tighter convergence. Defaults to 1e-7. |
CREATE TABLE citations AS
SELECT * FROM VALUES
(1, 2, 1.0),
(2, 3, 1.0),
(3, 1, 1.0),
(4, 1, 1.0),
(4, 2, 1.0),
(5, 4, 1.0)
AS t(src, dst, weight);
SELECT * FROM graph_articlerank('citations');
SELECT * FROM graph_articlerank('citations', 0.85, 100, 1e-9);
SELECT * FROM graph_articlerank('citations', 0.50);
SELECT node_id, score
FROM graph_articlerank('citations')
ORDER BY score DESC
LIMIT 10;
SELECT a.node_id, m.title, m.year, a.score
FROM graph_articlerank('citations') a
JOIN articles m ON a.node_id = m.id
ORDER BY a.score DESC;