Execution Hints

Control the execution device (CPU or GPU) and the property loading mode (EAGER, STREAMING, or AUTO).

Category: query-language

Syntax

USE <graph_name>
ON STREAMING [CACHE <n>] [BATCH <n>]
ON EAGER
ON AUTO
ON GPU [THRESHOLD <n>] [STREAMING [CACHE <n>] [BATCH <n>] | EAGER | AUTO]
-- per-clause override (after the clause body):
MATCH <pattern> [ON GPU [THRESHOLD <n>] | ON CPU]
CALL <proc>(<args>) [ON GPU [THRESHOLD <n>] | ON CPU]

Description

## Overview

Execution hints provide fine-grained control over how DeltaForge executes Cypher queries. The USE clause selects the target graph definition (mandatory for all Cypher queries), while ON clauses tune the execution device and property loading strategy. These hints enable optimization for different graph sizes and workload characteristics without changing the query logic.

DeltaForge supports three property loading modes (EAGER, STREAMING, AUTO) and two execution devices (CPU, GPU). The defaults (EAGER + CPU) are optimized for small to medium graphs that fit comfortably in memory. For large graphs with millions of nodes, STREAMING mode loads properties on demand with LRU caching, and GPU mode accelerates supported algorithms and MATCH expansion via wgpu compute shaders that run on any DirectX 12, Vulkan, or Metal GPU.

## Behavior

- USE is required and must appear before any other clause. It resolves the graph name against the graph definition registry, which maps to the underlying Delta tables for nodes and edges.
- EAGER mode (default) loads all node and edge properties into memory at query start. This provides the fastest property access but requires sufficient RAM for the entire graph.
- STREAMING mode uses an LRU cache with configurable cache size and batch size. Properties are fetched from Delta tables on demand. This is suitable for graphs too large to fit entirely in memory.
- AUTO mode inspects the graph size at query time and selects EAGER or STREAMING automatically based on an internal threshold.
- ON GPU at the query level (placed after USE) enables GPU execution for every supported operator in the query. ON GPU can also be written as a per-clause suffix after a MATCH pattern or after a CALL argument list (e.g. CALL algo.pageRank(...) ON GPU THRESHOLD 100000); the per-clause suffix overrides the query-level setting for that single clause. The per-CALL suffix is the recommended form for algorithm calls because it keeps the device hint next to the algorithm it controls rather than detached at the top of the statement.
- ON GPU activates GPU acceleration for single-hop MATCH expansion, PageRank, betweenness centrality, connected components, Louvain community detection, and triangle count. Multi-hop MATCH patterns reuse the same GPU expansion kernel hop by hop.
- If the graph node count is below THRESHOLD, the GPU accelerator is not registered, or GPU hardware is unavailable, the engine silently falls back to CPU. The fallback is per-operator: a query may run PageRank on GPU and the following MATCH on CPU if thresholds differ.
- Execution hints are composable: ON GPU and ON STREAMING can be combined as ON GPU STREAMING. Property loading always runs on CPU regardless of the execution device.

## Limitations

- GPU acceleration covers the five graph algorithms listed above plus single-hop MATCH expansion. Other clauses (WHERE, RETURN, WITH projection, ORDER BY, aggregation, UNION) always execute on CPU. Property materialization after a GPU MATCH also runs on CPU.
- STREAMING mode adds per-property-access overhead due to cache lookups. For small graphs, EAGER mode is always faster.
- The CACHE and BATCH parameters for STREAMING mode apply to node properties only. Edge properties use a separate fixed-size cache.
- THRESHOLD is compared against the graph node count, not the number of matched source nodes. A small MATCH over a large graph still triggers the GPU path if the graph itself exceeds the threshold.
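
The composition and override rules described above can be sketched with two short queries. This is an illustration built only from the grammar in the Syntax section, not a tested statement: the graph name and the `id` and `weight` properties are placeholders, and the numeric values are arbitrary. The first query combines a device hint with a property loading mode at the query level; the second sets ON GPU for the whole query and then opts a single MATCH back out with a per-clause ON CPU suffix.

```cypher
-- Query-level hint: GPU execution combined with STREAMING property loading
USE my_zone.my_schema.huge_graph
ON GPU THRESHOLD 100000 STREAMING CACHE 200000 BATCH 5000
MATCH (a)-[r]->(b)
RETURN a.id AS src_id, b.id AS dst_id, r.weight AS weight;

-- Per-clause override: ON CPU on this MATCH overrides the query-level ON GPU
USE my_zone.my_schema.huge_graph
ON GPU THRESHOLD 100000
MATCH (a)-[r]->(b) ON CPU
RETURN a.id AS src_id, b.id AS dst_id;
```

In a realistic query the second form is mainly useful when other clauses in the same statement should stay on the GPU while one expansion is pinned to the CPU.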

Parameters

Name	Type	Description
`graph_name`		Specifies the graph definition to query against, using dot-qualified naming (zone.schema.graph). The graph definition maps to underlying Delta tables for nodes and edges. Required for all Cypher queries to establish the graph context.
`device`		Specifies the execution device. Valid values: CPU (default) or GPU. GPU execution is opt-in via the ON GPU hint. If the graph node count is below THRESHOLD, the GPU accelerator is not registered, or GPU hardware is unavailable, the engine silently falls back to CPU.
`property_mode`		Specifies the property loading strategy. Valid values: EAGER (default, loads all properties into memory upfront), STREAMING (loads properties on-demand with LRU caching), or AUTO (automatically selects based on graph size). STREAMING accepts optional CACHE (default 100,000 nodes) and BATCH (default 1,000 nodes) parameters.
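
As a sketch of how the STREAMING defaults read in practice: assuming the documented defaults are applied whenever CACHE and BATCH are omitted, the two hints below should behave the same. The graph name and the `name` property are placeholders. Writing the values out explicitly is mostly useful as a starting point for tuning.

```cypher
-- Relies on the documented defaults (CACHE 100000, BATCH 1000)
USE my_zone.my_schema.my_graph
ON STREAMING
MATCH (n)
RETURN n.name AS name;

-- The same settings written out explicitly
USE my_zone.my_schema.my_graph
ON STREAMING CACHE 100000 BATCH 1000
MATCH (n)
RETURN n.name AS name;
```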

Examples

-- Basic graph selection with USE
USE my_zone.my_schema.my_graph
MATCH (n)
RETURN n.name AS name;

-- Streaming mode for large graphs
USE my_zone.large_schema.large_graph
ON STREAMING CACHE 200000 BATCH 5000
MATCH (n)
RETURN n.name AS name, n.value AS value
ORDER BY n.value DESC
LIMIT 100;

-- GPU execution for compute-intensive algorithm
USE my_zone.my_schema.my_graph
ON GPU THRESHOLD 100000
CALL algo.pageRank({dampingFactor: 0.85, iterations: 50})
YIELD node_id, score
RETURN node_id, score
ORDER BY score DESC;

-- GPU-accelerated MATCH expansion (per-MATCH suffix)
USE my_zone.my_schema.huge_graph
MATCH (a)-[r]->(b) ON GPU THRESHOLD 100000
RETURN a.id AS src_id, b.id AS dst_id, r.weight;

-- Per-CALL GPU suffix: hint sits next to the algorithm it controls
USE my_zone.my_schema.huge_graph
CALL algo.pageRank({dampingFactor: 0.85, iterations: 50}) ON GPU THRESHOLD 100000
YIELD node_id, score
RETURN node_id, score
ORDER BY score DESC;

-- Auto property mode: let the engine decide
USE my_zone.my_schema.my_graph
ON AUTO
MATCH (a)-[r]->(b)
RETURN a.name AS source, b.name AS target, r.weight AS weight;

Pitfalls

Omitting the USE clause produces an error because no graph context is established. Every Cypher query must begin with USE <zone>.<schema>.<graph>.
EAGER mode on graphs with millions of nodes can cause out-of-memory errors. Switch to STREAMING or AUTO mode for large graphs.
ON GPU with a low THRESHOLD may route small graphs to the GPU, where the overhead of GPU memory transfer exceeds the computation benefit. Set the threshold to at least 10,000 nodes for typical workloads.
STREAMING mode with a very small CACHE causes frequent cache evictions and repeated Delta table reads, degrading performance. Size the cache to hold at least the working set of the traversal pattern.
GPU MATCH expansion returns topology efficiently, but returning bare variables (RETURN a, r, b) forces full CPU-side property propagation, which often dominates runtime. Project only the properties you need, or materialize topology into a temp table and JOIN properties back via SQL.
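
The last pitfall is easiest to see side by side. The pair below is a minimal sketch, with a placeholder graph name and assumed `id` and `weight` properties; it shows the two RETURN shapes only and makes no claim about actual timings.

```cypher
-- Returns bare variables: forces full CPU-side property propagation for a, r, and b
USE my_zone.my_schema.huge_graph
MATCH (a)-[r]->(b) ON GPU THRESHOLD 100000
RETURN a, r, b;

-- Projects only the properties the caller needs
USE my_zone.my_schema.huge_graph
MATCH (a)-[r]->(b) ON GPU THRESHOLD 100000
RETURN a.id AS src_id, b.id AS dst_id, r.weight AS weight;
```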