WHERE

Filters results based on a predicate expression.

Category: query-language

Syntax

WHERE <predicate>

Description

## Overview WHERE constrains the rows produced by a preceding MATCH, OPTIONAL MATCH, or WITH clause. It evaluates a boolean predicate against each candidate binding and discards any row where the predicate evaluates to false or null. WHERE is not a standalone clause; it must follow a clause that establishes variable bindings. In DeltaForge, WHERE predicates are pushed down into the graph traversal engine wherever possible, filtering nodes and edges during CSR adjacency iteration rather than after full pattern expansion. This is critical for performance on large graphs backed by Delta tables. ## Behavior - Predicate pushdown applies to property comparisons, label checks, and constant expressions. The engine evaluates these filters during neighbor iteration to minimize intermediate result size. - When the predicate constrains only the seed variable of the MATCH pattern, the planner may translate the WHERE clause to SQL and execute it via DataFusion against the underlying Parquet files. This path benefits from zone-map pruning and column pushdown, producing a pre-filtered seed set before the graph engine runs. - Pattern predicates (e.g., `WHERE (n)-->()` or `WHERE NOT (n)<--()`) perform an existence check by probing the CSR adjacency list. They do not return the matched pattern; they only confirm or deny its existence. - NULL comparisons follow three-valued logic: any comparison involving NULL yields NULL (not false), which WHERE treats as non-matching. Use IS NULL or IS NOT NULL for explicit null checks. - The IN operator accepts a literal list or a list expression. It is equivalent to multiple OR-ed equality checks. ## Limitations - Regular expression matching (=~) compiles the pattern at query time. Complex regex patterns on large result sets may be slow; consider using STARTS WITH or CONTAINS for prefix and substring checks. - WHERE cannot reference variables introduced in the same MATCH pattern by inline WHERE predicates on nodes/relationships. Use the clause-level WHERE for cross-variable predicates. - SQL-seed pushdown is rejected for function calls (toUpper, toLower, id, labels, type), parameters ($name), arithmetic inside predicates, multi-variable predicates such as `a.x = b.y`, regex match (=~), comprehensions, CASE, REDUCE, and list predicates (ANY, ALL, NONE, SINGLE). Any unsupported fragment causes the whole predicate to evaluate on the graph-engine path instead. Rejection is always safe (never produces wrong results) but loses the zone-map pruning benefit.

Parameters

NameTypeDescription
predicateSpecifies a boolean expression that each candidate row must satisfy. Supports comparison operators (=, <>, <, >, <=, >=), logical connectives (AND, OR, NOT, XOR), null checks (IS NULL, IS NOT NULL), string predicates (STARTS WITH, ENDS WITH, CONTAINS), list membership (IN), regex matching (=~), and pattern predicates for existence checks.

Examples

-- Simple property filter
USE my_zone.my_schema.my_graph
MATCH (n:Employee)
WHERE n.age > 30 AND n.city = 'London'
RETURN n.name, n.age;
-- List membership with IN operator
USE my_zone.my_schema.my_graph
MATCH (n)
WHERE n.active = true AND n.level IN ['L3', 'L4', 'L5']
RETURN n.name AS name, n.level AS level
ORDER BY n.level DESC;
-- Filter on edge properties
USE my_zone.my_schema.my_graph
MATCH (a)-[r]->(b)
WHERE r.weight > 0.8
RETURN a.name AS source, b.name AS target, r.weight AS strength
ORDER BY r.weight DESC;
-- Cross-department filter using property comparison
USE my_zone.my_schema.my_graph
MATCH (a)-[r]->(b)
WHERE a.department <> b.department
RETURN a.name AS src, a.department AS src_dept,
       b.name AS dst, b.department AS dst_dept;
-- Negative pattern predicate: nodes with no outgoing edges
USE my_zone.research_network.research_network
MATCH (n)
WHERE NOT (n)-->()
RETURN n.name AS name, n.department AS department;
-- Combine node inequality with path filter
USE my_zone.my_schema.my_graph
MATCH (a)-[:advisor]->(b)-[:advisor]->(c)
WHERE a <> c
RETURN a.name AS senior, c.name AS junior
ORDER BY a.name;

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →