Sales Territory Optimization: SQL + Cypher Interop

Mix Cypher and SQL in a single query: run graph algorithms via the cypher() table function, persist results into Delta, then JOIN with relational order data for a referral-driven recommendation use case.

Category: graph

Syntax

-- Cypher inside SQL: JOIN PageRank scores with the customers Delta table
SELECT c.id, c.name, c.region, c.industry, pr.score AS influence_score
FROM cypher('external.customer_network.customer_network', $$
    CALL algo.pageRank({dampingFactor: 0.85, iterations: 20})
    YIELD node_id, score
    RETURN node_id AS customer_id, score
$$) AS (customer_id BIGINT, score DOUBLE) pr
JOIN external.customer_network.customers c ON pr.customer_id = c.id
ORDER BY pr.score DESC;

-- Persist PageRank results into a Delta table
INSERT INTO external.customer_network.influence_scores
SELECT * FROM cypher('external.customer_network.customer_network', $$
    CALL algo.pageRank({dampingFactor: 0.85, iterations: 20})
    YIELD node_id, score, rank
    RETURN node_id AS customer_id, score AS influence_score, rank AS influence_rank
$$) AS (customer_id BIGINT, influence_score DOUBLE, influence_rank BIGINT);

-- Persist Louvain community assignments
INSERT INTO external.customer_network.community_assignments
SELECT * FROM cypher('external.customer_network.customer_network', $$
    CALL algo.louvain()
    YIELD node_id, community_id
    RETURN node_id AS customer_id, community_id
$$) AS (customer_id BIGINT, community_id BIGINT);

-- SQL JOIN across graph-derived and relational tables
SELECT c.id AS customer_id, c.name, c.region,
       i.influence_score,
       SUM(o.amount)                                  AS total_revenue,
       ROUND(i.influence_score * SUM(o.amount), 2)    AS weighted_influence
FROM external.customer_network.influence_scores i
JOIN external.customer_network.customers c ON i.customer_id = c.id
JOIN external.customer_network.orders    o ON c.id        = o.customer_id
GROUP BY c.id, c.name, c.region, i.influence_score
ORDER BY weighted_influence DESC;

-- Full 4-way territory planning JOIN
SELECT sr.rep_id, sr.rep_name, sr.territory, sr.quota,
       COUNT(DISTINCT c.id)                         AS customer_count,
       ROUND(SUM(i.influence_score), 4)             AS total_influence,
       COUNT(DISTINCT ca.community_id)              AS communities_covered
FROM external.customer_network.sales_reps sr
JOIN external.customer_network.customers c              ON sr.territory = c.region
JOIN external.customer_network.influence_scores i       ON c.id = i.customer_id
JOIN external.customer_network.community_assignments ca ON c.id = ca.customer_id
GROUP BY sr.rep_id, sr.rep_name, sr.territory, sr.quota
ORDER BY total_influence DESC;

-- Mixed CTE: Cypher CTE (degree) + SQL CTE (revenue) in one query
WITH hub_scores AS (
    SELECT * FROM cypher('external.customer_network.customer_network', $$
        CALL algo.degree()
        YIELD node_id, total_degree
        RETURN node_id AS customer_id, total_degree
    $$) AS (customer_id BIGINT, total_degree BIGINT)
),
revenue AS (
    SELECT customer_id, SUM(amount) AS total_revenue
    FROM external.customer_network.orders
    GROUP BY customer_id
)
SELECT c.id, c.name, c.region, h.total_degree, r.total_revenue
FROM hub_scores h
JOIN external.customer_network.customers c ON h.customer_id = c.id
JOIN revenue r                              ON c.id          = r.customer_id
ORDER BY h.total_degree DESC, r.total_revenue DESC;

Description

## When to Use When your analytical workload spans both graph and relational data, customer referral networks with order history, research collaborations with citation counts, product co-purchase graphs with revenue, and you want a single query that answers "who is the most influential high-revenue customer?" without shipping graph results out to another engine. This B2B CRM dataset (40 enterprise customers, 96 referral edges, 120 orders, 8 sales reps) is the canonical interop showcase: Cypher algorithms feed Delta tables, SQL joins combine them with revenue, and mixed CTEs do both in one statement. ## What You Will Learn 1. `cypher('<graph>', $$ ... $$) AS (col1 TYPE, col2 TYPE)`, embed any Cypher query inside a FROM clause as a virtual relational table. The `$$`-delimited string is parsed and executed against the named graph, the column list declares the output schema. 2. `INSERT INTO <delta_table> SELECT * FROM cypher(...)`, persist PageRank scores, Louvain community assignments, or any algorithm output into a Delta table for downstream SQL queries and dashboards. 3. SQL JOINs on Cypher-populated tables: once influence and community tables are materialized, standard JOINs combine them with customers, orders, and sales_reps, no special graph-SQL bridge needed. 4. Mixed CTEs: `WITH hub_scores AS (SELECT * FROM cypher(...) ...), revenue AS (SELECT ... GROUP BY customer_id) SELECT ... JOIN ...`, a single query that does graph degree centrality *and* revenue aggregation, then joins both. 5. Territory/cohort recommendation pattern: `influence_score * total_revenue` gives a weighted score that ranks customers who are both structurally important in the referral network AND high-revenue. ## Prerequisites - Familiarity with basic Cypher (MATCH / CALL algo.*) and Delta DML. - Understanding of the `cypher()` table function's column-list declaration, column count, names, and types must match the Cypher RETURN clause exactly.

Pitfalls

See Also

Open in interactive docs →   DeltaForge home →