Karate Club: Graph Basics, Ingestion, and Verified Algorithms

Load Zachary's Karate Club into Delta tables, build an undirected graph, and run Cypher patterns plus 15 algorithms with NetworkX-verified golden values.

Category: graph

Syntax

-- Ingestion
CREATE ZONE IF NOT EXISTS external TYPE EXTERNAL;
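CREATE SCHEMA IF NOT EXISTS external.karate_club;
CREATE SCHEMA IF NOT EXISTS external.karate_club_raw;  -- staging schema for the raw CSV tables

CREATE EXTERNAL TABLE external.karate_club_raw.karate_edges
USING CSV LOCATION '{{data_path}}/edges.csv'
OPTIONS (header = 'true', delimiter = '|');

CREATE EXTERNAL TABLE external.karate_club_raw.karate_vertices  -- options assumed to mirror karate_edges
USING CSV LOCATION '{{data_path}}/vertices.csv'
OPTIONS (header = 'true', delimiter = '|');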

CREATE DELTA TABLE external.karate_club.edges
LOCATION '{{data_path}}/delta/edges' AS
SELECT CAST(src AS BIGINT) AS src, CAST(dst AS BIGINT) AS dst,
       CAST(weight AS DOUBLE) AS weight, CAST(edge_type AS VARCHAR) AS edge_type
FROM external.karate_club_raw.karate_edges;

CREATE DELTA TABLE external.karate_club.vertices
LOCATION '{{data_path}}/delta/vertices' AS
SELECT CAST(vertex_id AS BIGINT) AS vertex_id,
       CAST(name AS VARCHAR) AS name,
       CAST(category AS VARCHAR) AS role
FROM external.karate_club_raw.karate_vertices;

OPTIMIZE external.karate_club.edges ZORDER BY (src, dst);

-- Graph definition (UNDIRECTED, engine materializes reverse edges at CSR build)
CREATE GRAPH external.karate_club.karate_club
  VERTEX TABLE external.karate_club.vertices
    ID COLUMN vertex_id NODE TYPE COLUMN role NODE NAME COLUMN name
  EDGE TABLE external.karate_club.edges
    SOURCE COLUMN src TARGET COLUMN dst
    WEIGHT COLUMN weight EDGE TYPE COLUMN edge_type
  UNDIRECTED;

CREATE GRAPHCSR external.karate_club.karate_club;

-- Basic Cypher
USE external.karate_club.karate_club
MATCH (v) RETURN v.id AS member_id, v.name AS name, v.role AS role ORDER BY member_id;

USE external.karate_club.karate_club
MATCH (a)-[r]->(b) WHERE a.id = 0
RETURN b.id AS friend_id, b.name AS friend_name ORDER BY friend_id;

-- Variable-length path: 2-hop reachability from node 0
USE external.karate_club.karate_club
MATCH (a)-[*1..2]->(b) WHERE a.id = 0
RETURN COUNT(DISTINCT b.id) AS reachable_in_2_hops;

-- Algorithms with golden-value asserts
ASSERT VALUE rank = 1 WHERE node_id = 33
ASSERT VALUE rank = 2 WHERE node_id = 0
USE external.karate_club.karate_club
CALL algo.pageRank({dampingFactor: 0.85, iterations: 20})
YIELD node_id, score, rank
RETURN node_id, score, rank ORDER BY score DESC LIMIT 10;

USE external.karate_club.karate_club
CALL algo.betweenness() YIELD node_id, centrality, rank
RETURN node_id, centrality, rank ORDER BY centrality DESC LIMIT 10;

-- Louvain is non-deterministic: use ASSERT WARNING for bounds
ASSERT WARNING ROW_COUNT >= 3
ASSERT WARNING ROW_COUNT <= 10
USE external.karate_club.karate_club
CALL algo.louvain({resolution: 1.0}) YIELD node_id, community_id
RETURN community_id, count(*) AS members ORDER BY members DESC;

-- Shortest path between the two faction leaders
USE external.karate_club.karate_club
CALL algo.shortestPath({source: 0, target: 33})
YIELD node_id, step, distance
RETURN node_id, step, distance ORDER BY step;
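
The values asserted above can also be sanity-checked outside the engine. The sketch below is a dependency-free Python cross-check and is not part of the demo. It assumes the shipped edges.csv matches the edge list commonly distributed with NetworkX (transcribed here with src < dst), and the helper names (`build_adjacency`, `bfs_distances`, `pagerank`) are hypothetical. The PageRank is a plain power iteration using the same damping factor and iteration count as the call above; how DeltaForge implements it internally is not stated here.

```python
from collections import deque

# Upper-triangle adjacency (src < dst), transcribed from the widely distributed
# NetworkX version of the dataset. Assumption: the shipped edges.csv matches it.
HIGHER_NEIGHBORS = {
    0: [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 17, 19, 21, 31],
    1: [2, 3, 7, 13, 17, 19, 21, 30],
    2: [3, 7, 8, 9, 13, 27, 28, 32],
    3: [7, 12, 13],
    4: [6, 10],
    5: [6, 10, 16],
    6: [16],
    8: [30, 32, 33],
    9: [33],
    13: [33],
    14: [32, 33],
    15: [32, 33],
    18: [32, 33],
    19: [33],
    20: [32, 33],
    22: [32, 33],
    23: [25, 27, 29, 32, 33],
    24: [25, 27, 31],
    25: [31],
    26: [29, 33],
    27: [33],
    28: [31, 33],
    29: [32, 33],
    30: [32, 33],
    31: [32, 33],
    32: [33],
}
EDGES = [(s, d) for s, ds in HIGHER_NEIGHBORS.items() for d in ds]


def build_adjacency(edges):
    """Store every undirected edge in both directions."""
    adj = {}
    for s, d in edges:
        adj.setdefault(s, set()).add(d)
        adj.setdefault(d, set()).add(s)
    return adj


def bfs_distances(adj, source):
    """Unweighted hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pagerank(adj, damping=0.85, iterations=20):
    """Plain power iteration; the graph has no dangling nodes, so scores sum to 1."""
    n = len(adj)
    score = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in adj}
        for u, nbrs in adj.items():
            share = damping * score[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        score = nxt
    return score


ADJ = build_adjacency(EDGES)
DIST = bfs_distances(ADJ, 0)
WITHIN_TWO = sum(1 for d in DIST.values() if 1 <= d <= 2)  # excludes node 0 itself
SCORES = pagerank(ADJ)
TOP_TWO = sorted(SCORES, key=SCORES.get, reverse=True)[:2]

print(len(ADJ), len(EDGES))        # 34 78
print(len(ADJ[0]), len(ADJ[33]))   # 16 17
print(DIST[33])                    # 2
print(WITHIN_TWO)                  # 25
print(TOP_TWO)                     # compare with the pageRank rank asserts
```

If these numbers disagree with what the engine returns, check the ingested row counts first. Note that `WITHIN_TWO` excludes node 0; whether the `[*1..2]` pattern counts the start node depends on the engine's path semantics, so treat 25 as a reference point rather than the expected query result.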

Description

## When to Use

Start here when you want a minimal end-to-end graph workflow against a canonical dataset: ingest vertices and edges from CSV into Delta tables, declare a named graph, warm the CSR cache, and run your first Cypher `MATCH` / `CALL algo.*` queries. Because Zachary's Karate Club (34 nodes, 78 undirected edges) has been studied for nearly 50 years, every algorithm in this demo is asserted against published NetworkX reference values, so you can trust that your install is producing correct results before moving to larger graphs.

## What You Will Learn

1. Load vertex and edge CSVs via `CREATE EXTERNAL TABLE` + CTAS into Delta tables with typed columns.
2. Declare a named graph with `CREATE GRAPH ... VERTEX TABLE ... EDGE TABLE ... UNDIRECTED` and warm the topology with `CREATE GRAPHCSR`.
3. Execute basic Cypher: `MATCH (n) RETURN ...`, `MATCH (a)-[r]->(b)`, variable-length paths `[*1..2]`, and degree counts.
4. Call DeltaForge's graph algorithm library: `pageRank`, `degree`, `betweenness`, `closeness`, `louvain`, `connectedComponents`, `scc`, `shortestPath`, `allShortestPaths`, `bfs`, `dfs`, `mst`, `triangleCount`, `knn`, `similarity`.
5. Use `ASSERT VALUE ... WHERE ...` to pin algorithm outputs to published golden values and `ASSERT WARNING` for non-deterministic results (Louvain, MST on equal weights).

## Prerequisites

- DeltaForge engine with graph + Cypher support enabled.
- Write access to `{{data_path}}` for Delta table storage.
- The demo ships `edges.csv` (78 canonical rows, src<dst, weight=1.0) and `vertices.csv` (34 members with role).
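
The CSR warm-up can be pictured with a textbook compressed-sparse-row build. The sketch below is illustrative only: DeltaForge's actual in-memory layout is not described in this document, the function names are hypothetical, and the 4-vertex graph is a toy, not the karate club data. It shows how canonical src < dst rows of an undirected graph are expanded into both directions and then packed into offset and target arrays.

```python
def build_undirected_csr(num_vertices, edges):
    """Return (offsets, targets) with each undirected edge stored in both directions."""
    # Materialize the reverse of every canonical (src < dst) row, then sort by source.
    directed = sorted([(s, d) for s, d in edges] + [(d, s) for s, d in edges])

    # Count outgoing entries per vertex, then turn the counts into prefix sums.
    offsets = [0] * (num_vertices + 1)
    for s, _ in directed:
        offsets[s + 1] += 1
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]

    targets = [d for _, d in directed]
    return offsets, targets


def neighbors(offsets, targets, v):
    """Neighbors of v are one contiguous slice of the targets array."""
    return targets[offsets[v]:offsets[v + 1]]


# Hypothetical toy graph: 4 vertices, 4 undirected edges.
TOY_EDGES = [(0, 1), (0, 2), (1, 2), (2, 3)]
OFFSETS, TARGETS = build_undirected_csr(4, TOY_EDGES)

print(OFFSETS)                          # [0, 2, 4, 7, 8]
print(TARGETS)                          # [1, 2, 0, 2, 0, 1, 3, 2]
print(neighbors(OFFSETS, TARGETS, 2))   # [0, 1, 3]
```

Under this layout the 78 canonical rows in `edges.csv` would become 156 directed entries, which is why the edge file only needs to list each pair once.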

Pitfalls

See Also
