skills/neo4j-contrib/neo4j-skills/neo4j-cypher-skill

neo4j-cypher-skill

Installation
SKILL.md

When to Use

  • Writing, optimizing, or debugging Cypher queries
  • Graph pattern matching, QPEs, variable-length paths
  • Vector/fulltext search, subqueries, batch writes, LOAD CSV

When NOT to Use

  • Driver migration/API changesneo4j-migration-skill
  • DB admin (users, config, backups) → neo4j-cli-tools-skill

GQL conformance note: LET, FINISH, FILTER, and INSERT are valid Cypher 25 clauses (introduced via GQL conformance, mostly in Neo4j 2025.06). On older versions, fall back to WITH / (omit RETURN) / WHERE / CREATE. INSERT requires &-separated multi-labels and does not support dynamic labels/types.


Pre-flight

? Known Unknown
Schema Use directly Run Schema-First Protocol
Neo4j version Use version features Default to 2025.01 safe set
Executing (not generating)? Use EXPLAIN + write gate State query is unvalidated

Schema unknown + no tool → produce non-executable sketch outside a code block:

(<SOURCE_LABEL> {<KEY>: $value})-[:<REL_TYPE>]->(<TARGET_LABEL>)

Never fill guessed names — realistic guesses get copied blindly.


Defaults — apply every query

  1. CYPHER 25 — first token; never repeat after UNION or inside subqueries
  2. Schema first — inspect before writing; if schema in prompt, use it directly
  3. MERGE on constrained key only; rel MERGE on already-bound endpoints only
  4. Label-free MATCH (n) forbidden unless bound or followed by WHERE n:$($label)
  5. LIMIT 25 default on all exploratory reads; push WITH n LIMIT before high-cardinality operations (variable-length traversals, fan-out MATCH, Cartesian products)
  6. Comments: // only — -- is SQL, invalid
  7. REPEATABLE ELEMENTS / DIFFERENT RELATIONSHIPS go after MATCH, not end of pattern
  8. SHOW commands: YIELD before WHERE; no UNION
  9. Inline node predicates (:Label WHERE p=x) — valid in MATCH only
  10. WHERE cannot follow bare UNWIND — use WITH x WHERE
  11. (a)-[:R]-(b) — undirected matches both directions, double-counts; use directed unless unknown
  12. DETACH DELETE — plain DELETE throws if node has relationships

Style

Element Convention
Node labels PascalCase :Person
Rel types SCREAMING_SNAKE_CASE :KNOWS
Properties/vars camelCase firstName
Clauses UPPERCASE MATCH
Booleans/null lowercase true false null
Strings single-quoted; double only if contains '

Schema is truth. :Person, :KNOWS, name in examples are illustrative — substitute real names from schema.


Schema-First Protocol

Schema in context → use it, skip inspection.

Schema missing → run:

CALL db.schema.visualization() YIELD nodes, relationships RETURN nodes, relationships;
SHOW INDEXES YIELD name, type, labelsOrTypes, properties, state WHERE state = 'ONLINE';
SHOW CONSTRAINTS YIELD name, type, labelsOrTypes, properties;
SHOW PROCEDURES YIELD name RETURN split(name,'.')[0] AS namespace, count(*) AS procedures;

Property types per label — check APOC first:

// If APOC available (preferred — use this):
CALL apoc.meta.schema() YIELD value RETURN value;

// No APOC AND database ≤ 100k nodes/rels only (expensive on large graphs):
CALL db.schema.nodeTypeProperties() YIELD nodeType, propertyName, propertyTypes, mandatory;
CALL db.schema.relTypeProperties() YIELD relType, propertyName, propertyTypes, mandatory;

Validate before returning any query: label exists · rel type+direction correct · property on that label · index ONLINE.


Key Patterns

MERGE

// MERGE on constrained key; set extras in ON CREATE/ON MATCH
CYPHER 25
MATCH (a:Person {id: $a}) MATCH (b:Person {id: $b})
MERGE (a)-[r:KNOWS]->(b)
  ON CREATE SET r.since = date()
  ON MATCH  SET r.lastSeen = date()

SET n = {} replaces all props. SET n += {} merges (safe partial update). Use += for updates.

WITH scope

CYPHER 25
MATCH (a:Person)-[:KNOWS]->(b:Person)
WITH a, count(*) AS friends   // b dropped here
WHERE friends > 5
RETURN a.name, friends ORDER BY friends DESC

Every var not listed in WITH is dropped. WITH * carries all forward.

Subqueries — cheat sheet

EXISTS { (a)-[:R]->(b) }                           // boolean check
COUNT  { (a)-[:R]->(b) WHERE a.x > 0 }             // count
COLLECT { MATCH (a)-[:R]->(b) RETURN b.name }       // collect list (full MATCH+RETURN required)
CALL (p) { MATCH (p)-[:ACTED_IN]->(m) RETURN m }    // correlated subquery (explicit import)
OPTIONAL CALL (p) { ... }                           // nullable subquery

CALL { WITH x ... } deprecated → CALL (x) { ... }. COLLECT {} returns exactly one column.

CALL IN TRANSACTIONS (bulk writes)

CYPHER 25
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
CALL (row) {
  MERGE (p:Person {id: row.id}) SET p += row
} IN TRANSACTIONS OF 1000 ROWS ON ERROR CONTINUE REPORT STATUS AS s

Input stream must be outside subquery. Auto-commit only — never wrap in beginTransaction(). PERIODIC COMMIT deprecated.

QPE basics

CYPHER 25
MATCH SHORTEST 1 (a:Person {name:'Alice'})(()-[:KNOWS]->()){1,}(b:Person {name:'Bob'})
RETURN b.name

// ACYCLIC [2026.03] — no repeated nodes within a path (prevents cycles)
CYPHER 25
MATCH p = ACYCLIC (start:Router {name: $from})-[:LINK]-+(end:Router {name: $to})
RETURN [n IN nodes(p) | n.name] AS route
ORDER BY length(p) LIMIT 5

Quantifier outside group: (pattern){N,M}. Groups start+end with node. REPEATABLE ELEMENTS needs bounded {m,n}. ACYCLIC implies nodes cannot repeat within a path (stronger than default DIFFERENT RELATIONSHIPS).

Match mode — add after MATCH:

  • DIFFERENT RELATIONSHIPS (default) — each rel traversed once per path
  • REPEATABLE ELEMENTS [2025.x] — nodes/rels revisitable; use for circular routes, weight-optimized paths, constrained backtracking; requires bounded {m,n}

Conditional CALL subqueries [2025.06]

CYPHER 25
MATCH (move:Item {id: $id})
OPTIONAL MATCH (insertBefore:Item {id: $before})
OPTIONAL MATCH (insertAfter:Item  {id: $after})
CALL (move, insertBefore, insertAfter) {
  WHEN insertBefore IS NULL THEN {
    MATCH (last:Item) WHERE NOT (last)-[:NEXT]->() AND last <> move
    CREATE (last)-[:NEXT]->(move)
  }
  WHEN insertAfter IS NULL THEN {
    CREATE (move)-[:NEXT]->(insertBefore)
  }
  ELSE {
    CREATE (insertAfter)-[:NEXT]->(move)
    CREATE (move)-[:NEXT]->(insertBefore)
  }
}

Use WHEN…THEN…ELSE for if-else-if write logic; mutually exclusive (first match wins). Not available pre-2025.06.

Dynamic relationship types [2025.x]

// Create/match/merge with dynamic rel type (must resolve to exactly one STRING)
CYPHER 25 CREATE (a:Node)-[:$($relType)]->(b:Node)
CYPHER 25 MATCH  (a:Node)-[:$($relType)]->(b:Node) RETURN a.name, b.name

Spatial / Point

// WGS84 geographic point
SET n.coords = point({longitude: $lon, latitude: $lat})

// Distance in metres; requires POINT index for performance
MATCH (a:Place {name: $origin}) MATCH (b:Place)
RETURN b.name, point.distance(a.coords, b.coords) AS distM
ORDER BY distM LIMIT 10

// Bounding-box pre-filter (uses POINT index) then distance
MATCH (b:Place)
WHERE point.withinBBox(b.coords,
        point({longitude: $west, latitude: $south}),
        point({longitude: $east, latitude: $north}))
RETURN b.name, point.distance(b.coords, $origin) AS distM

Create POINT index: CREATE POINT INDEX name IF NOT EXISTS FOR (n:Place) ON (n.coords)

Aggregation grouping keys

Non-aggregating expressions in RETURN/WITH are implicit grouping keys — no GROUP BY needed:

// actor + director are grouping keys; count(*) is the aggregate
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name, d.name, count(*) AS collaborations
ORDER BY collaborations DESC

count(n) counts non-null; count(*) counts rows including nulls. collect(DISTINCT expr) deduplicates. count() is faster than size(collect()) — count() reads the internal store; collect() builds a list first.


Common Syntax Traps (top causes of broken queries)

Wrong Right
ORDER BY n.prop AS x DESC ORDER BY n.prop DESC
ORDER BY preAggVar after agg RETURN Use RETURN alias
count(r WHERE r.x=5) sum(CASE WHEN r.x=5 THEN 1 ELSE 0 END)
UNWIND list AS x WHERE x>5 UNWIND list AS x WITH x WHERE x>5
least(a,b) / greatest(a,b) CASE WHEN a<b THEN a ELSE b END
-- comment // comment
shortestPath((a)-[*]->(b)) SHORTEST 1 (a)(()-[]->()){1,}(b)
id(n) elementId(n)
[:REL*1..5] (()-[:REL]->()){1,5}
CALL { WITH x ... } CALL (x) { ... }
COLLECT { (a)-[:R]->(b) } COLLECT { MATCH ... RETURN b }
SET n = {k:v} partial update SET n += {k:v}
DELETE n with relationships DETACH DELETE n
WHERE n.x = null WHERE n.x IS NULL
toInteger(null) throws toIntegerOrNull(null)
n.$key dynamic property n[$key]
SET n:$label SET n:$($label)
ZONED DATETIME >= date(...) → 0 rows Use datetime(...) or .year accessor
FOREACH ... RETURN UNWIND ... RETURN

Full trap table → references/syntax-traps.md


Output Mode and Write Gate

Default: parameterized queries. Return named properties, not full nodes or RETURN *.

// RIGHT: agent gets named fields it can reason over
CYPHER 25 MATCH (n:Organization {name: $name}) RETURN n.name, n.founded, n.industry LIMIT 10

// WRONG: full node object wastes tokens, leaks all properties, agent can't extract fields cleanly
CYPHER 25 MATCH (n:Organization {name: $name}) RETURN n LIMIT 10

Exception: schema/diagnostic queries (CALL db.schema.visualization(), SHOW INDEXES YIELD *, EXPLAIN) where the object is the point.

Validation workflow:

  1. EXPLAIN before any write — catches syntax errors, missing indexes
  2. New read: test with LIMIT 1 first
  3. Write: verify read half as RETURN before replacing with SET/CREATE/DELETE
  4. PROFILE to measure db hits; check for AllNodesScan, CartesianProduct, Eager

Query API v2 (no driver needed — works for schema inspection, EXPLAIN, reads, writes):

curl -X POST https://<instance>.databases.neo4j.io/db/<database>/query/v2 \
  -u <user>:<password> -H "Content-Type: application/json" \
  -d '{"statement": "EXPLAIN MATCH (n:Person {name: $name}) RETURN n", "parameters": {"name": "Alice"}}'
# Local: http://localhost:7474/db/<database>/query/v2
# Response: {"data": {"fields": [...], "values": [...]}}  — prefix EXPLAIN to plan without executing

Write execution gate — only when agent executes (MCP/cypher-shell/HTTP), NOT when generating for code/scripts/user to run:

  1. Run EXPLAIN → report estimated rows affected
  2. Wait for user confirmation before executing

Version Gates

Default to 2025.01-safe features when version unknown.

Feature Min version Fallback
CYPHER 25, QPEs, CALL (x) {} 2025.01 require 2025+
Match modes (DIFFERENT RELATIONSHIPS, REPEATABLE ELEMENTS) 2025.01 require 2025+
Dynamic labels $($expr), coll.sort() 2025.01 APOC or app-side
CONCURRENT TRANSACTIONS, REPORT STATUS 2025.01 drop / omit
SEARCH clause (vector/fulltext) 2026.01 CALL db.index.vector.queryNodes(...) (deprecated 2026.04)
ACYCLIC path mode (no repeated nodes in path) 2026.03 post-filter with size(nodes(p)) = size(apoc.coll.toSet(nodes(p)))
GQL aliases: FOR=UNWIND, PROPERTY_EXISTS=IS NOT NULL, IS [NOT] LABELED=n:Label 2026.03–04 GQL compliance only — use Cypher equivalents

Performance

EXPLAIN/PROFILE red flags: AllNodesScan CartesianProduct NodeByLabelScan Eager

Fix Eager — three approaches (choose simplest that works):

  1. Add specific labels to MATCH nodes to eliminate read/write ambiguity: MATCH (x:CallingPoint) instead of bare MATCH (x) when writing :City nodes
  2. Collect first, then write: WITH collect(u) AS users UNWIND users AS u ...
  3. CALL IN TRANSACTIONS: isolates each batch in its own transaction

Label inference — when planner underestimates selectivity on multi-label queries: [Neo4j 5]

CYPHER inferSchemaParts = most_selective_label
MATCH (admin:Administrator {name: $name}), (resource:Resource {name: $res})
MATCH p=(admin)-[:MEMBER_OF]->()-[:ALLOWED_INHERIT]->(company)
RETURN count(p)

Index anchors: every MATCH/MERGE/WHERE on a property needs an index on the lookup property or Neo4j scans all nodes. Index only activates when the node has a labelMATCH (n {prop: $v}) never uses an index; MATCH (n:Label {prop: $v}) does. MERGE without a constraint has no atomicity guarantee (two concurrent MERGEs can create duplicates). CONTAINS/ENDS WITH → TEXT index (RANGE does not support them). Force a plan with USING INDEX n:Label(prop) when EXPLAIN shows a scan. Chained OPTIONAL MATCH for nested data → replace with COLLECT { MATCH ... RETURN }. Dynamic labels ($($label)) → AllNodesScan+Filter; use static labels when possible.

Full anti-patterns → references/performance.md


Failure Recovery

  • 0 results: check param types, remove WHERE predicates one-by-one, EXPLAIN for index use
  • TypeErrors: use toIntegerOrNull()/toFloatOrNull(); guard with IS NOT NULL
  • Variable out of scope: not listed in WITH → use count(*) not count(droppedVar)
  • Timeouts: fix AllNodesScan → add early LIMITCALL IN TRANSACTIONS OF 1000 ROWS
  • DateTime mismatch: ZONED DATETIME >= date(...) → 0 rows; use datetime() or .year
  • Duration: .inDays/.inMonths don't exist; use .days/.months
  • Cannot merge node using null property value: MERGE key resolved to null — validate params first
  • IndexNotFoundError: SHOW INDEXES YIELD name, state WHERE state <> 'ONLINE'

References

Load on demand:

  • references/indexes.md — index types (RANGE/TEXT/FULLTEXT/POINT/COMPOSITE/LOOKUP), constraints, MERGE lock semantics, fulltext Lucene syntax, import pre-flight
  • references/cypher-syntax.md — full syntax reference: WITH, DELETE, ORDER BY, CASE, null, lists, strings, dates, spatial/point, LOAD CSV, subqueries, QPEs, dynamic labels, SEARCH; conditional CALL (WHEN/THEN/ELSE); label pattern expressions; allReduce; NEXT clause; compact CASE WHEN; normalize(); index/constraint types table; functions annotated with version introduced
  • references/syntax-traps.md — 40+ syntax trap table
  • references/performance.md — anti-patterns, text vs fulltext indexes, Eager (3 fix strategies), label inference, batching best practices, parallel runtime
  • references/advanced-patterns.md — REPEATABLE ELEMENTS patterns, allReduce stateful traversal, multi-stop QPE, route planning simulation, DAG critical path, temporal fraud detection component graph, cycle detection, OPTIONAL CALL
  • references/apoc.md — APOC Core: refactoring, virtual graph, merge helpers, path expanders, triggers, collections, conditional execution

WebFetch

Need URL
Clause semantics https://neo4j.com/docs/cypher-manual/25/clauses/{clause}/
Function signatures https://neo4j.com/docs/cypher-manual/25/functions/{type}/
QPE / paths https://neo4j.com/docs/cypher-manual/25/patterns/
Spatial/point functions https://neo4j.com/docs/cypher-manual/25/functions/spatial/
Index/constraint reference https://neo4j.com/docs/cypher-manual/25/indexes/
Full cheat sheet https://neo4j.com/docs/cypher-cheat-sheet/25/all/

Checklist

  • Schema inspected or confirmed in context
  • CYPHER 25 prefix on every top-level query
  • $parameters used (not literals)
  • LIMIT on exploratory reads (default 25)
  • EXPLAIN run; red flags resolved
  • Write half verified as RETURN before executing
  • Write execution gate applied if agent is executing (not generating)
  • MERGE on constrained key only
  • No label-free MATCH (n)
  • Schema ops not inside explicit transaction
Weekly Installs
136
GitHub Stars
28
First Seen
Today