neo4j-cypher-skill
When to Use
- Writing, optimizing, or debugging Cypher queries
- Graph pattern matching, QPEs, variable-length paths
- Vector/fulltext search, subqueries, batch writes, LOAD CSV
When NOT to Use
- Driver migration/API changes →
neo4j-migration-skill - DB admin (users, config, backups) →
neo4j-cli-tools-skill
GQL conformance note: LET, FINISH, FILTER, and INSERT are valid Cypher 25 clauses (introduced via GQL conformance, mostly in Neo4j 2025.06). On older versions, fall back to WITH / (omit RETURN) / WHERE / CREATE. INSERT requires &-separated multi-labels and does not support dynamic labels/types.
Pre-flight
| ? | Known | Unknown |
|---|---|---|
| Schema | Use directly | Run Schema-First Protocol |
| Neo4j version | Use version features | Default to 2025.01 safe set |
| Executing (not generating)? | Use EXPLAIN + write gate | State query is unvalidated |
Schema unknown + no tool → produce non-executable sketch outside a code block:
(<SOURCE_LABEL> {<KEY>: $value})-[:<REL_TYPE>]->(<TARGET_LABEL>)
Never fill guessed names — realistic guesses get copied blindly.
Defaults — apply every query
CYPHER 25— first token; never repeat afterUNIONor inside subqueries- Schema first — inspect before writing; if schema in prompt, use it directly
MERGEon constrained key only; relMERGEon already-bound endpoints only- Label-free
MATCH (n)forbidden unless bound or followed byWHERE n:$($label) LIMIT 25default on all exploratory reads; pushWITH n LIMITbefore high-cardinality operations (variable-length traversals, fan-out MATCH, Cartesian products)- Comments:
//only —--is SQL, invalid REPEATABLE ELEMENTS/DIFFERENT RELATIONSHIPSgo afterMATCH, not end of patternSHOWcommands:YIELDbeforeWHERE; noUNION- Inline node predicates
(:Label WHERE p=x)— valid inMATCHonly WHEREcannot follow bareUNWIND— useWITH x WHERE(a)-[:R]-(b)— undirected matches both directions, double-counts; use directed unless unknownDETACH DELETE— plainDELETEthrows if node has relationships
Style
| Element | Convention |
|---|---|
| Node labels | PascalCase :Person |
| Rel types | SCREAMING_SNAKE_CASE :KNOWS |
| Properties/vars | camelCase firstName |
| Clauses | UPPERCASE MATCH |
| Booleans/null | lowercase true false null |
| Strings | single-quoted; double only if contains ' |
Schema is truth.
:Person,:KNOWS,namein examples are illustrative — substitute real names from schema.
Schema-First Protocol
Schema in context → use it, skip inspection.
Schema missing → run:
CALL db.schema.visualization() YIELD nodes, relationships RETURN nodes, relationships;
SHOW INDEXES YIELD name, type, labelsOrTypes, properties, state WHERE state = 'ONLINE';
SHOW CONSTRAINTS YIELD name, type, labelsOrTypes, properties;
SHOW PROCEDURES YIELD name RETURN split(name,'.')[0] AS namespace, count(*) AS procedures;
Property types per label — check APOC first:
// If APOC available (preferred — use this):
CALL apoc.meta.schema() YIELD value RETURN value;
// No APOC AND database ≤ 100k nodes/rels only (expensive on large graphs):
CALL db.schema.nodeTypeProperties() YIELD nodeType, propertyName, propertyTypes, mandatory;
CALL db.schema.relTypeProperties() YIELD relType, propertyName, propertyTypes, mandatory;
Validate before returning any query: label exists · rel type+direction correct · property on that label · index ONLINE.
Key Patterns
MERGE
// MERGE on constrained key; set extras in ON CREATE/ON MATCH
CYPHER 25
MATCH (a:Person {id: $a}) MATCH (b:Person {id: $b})
MERGE (a)-[r:KNOWS]->(b)
ON CREATE SET r.since = date()
ON MATCH SET r.lastSeen = date()
SET n = {} replaces all props. SET n += {} merges (safe partial update). Use += for updates.
WITH scope
CYPHER 25
MATCH (a:Person)-[:KNOWS]->(b:Person)
WITH a, count(*) AS friends // b dropped here
WHERE friends > 5
RETURN a.name, friends ORDER BY friends DESC
Every var not listed in WITH is dropped. WITH * carries all forward.
Subqueries — cheat sheet
EXISTS { (a)-[:R]->(b) } // boolean check
COUNT { (a)-[:R]->(b) WHERE a.x > 0 } // count
COLLECT { MATCH (a)-[:R]->(b) RETURN b.name } // collect list (full MATCH+RETURN required)
CALL (p) { MATCH (p)-[:ACTED_IN]->(m) RETURN m } // correlated subquery (explicit import)
OPTIONAL CALL (p) { ... } // nullable subquery
CALL { WITH x ... } deprecated → CALL (x) { ... }. COLLECT {} returns exactly one column.
CALL IN TRANSACTIONS (bulk writes)
CYPHER 25
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
CALL (row) {
MERGE (p:Person {id: row.id}) SET p += row
} IN TRANSACTIONS OF 1000 ROWS ON ERROR CONTINUE REPORT STATUS AS s
Input stream must be outside subquery. Auto-commit only — never wrap in beginTransaction(). PERIODIC COMMIT deprecated.
QPE basics
CYPHER 25
MATCH SHORTEST 1 (a:Person {name:'Alice'})(()-[:KNOWS]->()){1,}(b:Person {name:'Bob'})
RETURN b.name
// ACYCLIC [2026.03] — no repeated nodes within a path (prevents cycles)
CYPHER 25
MATCH p = ACYCLIC (start:Router {name: $from})-[:LINK]-+(end:Router {name: $to})
RETURN [n IN nodes(p) | n.name] AS route
ORDER BY length(p) LIMIT 5
Quantifier outside group: (pattern){N,M}. Groups start+end with node. REPEATABLE ELEMENTS needs bounded {m,n}. ACYCLIC implies nodes cannot repeat within a path (stronger than default DIFFERENT RELATIONSHIPS).
Match mode — add after MATCH:
DIFFERENT RELATIONSHIPS(default) — each rel traversed once per pathREPEATABLE ELEMENTS[2025.x] — nodes/rels revisitable; use for circular routes, weight-optimized paths, constrained backtracking; requires bounded{m,n}
Conditional CALL subqueries [2025.06]
CYPHER 25
MATCH (move:Item {id: $id})
OPTIONAL MATCH (insertBefore:Item {id: $before})
OPTIONAL MATCH (insertAfter:Item {id: $after})
CALL (move, insertBefore, insertAfter) {
WHEN insertBefore IS NULL THEN {
MATCH (last:Item) WHERE NOT (last)-[:NEXT]->() AND last <> move
CREATE (last)-[:NEXT]->(move)
}
WHEN insertAfter IS NULL THEN {
CREATE (move)-[:NEXT]->(insertBefore)
}
ELSE {
CREATE (insertAfter)-[:NEXT]->(move)
CREATE (move)-[:NEXT]->(insertBefore)
}
}
Use WHEN…THEN…ELSE for if-else-if write logic; mutually exclusive (first match wins). Not available pre-2025.06.
Dynamic relationship types [2025.x]
// Create/match/merge with dynamic rel type (must resolve to exactly one STRING)
CYPHER 25 CREATE (a:Node)-[:$($relType)]->(b:Node)
CYPHER 25 MATCH (a:Node)-[:$($relType)]->(b:Node) RETURN a.name, b.name
Spatial / Point
// WGS84 geographic point
SET n.coords = point({longitude: $lon, latitude: $lat})
// Distance in metres; requires POINT index for performance
MATCH (a:Place {name: $origin}) MATCH (b:Place)
RETURN b.name, point.distance(a.coords, b.coords) AS distM
ORDER BY distM LIMIT 10
// Bounding-box pre-filter (uses POINT index) then distance
MATCH (b:Place)
WHERE point.withinBBox(b.coords,
point({longitude: $west, latitude: $south}),
point({longitude: $east, latitude: $north}))
RETURN b.name, point.distance(b.coords, $origin) AS distM
Create POINT index: CREATE POINT INDEX name IF NOT EXISTS FOR (n:Place) ON (n.coords)
Aggregation grouping keys
Non-aggregating expressions in RETURN/WITH are implicit grouping keys — no GROUP BY needed:
// actor + director are grouping keys; count(*) is the aggregate
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name, d.name, count(*) AS collaborations
ORDER BY collaborations DESC
count(n) counts non-null; count(*) counts rows including nulls. collect(DISTINCT expr) deduplicates.
count() is faster than size(collect()) — count() reads the internal store; collect() builds a list first.
Common Syntax Traps (top causes of broken queries)
| Wrong | Right |
|---|---|
ORDER BY n.prop AS x DESC |
ORDER BY n.prop DESC |
ORDER BY preAggVar after agg RETURN |
Use RETURN alias |
count(r WHERE r.x=5) |
sum(CASE WHEN r.x=5 THEN 1 ELSE 0 END) |
UNWIND list AS x WHERE x>5 |
UNWIND list AS x WITH x WHERE x>5 |
least(a,b) / greatest(a,b) |
CASE WHEN a<b THEN a ELSE b END |
-- comment |
// comment |
shortestPath((a)-[*]->(b)) |
SHORTEST 1 (a)(()-[]->()){1,}(b) |
id(n) |
elementId(n) |
[:REL*1..5] |
(()-[:REL]->()){1,5} |
CALL { WITH x ... } |
CALL (x) { ... } |
COLLECT { (a)-[:R]->(b) } |
COLLECT { MATCH ... RETURN b } |
SET n = {k:v} partial update |
SET n += {k:v} |
DELETE n with relationships |
DETACH DELETE n |
WHERE n.x = null |
WHERE n.x IS NULL |
toInteger(null) throws |
toIntegerOrNull(null) |
n.$key dynamic property |
n[$key] |
SET n:$label |
SET n:$($label) |
ZONED DATETIME >= date(...) → 0 rows |
Use datetime(...) or .year accessor |
FOREACH ... RETURN |
UNWIND ... RETURN |
Full trap table → references/syntax-traps.md
Output Mode and Write Gate
Default: parameterized queries. Return named properties, not full nodes or RETURN *.
// RIGHT: agent gets named fields it can reason over
CYPHER 25 MATCH (n:Organization {name: $name}) RETURN n.name, n.founded, n.industry LIMIT 10
// WRONG: full node object wastes tokens, leaks all properties, agent can't extract fields cleanly
CYPHER 25 MATCH (n:Organization {name: $name}) RETURN n LIMIT 10
Exception: schema/diagnostic queries (CALL db.schema.visualization(), SHOW INDEXES YIELD *, EXPLAIN) where the object is the point.
Validation workflow:
EXPLAINbefore any write — catches syntax errors, missing indexes- New read: test with
LIMIT 1first - Write: verify read half as
RETURNbefore replacing withSET/CREATE/DELETE PROFILEto measure db hits; check forAllNodesScan,CartesianProduct,Eager
Query API v2 (no driver needed — works for schema inspection, EXPLAIN, reads, writes):
curl -X POST https://<instance>.databases.neo4j.io/db/<database>/query/v2 \
-u <user>:<password> -H "Content-Type: application/json" \
-d '{"statement": "EXPLAIN MATCH (n:Person {name: $name}) RETURN n", "parameters": {"name": "Alice"}}'
# Local: http://localhost:7474/db/<database>/query/v2
# Response: {"data": {"fields": [...], "values": [...]}} — prefix EXPLAIN to plan without executing
Write execution gate — only when agent executes (MCP/cypher-shell/HTTP), NOT when generating for code/scripts/user to run:
- Run
EXPLAIN→ report estimated rows affected - Wait for user confirmation before executing
Version Gates
Default to 2025.01-safe features when version unknown.
| Feature | Min version | Fallback |
|---|---|---|
CYPHER 25, QPEs, CALL (x) {} |
2025.01 | require 2025+ |
Match modes (DIFFERENT RELATIONSHIPS, REPEATABLE ELEMENTS) |
2025.01 | require 2025+ |
Dynamic labels $($expr), coll.sort() |
2025.01 | APOC or app-side |
CONCURRENT TRANSACTIONS, REPORT STATUS |
2025.01 | drop / omit |
SEARCH clause (vector/fulltext) |
2026.01 | CALL db.index.vector.queryNodes(...) (deprecated 2026.04) |
ACYCLIC path mode (no repeated nodes in path) |
2026.03 | post-filter with size(nodes(p)) = size(apoc.coll.toSet(nodes(p))) |
GQL aliases: FOR=UNWIND, PROPERTY_EXISTS=IS NOT NULL, IS [NOT] LABELED=n:Label |
2026.03–04 | GQL compliance only — use Cypher equivalents |
Performance
EXPLAIN/PROFILE red flags: AllNodesScan CartesianProduct NodeByLabelScan Eager
Fix Eager — three approaches (choose simplest that works):
- Add specific labels to MATCH nodes to eliminate read/write ambiguity:
MATCH (x:CallingPoint)instead of bareMATCH (x)when writing:Citynodes - Collect first, then write:
WITH collect(u) AS users UNWIND users AS u ... - CALL IN TRANSACTIONS: isolates each batch in its own transaction
Label inference — when planner underestimates selectivity on multi-label queries: [Neo4j 5]
CYPHER inferSchemaParts = most_selective_label
MATCH (admin:Administrator {name: $name}), (resource:Resource {name: $res})
MATCH p=(admin)-[:MEMBER_OF]->()-[:ALLOWED_INHERIT]->(company)
RETURN count(p)
Index anchors: every MATCH/MERGE/WHERE on a property needs an index on the lookup property or Neo4j scans all nodes. Index only activates when the node has a label — MATCH (n {prop: $v}) never uses an index; MATCH (n:Label {prop: $v}) does. MERGE without a constraint has no atomicity guarantee (two concurrent MERGEs can create duplicates). CONTAINS/ENDS WITH → TEXT index (RANGE does not support them). Force a plan with USING INDEX n:Label(prop) when EXPLAIN shows a scan.
Chained OPTIONAL MATCH for nested data → replace with COLLECT { MATCH ... RETURN }.
Dynamic labels ($($label)) → AllNodesScan+Filter; use static labels when possible.
Full anti-patterns → references/performance.md
Failure Recovery
- 0 results: check param types, remove WHERE predicates one-by-one, EXPLAIN for index use
- TypeErrors: use
toIntegerOrNull()/toFloatOrNull(); guard withIS NOT NULL - Variable out of scope: not listed in
WITH→ usecount(*)notcount(droppedVar) - Timeouts: fix AllNodesScan → add early
LIMIT→CALL IN TRANSACTIONS OF 1000 ROWS - DateTime mismatch:
ZONED DATETIME >= date(...)→ 0 rows; usedatetime()or.year - Duration:
.inDays/.inMonthsdon't exist; use.days/.months Cannot merge node using null property value: MERGE key resolved to null — validate params firstIndexNotFoundError:SHOW INDEXES YIELD name, state WHERE state <> 'ONLINE'
References
Load on demand:
- references/indexes.md — index types (RANGE/TEXT/FULLTEXT/POINT/COMPOSITE/LOOKUP), constraints, MERGE lock semantics, fulltext Lucene syntax, import pre-flight
- references/cypher-syntax.md — full syntax reference: WITH, DELETE, ORDER BY, CASE, null, lists, strings, dates, spatial/point, LOAD CSV, subqueries, QPEs, dynamic labels, SEARCH; conditional CALL (WHEN/THEN/ELSE); label pattern expressions; allReduce; NEXT clause; compact CASE WHEN; normalize(); index/constraint types table; functions annotated with version introduced
- references/syntax-traps.md — 40+ syntax trap table
- references/performance.md — anti-patterns, text vs fulltext indexes, Eager (3 fix strategies), label inference, batching best practices, parallel runtime
- references/advanced-patterns.md — REPEATABLE ELEMENTS patterns, allReduce stateful traversal, multi-stop QPE, route planning simulation, DAG critical path, temporal fraud detection component graph, cycle detection, OPTIONAL CALL
- references/apoc.md — APOC Core: refactoring, virtual graph, merge helpers, path expanders, triggers, collections, conditional execution
WebFetch
| Need | URL |
|---|---|
| Clause semantics | https://neo4j.com/docs/cypher-manual/25/clauses/{clause}/ |
| Function signatures | https://neo4j.com/docs/cypher-manual/25/functions/{type}/ |
| QPE / paths | https://neo4j.com/docs/cypher-manual/25/patterns/ |
| Spatial/point functions | https://neo4j.com/docs/cypher-manual/25/functions/spatial/ |
| Index/constraint reference | https://neo4j.com/docs/cypher-manual/25/indexes/ |
| Full cheat sheet | https://neo4j.com/docs/cypher-cheat-sheet/25/all/ |
Checklist
- Schema inspected or confirmed in context
-
CYPHER 25prefix on every top-level query -
$parametersused (not literals) -
LIMITon exploratory reads (default 25) -
EXPLAINrun; red flags resolved - Write half verified as
RETURNbefore executing - Write execution gate applied if agent is executing (not generating)
-
MERGEon constrained key only - No label-free
MATCH (n) - Schema ops not inside explicit transaction