neo4j-getting-started-skill
# Neo4j Getting-Started Skill
Guide a user or agent from zero to a working Neo4j application by executing the 8 stages below in order.
At the start of each stage: read the corresponding `${CLAUDE_SKILL_DIR}/references/<stage-name>.md` file and follow its instructions. Only load the stage you are currently executing — not all at once.
"User" means both a human developer and an autonomous coding agent.
## When NOT to use this skill

- Writing or optimizing Cypher queries on an existing project → use `neo4j-cypher-skill`
- Upgrading Neo4j drivers or migrating Cypher syntax → use `neo4j-migration-skill`
- CLI/admin tasks on an existing database (backup, restore, import) → use `neo4j-cli-tools-skill`
## Project Structure
All generated code, data, scripts, queries, and notebooks must be written to the working directory so the user can inspect, reuse, and re-run them after the session ends. Never generate output only as text in the conversation — always write it to a file.
Organize files into this layout. Create subdirectories before writing files.
```
.env                  ← DB credentials (gitignored, loaded by python-dotenv)
aura.env              ← Aura API credentials (gitignored, never overwrite)
progress.md           ← stage-by-stage progress (this skill writes it)
requirements.txt      ← Python dependencies
schema/
  schema.json         ← graph model definition
  schema.cypher       ← DDL: constraints + indexes
  reset.cypher        ← wipe all data (keep schema)
data/
  generate.py         ← synthetic data generator (DATA_SOURCE=synthetic)
  import.py           ← CSV/file importer (DATA_SOURCE=csv or relational)
  *.csv               ← any provided or generated data files
queries/
  queries.cypher      ← validated Cypher query library
scripts/
  provision_aura.py   ← Aura provisioning script (generated during provision stage)
notebook.ipynb        ← app artifact (root — standard Jupyter convention)
app.py                ← app artifact (root — streamlit run app.py)
main.py               ← app artifact (root — uvicorn main:app)
graphrag_app.py       ← app artifact (root)
```
Root-level files (.env, requirements.txt, app code) stay at root because tooling expects them there. Everything else goes in the appropriate subfolder.
## Progress Tracking
The skill maintains progress.md in the working directory to support resumability.
On startup:

- Check if `progress.md` exists.
- If it exists, find the first pending stage:
  `grep -B1 "^status: pending" progress.md | grep "^###" | head -1`
- Resume from that stage. Read its context block (the key=value lines beneath the header) to restore `DOMAIN`, `USE_CASE`, `NEO4J_URI`, etc. — do not re-ask the user for information already recorded.
- For each completed stage, read every file listed in its `files=` line before proceeding. These files are the ground truth — do not reconstruct their content from memory.
  - `schema/schema.json` → re-read before model, load, query, or build stages
  - `queries/queries.cypher` → re-read before build stage
  - `data/generate.py` → re-read before import or reset
- If `progress.md` does not exist, start from `0-prerequisites`.
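The resume logic above can be sketched in Python. This is an illustration only, not part of the skill's reference files; the `first_pending_stage` helper name is hypothetical:

```python
def first_pending_stage(progress_text: str):
    """Return (stage, context) for the first 'status: pending' stage.

    context accumulates the key=value lines of every stage read so far,
    mirroring the grep-based resume command above.
    """
    context, current = {}, None
    for line in progress_text.splitlines():
        if line.startswith("### "):
            current = line[4:].strip()
        elif line.startswith("status: pending") and current:
            return current, context
        elif current and "=" in line and not line.startswith(("#", "<")):
            key, _, value = line.partition("=")
            context[key.strip()] = value.strip()
    return None, context  # nothing pending: run is complete (or file is empty)

sample = "### 1-context\nstatus: done\nDOMAIN=social\nAPP_TYPE=notebook\n### 2-provision\nstatus: pending\n"
print(first_pending_stage(sample))  # ('2-provision', {'DOMAIN': 'social', 'APP_TYPE': 'notebook'})
```

Restoring context from the parsed dict, rather than from memory, is what makes a resumed run match the original one.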
On stage completion — update (or create) `progress.md`:

- If the stage's `###` section already exists, update `status: pending` → `status: done` and append any new key=value lines.
- If the section doesn't exist, append it following the format below.
Format:

```markdown
# Neo4j Getting-Started — Progress
<!-- Resume: grep for "status: pending" to find the next stage -->

### 0-prerequisites
status: done

### 1-context
status: done
DOMAIN=social
USE_CASE=friend recommendations
EXPERIENCE=beginner
DB_TARGET=aura-free
DATA_SOURCE=synthetic
APP_TYPE=notebook
EXEC_METHOD=query-api

### 2-provision
status: done
NEO4J_URI=neo4j+s://abc123.databases.neo4j.io

### 3-model
status: done
labels=Person,Post
relationships=FOLLOWS,POSTED
constraints=2

### 4-load
status: done
nodes=200 Person, 50 Post
relationships=1400 FOLLOWS, 300 POSTED

### 5-explore
status: pending

### 6-query
status: pending

### 7-build
status: pending
```
## Execution Protocol

For each stage:

- Announce the stage: `## Stage: <name> — <purpose>`
- Read `${CLAUDE_SKILL_DIR}/references/<name>.md`
- Execute the instructions in that file
- Verify the stage's completion condition
- Update `progress.md` with `status: done` and stage-specific context
- Proceed to the next stage (HITL: pause for approval first)

If a stage fails, recover using the error guidance in the stage reference file. Do not skip stages unless the skip condition below explicitly permits it.
## Stages
Stages run in the numbered order shown. Each depends on the one before it completing successfully (except where a skip condition applies). Read the linked reference file when entering each stage.
0-prerequisites → 1-context → 2-provision → 3-model → 4-load → 5-explore → 6-query → 7-build
Shared capabilities used across multiple stages:

- Cypher execution: `${CLAUDE_SKILL_DIR}/references/capabilities/execute-cypher.md` (3 options; `EXEC_METHOD` chosen in `context`)
- Cypher authoring rules: `${CLAUDE_SKILL_DIR}/references/capabilities/cypher-authoring.md` (or defer to `neo4j-cypher-authoring-skill`)
- MCP configuration: `${CLAUDE_SKILL_DIR}/references/capabilities/mcp-config.md` (used in `prerequisites` and `build`)
- Query validation: `${CLAUDE_SKILL_DIR}/scripts/validate_queries.py` — batch-validate all queries in one call (used in `query`)
### 0 — prerequisites

Purpose: Verify and install required CLI tools before doing anything else.
Reference: `${CLAUDE_SKILL_DIR}/references/0-prerequisites.md`
Completes when: the `neo4j-mcp` binary is reachable; `.gitignore` has a `.env` entry.
Never skip.
### 1 — context

Purpose: Collect domain, use-case, experience, infrastructure target, data source, and output type. Detect `EXEC_METHOD` for Cypher execution.
Reference: `${CLAUDE_SKILL_DIR}/references/1-context.md`
Completes when: `DOMAIN`, `USE_CASE`, `EXPERIENCE`, `DB_TARGET`, `DATA_SOURCE`, `APP_TYPE`, `EXEC_METHOD` are known.
Skip condition: all variables already provided in conversation context.
### 2 — provision

Purpose: Provision a running Neo4j database and save credentials to `.env`.
Reference: `${CLAUDE_SKILL_DIR}/references/2-provision.md`
Completes when: `.env` exists with `NEO4J_URI`/`USERNAME`/`PASSWORD`/`DATABASE`; connectivity verified.
Skip condition: `DB_TARGET=existing` → write `.env` from user credentials, proceed to 3-model.
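A minimal sketch of the `.env` contract this stage must satisfy. The `write_env` helper is illustrative; the commented-out connectivity check assumes the official `neo4j` Python driver (5.x), whose `verify_connectivity()` method the `db_running` gate names:

```python
REQUIRED = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "NEO4J_DATABASE")

def write_env(path: str, creds: dict) -> None:
    """Write KEY=value lines for all required credentials; fail loudly if any are missing."""
    missing = [k for k in REQUIRED if k not in creds]
    if missing:
        raise ValueError(f"incomplete credentials: {missing}")
    with open(path, "w") as f:
        f.write("\n".join(f"{k}={creds[k]}" for k in REQUIRED) + "\n")

# Connectivity half of the gate (needs the `neo4j` package and a live database):
# from neo4j import GraphDatabase
# with GraphDatabase.driver(uri, auth=(user, password)) as driver:
#     driver.verify_connectivity()
```

Failing fast on missing keys prevents a half-written `.env` from silently breaking every later stage that sources it.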
### 3 — model

Purpose: Design or discover a graph data model suited to the use-case.
Reference: `${CLAUDE_SKILL_DIR}/references/3-model.md`
Completes when: `schema.json` and `schema.cypher` are written.
Skip condition: `DATA_SOURCE=demo` → use demo schema, proceed to 4-load.
HITL checkpoint (HITL mode only — skip entirely in autonomous mode): show model draft, wait for approval.
### 4 — load

Purpose: Apply schema constraints, then import data (demo, synthetic, CSV, or documents).
Reference: `${CLAUDE_SKILL_DIR}/references/4-load.md`
Depends on: 3-model (constraints must exist before import).
Completes when: node count ≥ 50; import scripts (`data/`) written; `reset.cypher` written.
### 5 — explore

Purpose: Deliver a visual entry point to the graph — the "it clicks" moment.
Reference: `${CLAUDE_SKILL_DIR}/references/5-explore.md`
Completes when: browser URL printed to user, or notebook visualization cell added.
Hard gate — never skip.
### 6 — query

Purpose: Generate and validate a Cypher query library for the use-case.
Reference: `${CLAUDE_SKILL_DIR}/references/6-query.md`
Completes when: `queries.cypher` has ≥5 queries; ≥2 are traversals; ≥3 return results.
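The thresholds above can be pre-checked with a rough statement counter before running the full `validate_queries.py` batch. This is a string heuristic (split on `;`, treat a relationship pattern as a traversal), not a Cypher parser, and the helper name is illustrative:

```python
def query_stats(cypher_text: str) -> dict:
    """Rough counts: split on ';', flag statements containing a relationship pattern."""
    statements = [s.strip() for s in cypher_text.split(";") if s.strip()]
    traversals = [s for s in statements if "-[" in s]
    return {"queries": len(statements), "traversals": len(traversals)}

library = """
MATCH (p:Person) RETURN p LIMIT 10;
MATCH (a:Person)-[:FOLLOWS]->(b:Person) RETURN a, b;
MATCH (p:Person)-[:POSTED]->(post:Post) RETURN p, post;
"""
stats = query_stats(library)
print(stats)  # {'queries': 3, 'traversals': 2}
```

Whether each query returns ≥1 result still has to be verified against the live database; no static check can confirm that.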
### 7 — build

Purpose: Generate a runnable application, dashboard, notebook, or agent integration.
Reference: `${CLAUDE_SKILL_DIR}/references/7-build.md`
Completes when: artifact exists, passes syntax check, returns non-empty use-case results.
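For a Python artifact (`app.py`, `main.py`, `graphrag_app.py`), the "passes syntax check" part of this condition can be a one-call `ast.parse`; notebooks and other artifact types need their own checks. A sketch with an illustrative helper name:

```python
import ast
import tempfile
from pathlib import Path

def passes_syntax_check(artifact: Path) -> bool:
    """The 'passes syntax check' half of the gate, for a Python artifact."""
    try:
        ast.parse(artifact.read_text())
        return True
    except SyntaxError:
        return False

artifact = Path(tempfile.mkstemp(suffix=".py")[1])
artifact.write_text("def recommend(user):\n    return []\n")
print(passes_syntax_check(artifact))  # True
```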
## Success Gates (all 7 required)

| Gate | Stage | Condition |
|---|---|---|
| `db_running` | provision | `driver.verify_connectivity()` succeeds |
| `model_valid` | model | ≥2 node labels, ≥1 rel type, ≥1 constraint in DB |
| `data_present` | load | `MATCH (n) RETURN count(n)` ≥ 50 |
| `queries_work` | query | ≥5 queries; ≥2 traversals; ≥3 return ≥1 result |
| `graph_visible` | explore | Browser URL or notebook viz delivered to user |
| `app_generated` | build | Artifact exists, passes syntax, returns non-empty results |
| `integration_ready` | build | MCP config or agent framework code present (if requested) |
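The "all 7 required" rule, with `integration_ready` conditional on the user having requested integration, can be expressed as a small aggregation (a sketch; the function name is illustrative):

```python
GATES = ["db_running", "model_valid", "data_present", "queries_work",
         "graph_visible", "app_generated", "integration_ready"]

def all_gates_pass(results: dict, integration_requested: bool = True) -> bool:
    """integration_ready is only required when MCP/agent integration was requested."""
    required = [g for g in GATES if g != "integration_ready" or integration_requested]
    return all(results.get(g, False) for g in required)

results = {g: True for g in GATES[:-1]}  # integration_ready never evaluated
print(all_gates_pass(results, integration_requested=False))  # True
print(all_gates_pass(results, integration_requested=True))   # False
```

Using `results.get(g, False)` means a gate that was never evaluated counts as failed, which is the safe default before printing the final summary.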
## Fast Paths

| Situation | Action |
|---|---|
| `DB_TARGET=existing` | Skip provision; write `.env` from user creds; go to model |
| `DATA_SOURCE=demo` | Skip custom modeling; use demo schema; jump to load |
| `DB_TARGET=existing` + data present | Skip provision, model, load; introspect schema; go to explore |
## HITL vs Autonomous Mode

HITL (conversational): pause after model for model review; pause after load for data review.

Autonomous (CI-like, all context provided upfront): never pause for approval at any stage; auto-approve all decisions; proceed immediately through all 8 stages; print the browser URL to stdout; target ≤15 min from DB running.
How to detect autonomous mode — check at the start of stage 1.

Autonomous if ANY of the following are true:

- The initial prompt contains all of `DOMAIN`, `USE_CASE`, `EXPERIENCE`, `DB_TARGET`, `DATA_SOURCE`, `APP_TYPE` (or equivalent phrasing like "Domain: X, use-case: Y, ...")
- The session was started with `--auto-approve` or a similar non-interactive flag
- All context variables are already recorded in `progress.md` (resuming an autonomous run)

HITL if: the user opened a fresh conversation without providing full context upfront.
In autonomous mode: every HITL checkpoint in every stage reference file is automatically skipped. Do not ask for approval. Do not say "does this look right?" Do not pause. Continue to the next step immediately.
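The three triggers above can be sketched as a detection function. This is a literal-substring heuristic (illustrative name, case-folded matching); the "equivalent phrasing" case would need fuzzier matching than shown here:

```python
REQUIRED_VARS = ("DOMAIN", "USE_CASE", "EXPERIENCE",
                 "DB_TARGET", "DATA_SOURCE", "APP_TYPE")

def is_autonomous(initial_prompt: str, cli_flags=(), progress_context=None) -> bool:
    """Autonomous if ANY trigger fires; otherwise default to HITL."""
    prompt = initial_prompt.upper()
    if all(var in prompt for var in REQUIRED_VARS):
        return True
    if "--auto-approve" in cli_flags:
        return True
    if progress_context and all(v in progress_context for v in REQUIRED_VARS):
        return True
    return False

print(is_autonomous("DOMAIN=retail USE_CASE=recs EXPERIENCE=beginner "
                    "DB_TARGET=aura-free DATA_SOURCE=csv APP_TYPE=notebook"))  # True
print(is_autonomous("help me build a graph app"))  # False
```

Defaulting to HITL on ambiguity is the safer failure mode: an unnecessary pause costs seconds, while a wrongly auto-approved schema costs a rebuild.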
## Final Summary (deliver after all gates pass)
**Step 1 — write `README.md`** to the working directory using the template below.
Fill in every <placeholder> from progress.md and the actual generated files.
This is a required output — do not skip it.
IMPORTANT — portable commands: all re-run commands in README.md MUST use `python3` (never an absolute path like `/opt/homebrew/bin/python3.14` or `/usr/local/bin/python3`). The README is shared with others who have different Python installs.
# <DOMAIN> Graph — <USE_CASE>
A synthetic <DOMAIN> graph built with Neo4j, covering <USE_CASE>.
Generated by the neo4j-getting-started-skill on <date>.
## What's in the graph
| Label | Count | Description |
|-------|-------|-------------|
| <Label> | <N> | <one line> |
**Relationships:** <TYPE1>, <TYPE2>, ...
**Constraints:** <N> uniqueness constraints applied
## Explore visually
Open in Neo4j Browser:
<browser_url>
Use `NEO4J_PASSWORD` from `.env` to connect, then run:
```cypher
// Starter query — shows the full graph sample
MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 50
```
## Files

| File | Purpose | Re-run |
|---|---|---|
| `schema/schema.json` | Graph model | — |
| `schema/schema.cypher` | Constraints + indexes | `source .env && cypher-shell -a $NEO4J_URI -u $NEO4J_USERNAME -p $NEO4J_PASSWORD --file schema/schema.cypher` |
| `schema/reset.cypher` | Wipe data, keep schema | `source .env && cypher-shell -a $NEO4J_URI -u $NEO4J_USERNAME -p $NEO4J_PASSWORD --file schema/reset.cypher` |
| `data/generate.py` | Regenerate synthetic data | `source .venv/bin/activate && python3 data/generate.py` |
| `data/import.py` | Re-import CSVs into Neo4j | `source .venv/bin/activate && python3 data/import.py` |
| `queries/queries.cypher` | Query library | Paste into Neo4j Browser |
| `<artifact>` | App artifact | `<run command>` |
| `requirements.txt` | Python dependencies | `source .venv/bin/activate && pip install -r requirements.txt` |

(Omit the `data/generate.py` row when `DATA_SOURCE=csv`; omit the `data/import.py` row when `DATA_SOURCE=synthetic`.)
## Run the app

```bash
python3 -m venv .venv              # skip if .venv already exists
source .venv/bin/activate
pip install -r requirements.txt
<run command>
```
<For FastAPI only — include this section:> Open http://localhost:8000/docs for the interactive API docs.

<For MCP integration — include this section when APP_TYPE includes mcp:>

## MCP integration

To query your graph directly from Claude:

**Claude Code** — copy `mcp-claude-code.json` into `.claude/settings.json`:

```bash
cp mcp-claude-code.json .claude/settings.json
```

Then reload Claude Code (`/reload` or restart). Ask: "What node labels are in my Neo4j database?"

**Claude Desktop** — merge `mcp-claude-desktop.json` into `~/Library/Application Support/Claude/claude_desktop_config.json`, then restart Claude Desktop.

Available MCP tools: `get-schema`, `read-cypher`, `write-cypher`.
## Reset and reload

```bash
source .env
cypher-shell -a $NEO4J_URI -u $NEO4J_USERNAME -p $NEO4J_PASSWORD --file schema/reset.cypher
source .venv/bin/activate
python3 data/generate.py   # or skip if using your own CSVs
python3 data/import.py
```
## Sample queries

```cypher
// <use-case-specific query 1 — fill in from queries/queries.cypher>
<query>

// <use-case-specific query 2>
<query>
```

(Cypher comments use `//`, not `--`.)
## Next steps

- Explore GraphAcademy to deepen your Neo4j knowledge
- Edit `data/*.csv` to change the dataset, then re-run `data/import.py`
- Extend the model: add new node labels or relationship types in `schema/schema.json`
**Step 2 — print this to the conversation:**
```
✓ Neo4j Getting-Started — Complete

Database: <NEO4J_URI>
Browser:  https://browser.neo4j.io/?connectURL=

── What was generated (keep these files) ───────────────────────
schema/schema.json       Graph model definition
schema/schema.cypher     Re-apply constraints/indexes: cypher-shell ... --file schema/schema.cypher
schema/reset.cypher      Wipe data, keep schema: cypher-shell ... --file schema/reset.cypher
data/generate.py         Regenerate synthetic data: source .venv/bin/activate && python3 data/generate.py
data/*.csv               Source data files — edit to change the dataset
data/import.py           Re-import from CSVs: source .venv/bin/activate && python3 data/import.py
queries/queries.cypher   Query library — paste into Neo4j Browser or run with cypher-shell
requirements.txt         Install deps: source .venv/bin/activate && pip install -r requirements.txt

── Gates ───────────────────────────────────────────────────────
db_running ✓  model_valid ✓  data_present ✓  queries_work ✓
graph_visible ✓  app_generated ✓  integration_ready ✓/–

── Next steps ──────────────────────────────────────────────────
- Explore: open the Browser URL → run MATCH (n)-[r]->(m) RETURN n,r,m LIMIT 50
- Iterate: edit data/*.csv → source .venv/bin/activate && python3 data/import.py (reset first)
- Learn: https://graphacademy.neo4j.com
```
Omit lines that don't apply (e.g. omit `data/import.py` when `DATA_SOURCE=synthetic`,
omit `data/generate.py` when `DATA_SOURCE=csv`).