Neo4j Getting-Started Skill

Guide a user or agent from zero to a working Neo4j application by executing the 8 stages below in order.

At the start of each stage: read the corresponding ${CLAUDE_SKILL_DIR}/references/<stage-name>.md file and follow its instructions. Only load the stage you are currently executing — not all at once.

"User" means both a human developer and an autonomous coding agent.


When NOT to use this skill

  • Writing or optimizing Cypher queries on an existing project → use neo4j-cypher-skill
  • Upgrading Neo4j drivers or migrating Cypher syntax → use neo4j-migration-skill
  • CLI/admin tasks on an existing database (backup, restore, import) → use neo4j-cli-tools-skill

Project Structure

All generated code, data, scripts, queries, and notebooks must be written to the working directory so the user can inspect, reuse, and re-run them after the session ends. Never generate output only as text in the conversation — always write it to a file.

Organize files into this layout. Create subdirectories before writing files.

.env                    ← DB credentials (gitignored, loaded by python-dotenv)
aura.env                ← Aura API credentials (gitignored, never overwrite)
progress.md             ← stage-by-stage progress (this skill writes it)
requirements.txt        ← Python dependencies

schema/
  schema.json           ← graph model definition
  schema.cypher         ← DDL: constraints + indexes
  reset.cypher          ← wipe all data (keep schema)

data/
  generate.py           ← synthetic data generator  (DATA_SOURCE=synthetic)
  import.py             ← CSV/file importer          (DATA_SOURCE=csv or relational)
  *.csv                 ← any provided or generated data files

queries/
  queries.cypher        ← validated Cypher query library

scripts/
  provision_aura.py     ← Aura provisioning script (generated during provision stage)

notebook.ipynb          ← app artifact (root — standard jupyter convention)
app.py                  ← app artifact (root — streamlit run app.py)
main.py                 ← app artifact (root — uvicorn main:app)
graphrag_app.py         ← app artifact (root)

Root-level files (.env, requirements.txt, app code) stay at root because tooling expects them there. Everything else goes in the appropriate subfolder.
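
The "create subdirectories before writing files" rule can be sketched as follows. This is a minimal illustration that assumes the layout above; the function name `create_layout` is hypothetical and not part of the skill.

```python
from pathlib import Path

# Subdirectories from the layout above; root-level files are written directly.
SUBDIRS = ("schema", "data", "queries", "scripts")


def create_layout(root: str = ".") -> list[Path]:
    """Create the project subdirectories under root and return their paths."""
    created = []
    for name in SUBDIRS:
        path = Path(root) / name
        # exist_ok keeps the call safe when resuming a partially completed run.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_layout()` once at the start of a session is enough; repeated calls are harmless because existing directories are left untouched.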


Progress Tracking

The skill maintains progress.md in the working directory to support resumability.

On startup:

  1. Check if progress.md exists.
  2. If it exists, find the first pending stage:
    grep -B1 "^status: pending" progress.md | grep "^###" | head -1
    
  3. Resume from that stage. Read its context block (the key=value lines beneath the header) to restore DOMAIN, USE_CASE, NEO4J_URI, etc. — do not re-ask the user for information already recorded.
  4. For each completed stage, read every file listed in its files= line before proceeding. These files are the ground truth — do not reconstruct their content from memory.
    • schema/schema.json → re-read before model, load, query, or build stages
    • queries/queries.cypher → re-read before build stage
    • data/generate.py → re-read before import or reset
  5. If progress.md does not exist, start from 0-prerequisites.
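
The startup steps above can also be expressed in Python instead of `grep`. The sketch below assumes the progress.md format shown under "Format:" (a `### <stage>` header, a `status:` line, then `key=value` lines); the function names are illustrative only.

```python
import re
from typing import Optional


def parse_progress(text: str) -> dict[str, dict[str, str]]:
    """Map each '### <stage>' header to its status and key=value context."""
    stages: dict[str, dict[str, str]] = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^###\s+(\S+)", line)
        if header:
            current = header.group(1)
            stages[current] = {}
        elif current is None:
            continue  # title and comment lines before the first stage
        elif line.startswith("status:"):
            stages[current]["status"] = line.split(":", 1)[1].strip()
        elif "=" in line:
            key, value = line.split("=", 1)
            stages[current][key.strip()] = value.strip()
    return stages


def first_pending(stages: dict[str, dict[str, str]]) -> Optional[str]:
    """Return the first stage whose status is 'pending', or None."""
    for name, fields in stages.items():
        if fields.get("status") == "pending":
            return name
    return None
```

When `first_pending` returns a stage name, resume there and restore variables such as DOMAIN or NEO4J_URI from the parsed context of the earlier stages. A missing progress.md is handled by the caller, which starts from 0-prerequisites.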

On stage completion — update (or create) progress.md:

  • If the stage's ### section already exists, update status: pending → status: done and append any new key=value lines.
  • If the section doesn't exist, append it following the format below.

Format:

# Neo4j Getting-Started — Progress
<!-- Resume: grep for "status: pending" to find the next stage -->

### 0-prerequisites
status: done

### 1-context
status: done
DOMAIN=social
USE_CASE=friend recommendations
EXPERIENCE=beginner
DB_TARGET=aura-free
DATA_SOURCE=synthetic
APP_TYPE=notebook
EXEC_METHOD=query-api

### 2-provision
status: done
NEO4J_URI=neo4j+s://abc123.databases.neo4j.io

### 3-model
status: done
labels=Person,Post
relationships=FOLLOWS,POSTED
constraints=2

### 4-load
status: done
nodes=200 Person, 50 Post
relationships=1400 FOLLOWS, 300 POSTED

### 5-explore
status: pending

### 6-query
status: pending

### 7-build
status: pending
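
A minimal sketch of the stage-completion update, assuming the format above. The helper `mark_stage_done` is hypothetical. It keeps any existing key=value lines, adds only keys that are not yet recorded, and appends a new section when the stage has none.

```python
from typing import Optional


def mark_stage_done(text: str, stage: str,
                    context: Optional[dict[str, str]] = None) -> str:
    """Return progress.md content with the given stage marked as done."""
    context = context or {}
    lines = text.splitlines()
    header = f"### {stage}"
    if header in lines:
        start = lines.index(header)
        # The section ends at the next stage header or at end of file.
        end = next((i for i in range(start + 1, len(lines))
                    if lines[i].startswith("### ")), len(lines))
        # Keep existing context lines; drop blanks and the old status line.
        body = [l for l in lines[start + 1:end]
                if l.strip() and not l.startswith("status:")]
        existing = {l.split("=", 1)[0] for l in body if "=" in l}
        body += [f"{k}={v}" for k, v in context.items() if k not in existing]
        tail = lines[end:]
        lines = (lines[:start] + [header, "status: done", *body]
                 + ([""] if tail else []) + tail)
    else:
        if lines and lines[-1].strip():
            lines.append("")  # blank line between sections
        lines += [header, "status: done"]
        lines += [f"{k}={v}" for k, v in context.items()]
    return "\n".join(lines) + "\n"
```

A caller would read progress.md, pass its text through this function, and write the result back, for example `mark_stage_done(text, "2-provision", {"NEO4J_URI": uri})`.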

Execution Protocol

For each stage:

  1. Announce the stage: "## Stage: <name> — <purpose>"
  2. Read ${CLAUDE_SKILL_DIR}/references/<name>.md
  3. Execute the instructions in that file
  4. Verify the stage's completion condition
  5. Update progress.md with status: done and stage-specific context
  6. Proceed to the next stage (HITL: pause for approval first)

If a stage fails, recover using the error guidance in the stage reference file. Do not skip stages unless the skip condition below explicitly permits it.


Stages

Stages run in the numbered order shown. Each depends on the one before it completing successfully (except where a skip condition applies). Read the linked reference file when entering each stage.

0-prerequisites → 1-context → 2-provision → 3-model → 4-load → 5-explore → 6-query → 7-build

Shared capabilities used across multiple stages:

  • Cypher execution: ${CLAUDE_SKILL_DIR}/references/capabilities/execute-cypher.md (3 options; EXEC_METHOD chosen in context)
  • Cypher authoring rules: ${CLAUDE_SKILL_DIR}/references/capabilities/cypher-authoring.md (or defer to neo4j-cypher-authoring-skill)
  • MCP configuration: ${CLAUDE_SKILL_DIR}/references/capabilities/mcp-config.md (used in prerequisites and build)
  • Query validation: ${CLAUDE_SKILL_DIR}/scripts/validate_queries.py — batch-validate all queries in one call (used in query)

0 — prerequisites

Purpose: Verify and install required CLI tools before doing anything else.
Reference: ${CLAUDE_SKILL_DIR}/references/0-prerequisites.md
Completes when: neo4j-mcp binary is reachable; .gitignore has .env entry.
Never skip.
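
The completion condition for this stage could be checked roughly as below. This is a simplified sketch: it treats only a literal `.env` line as a match (patterns such as `*.env` are not recognized), and the function names are illustrative.

```python
import shutil
from pathlib import Path


def gitignore_has_env(gitignore_text: str) -> bool:
    """True if some line of .gitignore is exactly '.env' (ignoring whitespace)."""
    return any(line.strip() == ".env" for line in gitignore_text.splitlines())


def prerequisites_met(workdir: str = ".") -> bool:
    """Check that neo4j-mcp is on PATH and .gitignore lists .env."""
    gitignore = Path(workdir) / ".gitignore"
    has_entry = gitignore.is_file() and gitignore_has_env(gitignore.read_text())
    # shutil.which returns None when the binary is not found on PATH.
    return shutil.which("neo4j-mcp") is not None and has_entry
```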


1 — context

Purpose: Collect domain, use-case, experience, infrastructure target, data source, and output type. Detect EXEC_METHOD for Cypher execution.
Reference: ${CLAUDE_SKILL_DIR}/references/1-context.md
Completes when: DOMAIN, USE_CASE, EXPERIENCE, DB_TARGET, DATA_SOURCE, APP_TYPE, EXEC_METHOD are known.
Skip condition: all variables already provided in conversation context.


2 — provision

Purpose: Provision a running Neo4j database and save credentials to .env.
Reference: ${CLAUDE_SKILL_DIR}/references/2-provision.md
Completes when: .env exists with NEO4J_URI/USERNAME/PASSWORD/DATABASE; connectivity verified.
Skip condition: DB_TARGET=existing → write .env from user credentials, proceed to 3-model.
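
One way to confirm the `.env` half of the completion condition is sketched below. It assumes the shorthand NEO4J_URI/USERNAME/PASSWORD/DATABASE expands to the four variable names in `REQUIRED_KEYS`, and it uses a deliberately simple parser (no `export` prefix, no multi-line values). Connectivity still has to be verified separately.

```python
# Assumed expansion of "NEO4J_URI/USERNAME/PASSWORD/DATABASE".
REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "NEO4J_DATABASE")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Remove surrounding single or double quotes, if any.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_env_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

An empty result from `missing_env_keys` means all four values are recorded.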


3 — model

Purpose: Design or discover a graph data model suited to the use-case.
Reference: ${CLAUDE_SKILL_DIR}/references/3-model.md
Completes when: schema.json and schema.cypher written.
Skip condition: DATA_SOURCE=demo → use demo schema, proceed to 4-load.
HITL checkpoint (HITL mode only — skip entirely in autonomous mode): show model draft, wait for approval.


4 — load

Purpose: Apply schema constraints, then import data (demo, synthetic, CSV, or documents).
Reference: ${CLAUDE_SKILL_DIR}/references/4-load.md
Depends on: 3-model (constraints must exist before import).
Completes when: node count ≥ 50; import scripts written to data/; reset.cypher written.


5 — explore

Purpose: Deliver a visual entry point to the graph — the "it clicks" moment.
Reference: ${CLAUDE_SKILL_DIR}/references/5-explore.md
Completes when: browser URL printed to user, or notebook visualization cell added.
Hard gate — never skip.


6 — query

Purpose: Generate and validate a Cypher query library for the use-case.
Reference: ${CLAUDE_SKILL_DIR}/references/6-query.md
Completes when: queries.cypher has ≥5 queries; ≥2 traversals; ≥3 return results.


7 — build

Purpose: Generate a runnable application, dashboard, notebook, or agent integration.
Reference: ${CLAUDE_SKILL_DIR}/references/7-build.md
Completes when: artifact exists, passes syntax check, returns non-empty use-case results.


Success Gates (all 7 required)

| Gate | Stage | Condition |
|------|-------|-----------|
| db_running | provision | driver.verify_connectivity() succeeds |
| model_valid | model | ≥2 node labels, ≥1 rel type, ≥1 constraint in DB |
| data_present | load | MATCH (n) RETURN count(n) ≥ 50 |
| queries_work | query | ≥5 queries; ≥2 traversals; ≥3 return ≥1 result |
| graph_visible | explore | Browser URL or notebook viz delivered to user |
| app_generated | build | Artifact exists, passes syntax, returns non-empty results |
| integration_ready | build | MCP config or agent framework code present (if requested) |
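
The gate conditions can be written as plain predicates over values collected during the run. The sketch below is illustrative: the `Observed` container and its field names are assumptions, and gathering the values (driver calls, query counts) is left to the stages.

```python
from dataclasses import dataclass


@dataclass
class Observed:
    """Hypothetical container for measurements collected during the run."""
    connectivity_ok: bool
    node_labels: int
    rel_types: int
    constraints: int
    node_count: int
    queries: int
    traversals: int
    queries_with_results: int
    visual_delivered: bool
    artifact_ok: bool
    integration_requested: bool = False
    integration_present: bool = False


def evaluate_gates(o: Observed) -> dict[str, bool]:
    """Apply the thresholds from the gate list to the observed values."""
    return {
        "db_running": o.connectivity_ok,
        "model_valid": o.node_labels >= 2 and o.rel_types >= 1 and o.constraints >= 1,
        "data_present": o.node_count >= 50,
        "queries_work": o.queries >= 5 and o.traversals >= 2
                        and o.queries_with_results >= 3,
        "graph_visible": o.visual_delivered,
        "app_generated": o.artifact_ok,
        # Only enforced when an integration was requested.
        "integration_ready": (not o.integration_requested) or o.integration_present,
    }


def failed_gates(o: Observed) -> list[str]:
    """Names of gates that do not pass, in gate order."""
    return [name for name, ok in evaluate_gates(o).items() if not ok]
```

An empty list from `failed_gates` corresponds to "all gates pass", which is the precondition for the final summary.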

Fast Paths

| Situation | Action |
|-----------|--------|
| DB_TARGET=existing | Skip provision; write .env from user creds; go to model |
| DATA_SOURCE=demo | Skip custom modeling; use demo schema; jump to load |
| DB_TARGET=existing + data present | Skip provision, model, load; introspect schema; go to explore |
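
The fast paths amount to removing stages from the default sequence. A minimal sketch, with a hypothetical `plan_stages` helper; note that "skipping" provision still means writing .env from the user's credentials, and skipping model with demo data still means using the demo schema.

```python
STAGES = [
    "0-prerequisites", "1-context", "2-provision", "3-model",
    "4-load", "5-explore", "6-query", "7-build",
]


def plan_stages(db_target: str, data_source: str,
                data_present: bool = False) -> list[str]:
    """Return the stages to execute, in order, after applying fast paths."""
    skipped = set()
    if db_target == "existing":
        skipped.add("2-provision")
        if data_present:
            # Existing database that already holds data: introspect instead.
            skipped.update({"3-model", "4-load"})
    if data_source == "demo":
        skipped.add("3-model")
    return [stage for stage in STAGES if stage not in skipped]
```

The skip condition for 1-context (all variables already provided) is not modeled here, because these arguments are themselves the output of that stage.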

HITL vs Autonomous Mode

HITL (conversational): pause after model for model review; pause after load for data review.

Autonomous (CI-like, all context provided upfront): never pause for approval at any stage; auto-approve all decisions; proceed immediately through all 8 stages; print browser URL to stdout; target ≤15 min from DB running.

How to detect autonomous mode — check at the start of stage 1:

Autonomous if ANY of the following are true:

  • The initial prompt contains all of: DOMAIN, USE_CASE, EXPERIENCE, DB_TARGET, DATA_SOURCE, APP_TYPE (or equivalent phrasing like "Domain: X, use-case: Y, ...")
  • The session was started with --auto-approve or similar non-interactive flag
  • All context variables are already recorded in progress.md (resuming an autonomous run)

HITL if: the user opened a fresh conversation without providing full context upfront.
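
The detection rule can be sketched as a single function. It assumes the context variables have already been extracted into dictionaries (recognizing "equivalent phrasing" in a prompt is out of scope here) and that the same six variables are what "all context variables" means for progress.md; both are assumptions, as is the name `detect_mode`.

```python
from typing import Optional

REQUIRED_CONTEXT = (
    "DOMAIN", "USE_CASE", "EXPERIENCE", "DB_TARGET", "DATA_SOURCE", "APP_TYPE",
)


def _complete(context: Optional[dict[str, str]]) -> bool:
    """True when every required variable is present and non-empty."""
    return context is not None and all(context.get(k) for k in REQUIRED_CONTEXT)


def detect_mode(prompt_context: Optional[dict[str, str]],
                non_interactive: bool = False,
                progress_context: Optional[dict[str, str]] = None) -> str:
    """Return 'autonomous' if any of the three conditions holds, else 'hitl'."""
    if _complete(prompt_context) or non_interactive or _complete(progress_context):
        return "autonomous"
    return "hitl"
```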

In autonomous mode: every HITL checkpoint in every stage reference file is automatically skipped. Do not ask for approval. Do not say "does this look right?" Do not pause. Continue to the next step immediately.


Final Summary (deliver after all gates pass)

Step 1 — write README.md to the working directory using the template below. Fill in every <placeholder> from progress.md and the actual generated files. This is a required output — do not skip it.

IMPORTANT — portable commands: All re-run commands in README.md MUST use python3 (never an absolute path like /opt/homebrew/bin/python3.14 or /usr/local/bin/python3). The README is shared with others who have different Python installs.
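
A small post-processing step can enforce the portable-command rule before README.md is written. The sketch below is one possible approach, not part of the skill: it rewrites any absolute path that ends in `python3` or `python3.<minor>` to plain `python3`, and leaves relative paths such as `.venv/bin/activate` alone.

```python
import re

# Absolute path ending in python3 or python3.<minor>,
# e.g. /opt/homebrew/bin/python3.14 or /usr/local/bin/python3
_ABS_PYTHON = re.compile(r"(?<![\w.])/(?:[\w.+-]+/)*python3(?:\.\d+)?(?![\w.-])")


def make_portable(command_text: str) -> str:
    """Replace absolute Python interpreter paths with plain 'python3'."""
    return _ABS_PYTHON.sub("python3", command_text)


def has_absolute_python(command_text: str) -> bool:
    """True if the text still contains an absolute Python interpreter path."""
    return _ABS_PYTHON.search(command_text) is not None
```

Running `has_absolute_python` on the final README text gives a quick check that no machine-specific interpreter path slipped through.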

# <DOMAIN> Graph — <USE_CASE>

A synthetic <DOMAIN> graph built with Neo4j, covering <USE_CASE>.
Generated by the neo4j-getting-started-skill on <date>.

## What's in the graph

| Label | Count | Description |
|-------|-------|-------------|
| <Label> | <N> | <one line> |

**Relationships:** <TYPE1>, <TYPE2>, ...  
**Constraints:** <N> uniqueness constraints applied

## Explore visually

Open in Neo4j Browser:
<browser_url>

Use `NEO4J_PASSWORD` from `.env` to connect, then run:
```cypher
// Starter query: returns a sample of up to 50 relationships
MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 50
```

## Files

| File | Purpose | Re-run |
|------|---------|--------|
| `schema/schema.json` | Graph model | |
| `schema/schema.cypher` | Constraints + indexes | `source .env && cypher-shell -a $NEO4J_URI -u $NEO4J_USERNAME -p $NEO4J_PASSWORD --file schema/schema.cypher` |
| `schema/reset.cypher` | Wipe data, keep schema | `source .env && cypher-shell -a $NEO4J_URI -u $NEO4J_USERNAME -p $NEO4J_PASSWORD --file schema/reset.cypher` |
| `data/generate.py` | Regenerate synthetic data | `source .venv/bin/activate && python3 data/generate.py` |
| `data/import.py` | Re-import CSVs into Neo4j | `source .venv/bin/activate && python3 data/import.py` |
| `queries/queries.cypher` | Query library | Paste into Neo4j Browser |
| `<artifact>` | | `<run command>` |
| `requirements.txt` | Python dependencies | `source .venv/bin/activate && pip install -r requirements.txt` |

(Omit data/generate.py row when DATA_SOURCE=csv; omit data/import.py row when DATA_SOURCE=synthetic.)

## Run the app

python3 -m venv .venv     # skip if .venv already exists
source .venv/bin/activate
pip install -r requirements.txt
<run command>

<For FastAPI only — include this section:> Open http://localhost:8000/docs for the interactive API docs.

<For MCP integration — include this section when APP_TYPE includes mcp:>

## MCP integration

To query your graph directly from Claude:

Claude Code — copy mcp-claude-code.json into .claude/settings.json:

cp mcp-claude-code.json .claude/settings.json

Then reload Claude Code (/reload or restart). Ask: "What node labels are in my Neo4j database?"

Claude Desktop — merge mcp-claude-desktop.json into ~/Library/Application Support/Claude/claude_desktop_config.json, then restart Claude Desktop.

Available MCP tools: get-schema, read-cypher, write-cypher.

## Reset and reload

source .env
cypher-shell -a $NEO4J_URI -u $NEO4J_USERNAME -p $NEO4J_PASSWORD --file schema/reset.cypher
source .venv/bin/activate
python3 data/generate.py   # or skip if using your own CSVs
python3 data/import.py

## Sample queries

// <use-case-specific query 1 — fill in from queries/queries.cypher>
<query>

// <use-case-specific query 2>
<query>

(Cypher comments use //, not --.)

## Next steps

  • Explore GraphAcademy to deepen your Neo4j knowledge
  • Edit data/*.csv to change the dataset, then re-run data/import.py
  • Extend the model: add new node labels or relationship types in schema/schema.json

Step 2: print the following summary to the conversation.

✓ Neo4j Getting-Started — Complete

Database: <NEO4J_URI>
Browser:  https://browser.neo4j.io/?connectURL=

── What was generated (keep these files) ───────────────────────
schema/schema.json      Graph model definition
schema/schema.cypher    Re-apply constraints/indexes: cypher-shell ... --file schema/schema.cypher
schema/reset.cypher     Wipe data, keep schema: cypher-shell ... --file schema/reset.cypher
data/generate.py        Regenerate synthetic data: source .venv/bin/activate && python3 data/generate.py
data/*.csv              Source data files: edit to change the dataset
data/import.py          Re-import from CSVs: source .venv/bin/activate && python3 data/import.py
queries/queries.cypher  Query library: paste into Neo4j Browser or run with cypher-shell
requirements.txt        Install deps: source .venv/bin/activate && pip install -r requirements.txt

── Gates ───────────────────────────────────────────────────────
db_running ✓  model_valid ✓  data_present ✓  queries_work ✓  graph_visible ✓  app_generated ✓  integration_ready ✓/–

── Next steps ──────────────────────────────────────────────────

  • Explore: open the Browser URL → run MATCH (n)-[r]->(m) RETURN n,r,m LIMIT 50
  • Iterate: edit data/*.csv → source .venv/bin/activate && python3 data/import.py (reset first)
  • Learn: https://graphacademy.neo4j.com

Omit lines that don't apply (e.g. omit `data/import.py` when `DATA_SOURCE=synthetic`,
omit `data/generate.py` when `DATA_SOURCE=csv`).