Neo4j GraphRAG Skill
Status: Draft / WIP — Content is a placeholder. Reference files for retrieval patterns to be added.
When to Use
- Building GraphRAG retrieval pipelines with the `neo4j-graphrag` Python package
- Choosing between VectorRetriever, VectorCypherRetriever, HybridCypherRetriever
- Writing `retrieval_query` Cypher fragments that traverse the graph after vector lookup
- Constructing a knowledge graph from documents with `SimpleKGPipeline`
- Integrating Neo4j with LangChain (`langchain-neo4j`), LlamaIndex, or Haystack
- Debugging low retrieval quality (when to use graph traversal vs plain vector)
When NOT to Use
- Plain vector/semantic search without graph traversal → use `neo4j-vector-search-skill`
- GDS algorithms (PageRank, Louvain, embeddings) → use `neo4j-gds-skill`
- Agent long-term memory → use `neo4j-agent-memory-skill`
- Document chunking + loading only → use `neo4j-document-import-skill`
Retriever Selection
Question involves multi-hop, co-occurrence, or relational reasoning?
→ YES: HybridCypherRetriever (best) or VectorCypherRetriever
→ NO: HybridRetriever (keyword + semantic) or VectorRetriever (baseline)
Have fulltext index? YES → include Hybrid variants (better recall)
Need graph context after retrieval? YES → include Cypher variants
| Retriever | Vector | Fulltext | Graph traversal | When to use |
|---|---|---|---|---|
| `VectorRetriever` | yes | no | no | Baseline, quick start |
| `HybridRetriever` | yes | yes | no | Better recall, no graph |
| `VectorCypherRetriever` | yes | no | yes | GraphRAG without fulltext |
| `HybridCypherRetriever` | yes | yes | yes | Production GraphRAG |
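The decision rules and table above can be condensed into a small helper. This is only an illustrative sketch: `choose_retriever` and its two boolean parameters are hypothetical names and are not part of `neo4j-graphrag`. Only the returned class names come from the package.

```python
def choose_retriever(needs_graph_context: bool, has_fulltext_index: bool) -> str:
    """Return the retriever class name suggested by the selection table.

    Hypothetical helper: it only encodes the table and does not import
    or instantiate anything from neo4j_graphrag.
    """
    if needs_graph_context:
        # Cypher variants run a retrieval_query after the index lookup.
        return "HybridCypherRetriever" if has_fulltext_index else "VectorCypherRetriever"
    # No traversal needed: choose between hybrid and plain vector search.
    return "HybridRetriever" if has_fulltext_index else "VectorRetriever"
```

For example, `choose_retriever(True, False)` gives `"VectorCypherRetriever"`, matching the "GraphRAG without fulltext" row.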
Package Name
pip install neo4j-graphrag openai # or any supported LLM/embedder
# IMPORTANT: old package was `neo4j-genai` — uninstall it if present
# pip uninstall neo4j-genai && pip install neo4j-graphrag
# Import paths changed: neo4j_graphrag.retrievers (not neo4j_genai.retrievers)
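To confirm which of the two packages is present in an environment, something like the following sketch may help. It uses only the standard library; `installed_version` and `check_graphrag_install` are hypothetical helper names, and the warning strings are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_graphrag_install() -> list:
    """Collect human-readable warnings about the package rename."""
    warnings = []
    if installed_version("neo4j-genai") is not None:
        warnings.append("neo4j-genai is installed; uninstall it")
    if installed_version("neo4j-graphrag") is None:
        warnings.append("neo4j-graphrag is not installed")
    return warnings
```

An empty list from `check_graphrag_install()` means the old package is gone and the new one is available.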
Prerequisites (run once before ingesting)
-- Fulltext index (required for Hybrid retrievers)
CREATE FULLTEXT INDEX chunk_fulltext IF NOT EXISTS
FOR (c:Chunk) ON EACH [c.text];
-- Vector index (required for all retrievers)
CREATE VECTOR INDEX chunk_embedding IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS { indexConfig: { `vector.dimensions`: 1536, `vector.similarity_function`: 'cosine' } };
-- Confirm indexes are ONLINE before ingesting:
SHOW INDEXES YIELD name, state WHERE name IN ['chunk_fulltext','chunk_embedding']
RETURN name, state; -- must be 'ONLINE'
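The ONLINE check can also be automated before an ingestion job starts. The sketch below works on plain rows shaped like the `SHOW INDEXES` output above, so it can be tried without a database. `indexes_not_online` is a hypothetical helper, and the `"MISSING"` state is a label made up for this example, not a Neo4j index state.

```python
def indexes_not_online(rows, required=("chunk_fulltext", "chunk_embedding")):
    """Return {index_name: state} for required indexes that are not ONLINE.

    `rows` is an iterable of mappings with `name` and `state` keys, i.e. the
    shape returned by the SHOW INDEXES query above. Indexes that do not
    appear at all are reported with the made-up state "MISSING".
    """
    states = {row["name"]: row["state"] for row in rows}
    problems = {}
    for name in required:
        state = states.get(name, "MISSING")
        if state != "ONLINE":
            problems[name] = state
    return problems
```

Assuming a 5.x Python driver, the rows could come from `driver.execute_query("SHOW INDEXES YIELD name, state RETURN name, state").records`. An empty dict means ingestion can proceed.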
Core Pattern
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import HybridCypherRetriever
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
driver = GraphDatabase.driver("neo4j+s://<host>", auth=("neo4j", "<password>"))
embedder = OpenAIEmbeddings()
# retrieval_query: Cypher fragment executed after vector lookup.
# `node` and `score` are AUTO-INJECTED by the retriever — do NOT declare them.
# Additional parameters can be passed via query_params={} in retriever.search().
retrieval_query = """
MATCH (node)<-[:HAS_CHUNK]-(article:Article)
OPTIONAL MATCH (article)-[:MENTIONS]->(org:Organization)
RETURN node.text AS chunk_text,
article.title AS article_title,
collect(DISTINCT org.name) AS mentioned_organizations,
score
"""
retriever = HybridCypherRetriever(
driver=driver,
vector_index_name="chunk_embedding",
fulltext_index_name="chunk_fulltext",
retrieval_query=retrieval_query,
embedder=embedder,
)
rag = GraphRAG(retriever=retriever, llm=OpenAILLM(model_name="gpt-4o"))
print(rag.search("Who does Alice work for?").answer)
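The comment about `query_params` can be made concrete with a parameterised variant of the query. This is a sketch under assumptions: the `$min_year` parameter, the `article.year` property, and the `search_since` wrapper are all hypothetical and do not appear in the schema above. The keyword arguments passed to `retriever.search()` are `query_text`, `top_k`, and `query_params`.

```python
# Hypothetical variant of the retrieval_query above: `$min_year` is a named
# parameter and `article.year` is an assumed property, neither of which is
# part of the original schema.
FILTERED_RETRIEVAL_QUERY = """
MATCH (node)<-[:HAS_CHUNK]-(article:Article)
WHERE article.year >= $min_year
RETURN node.text AS chunk_text,
       article.title AS article_title,
       score
"""


def search_since(retriever, question: str, min_year: int, top_k: int = 5):
    """Call retriever.search(), supplying $min_year through query_params."""
    return retriever.search(
        query_text=question,
        top_k=top_k,
        query_params={"min_year": min_year},
    )
```

The retriever would be constructed with `retrieval_query=FILTERED_RETRIEVAL_QUERY`; every `$name` in the fragment then needs a matching key in `query_params` at search time.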
Knowledge Graph Construction
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
import asyncio
pipeline = SimpleKGPipeline(
llm=OpenAILLM(model_name="gpt-4o"),
driver=driver,
embedder=embedder,
entities=["Person", "Organization", "Location"],
relations=["WORKS_AT", "LOCATED_IN", "KNOWS"],
on_error="IGNORE",
)
# document_text must hold the raw text of the document to ingest; define it before this call.
asyncio.run(pipeline.run_async(text=document_text))
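When several documents need to be ingested, one option is to run them through the same pipeline inside a single event loop rather than calling `asyncio.run` once per document. A minimal sketch, assuming only that the pipeline exposes `run_async(text=...)` as shown above; `ingest_all` is a hypothetical helper name.

```python
import asyncio


async def ingest_all(pipeline, texts):
    """Run pipeline.run_async(text=...) for each document, one at a time.

    Sequential on purpose: it keeps LLM and database load predictable.
    Returns whatever each run_async call returns, in input order.
    """
    results = []
    for text in texts:
        results.append(await pipeline.run_async(text=text))
    return results
```

Usage would look like `asyncio.run(ingest_all(pipeline, [first_text, second_text]))`, where both variables are placeholders for document strings.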
Embedding Dimension Note
Embedding dimensions must match the vector index. If you switch embedding models, drop and recreate the vector index and re-embed all chunks. Changing vector.dimensions on an existing index is not supported.
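A mismatch can be caught before ingestion by embedding a short probe string and comparing its length to the index setting. The sketch below assumes the embedder exposes `embed_query(text)` returning a list of floats; `embedding_dimension_mismatch` and the probe text are hypothetical.

```python
def embedding_dimension_mismatch(embedder, index_dimensions: int,
                                 probe: str = "dimension probe"):
    """Return None if the embedder matches the index, else an error message.

    Assumes `embedder` exposes embed_query(text) returning a list of floats.
    """
    actual = len(embedder.embed_query(probe))
    if actual == index_dimensions:
        return None
    return f"embedder produces {actual} dimensions, index expects {index_dimensions}"
```

With a hosted embedder this costs one API call, so it is best run once at startup rather than per document.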
Checklist
- Vector index and fulltext index created before ingesting data
- `retrieval_query` uses `node` and `score` variables (provided by retriever)
- `retrieval_query` returns at least `score` column
- Embedding dimensions match `vector.dimensions` in index config
- `query_params` passed to `retriever.search()` when `retrieval_query` uses named params
- `neo4j-genai` (old name) replaced with `neo4j-graphrag` in requirements
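The `retrieval_query` items in the checklist can be approximated with a small lint function. This is a rough, regex-based sketch rather than a Cypher parser, so it can miss edge cases such as names inside string literals. `lint_retrieval_query` is a hypothetical helper.

```python
import re


def lint_retrieval_query(retrieval_query: str, query_params=None) -> list:
    """Return a list of checklist violations found in a retrieval_query."""
    query_params = query_params or {}
    problems = []
    # The retriever injects `node`; a query that never mentions it is suspect.
    if not re.search(r"\bnode\b", retrieval_query):
        problems.append("query never references `node`")
    # `score` should appear somewhere after the final RETURN keyword.
    parts = re.split(r"\bRETURN\b", retrieval_query, flags=re.IGNORECASE)
    if len(parts) < 2 or not re.search(r"\bscore\b", parts[-1]):
        problems.append("RETURN clause does not include `score`")
    # Every $name placeholder needs a matching query_params key.
    for name in sorted(set(re.findall(r"\$(\w+)", retrieval_query))):
        if name not in query_params:
            problems.append(f"missing query_params entry for ${name}")
    return problems
```

An empty list means the fragment passes these three checks; it does not guarantee the query is valid Cypher.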
Fetching Current Docs
https://neo4j.com/docs/llms.txt ← full documentation index
https://neo4j.com/llms-full.txt ← rich reference with code examples