Knowledge Graph Builder

Overview

Knowledge graphs make implicit relationships explicit, enabling AI systems to reason about connections, verify facts, and reduce hallucinations. They combine structured entity-relationship modeling with semantic search, so retrieval can follow explicit relationships as well as semantic similarity.

When to use: Complex entity relationships are central to the domain, AI-generated facts must be verified against structured knowledge, semantic search needs to be combined with relationship traversal, or the application involves recommendations, fraud detection, or pattern recognition.

When NOT to use: Simple tabular data (use a relational database), purely document-based search with no relationships (use the rag-implementer skill), read-heavy workloads with no traversal needs, or a team that lacks graph modeling expertise. For knowledge base architecture selection and governance, use the knowledge-base-manager skill.

Quick Reference

Pattern	Approach	Key Points
Ontology first	Define entity types, relationships, properties before ingesting data	Changing schema later is expensive; validate with domain experts
Entity resolution	Deduplicate aggressively during extraction	"Apple Inc" = "Apple" = "Apple Computer" must resolve to one entity
Confidence scoring	Attach 0.0-1.0 score + source to every relationship	Enables filtering by reliability, critical for AI grounding
Hybrid architecture	Graph traversal (structured) + vector search (semantic)	Vector finds candidates, graph expands context via relationships
Incremental build	Core entities first, validate against target queries, then expand	Avoid building the full graph before testing with real queries
Database selection	Neo4j (general), Neptune (AWS managed), ArangoDB (multi-model), TigerGraph (massive scale)	Match database to scale, infrastructure, and query complexity
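
The entity resolution and confidence scoring rows above can be illustrated with a small in-memory sketch. Everything here is hypothetical: the ALIASES table, the Relationship and Graph classes, the source labels, and the sample data are illustrative assumptions, not part of any graph database API. A real pipeline would likely derive aliases from curated data or a matching model.

```python
from dataclasses import dataclass, field

# Hypothetical alias table (assumption): surface form -> canonical entity.
ALIASES = {
    "apple inc": "Apple",
    "apple": "Apple",
    "apple computer": "Apple",
}


def normalize(name: str) -> str:
    """Lowercase, drop periods, and collapse whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def resolve(name: str) -> str:
    """Map a surface form to its canonical entity, or keep it unchanged."""
    return ALIASES.get(normalize(name), name.strip())


@dataclass(frozen=True)
class Relationship:
    subject: str
    predicate: str
    obj: str
    confidence: float  # 0.0-1.0, as in the confidence scoring row
    source: str

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


@dataclass
class Graph:
    relationships: list = field(default_factory=list)

    def add(self, subject, predicate, obj, confidence, source):
        # Resolve both endpoints before storing, so duplicates never enter.
        rel = Relationship(resolve(subject), predicate, resolve(obj), confidence, source)
        self.relationships.append(rel)
        return rel

    def reliable(self, threshold):
        """Return relationships whose confidence meets the threshold."""
        return [r for r in self.relationships if r.confidence >= threshold]


# Illustrative data only; sources and scores are made up for the sketch.
graph = Graph()
graph.add("Apple Inc.", "MAKES", "iPhone", 0.95, "annual-report")
graph.add("Apple Computer", "HEADQUARTERED_IN", "Cupertino", 0.60, "blog-post")
graph.add("apple", "COMPETES_WITH", "Samsung", 0.80, "news-article")

subjects = {r.subject for r in graph.relationships}
trusted = graph.reliable(0.75)
```

All three surface forms resolve to the single subject "Apple", and filtering at 0.75 keeps only the two higher-confidence relationships, which is the kind of reliability filter an AI grounding step could apply.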

Common Mistakes

Mistake	Correct Pattern
Ingesting entities before designing the ontology	Define and validate the ontology with domain experts first; changing later is expensive
Skipping entity resolution and deduplication	Deduplicate aggressively so "Apple Inc", "Apple", and "Apple Computer" resolve to one entity
Omitting confidence scores on relationships	Attach a 0.0-1.0 confidence score and source to every relationship
Using only graph traversal without vector search	Implement hybrid architecture combining graph traversal with semantic vector search
Building the full graph before validating with real queries	Start with core entities, test against target queries, then expand incrementally
Choosing a database before understanding scale requirements	Evaluate query patterns, data volume, and infrastructure constraints before selecting
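
The hybrid pattern (vector search finds candidates, graph traversal expands context) might look roughly like the following sketch. It assumes toy three-dimensional embeddings and a plain adjacency dictionary in place of a real embedding model and graph database; EMBEDDINGS, EDGES, and all function names are hypothetical.

```python
import math

# Toy embeddings (assumption): stand-ins for vectors from a real embedding model.
EMBEDDINGS = {
    "Alice": [0.9, 0.1, 0.0],
    "Acme Corp": [0.1, 0.9, 0.0],
    "Invoice 17": [0.0, 0.2, 0.9],
}

# Toy graph (assumption): node -> list of (relationship, neighbor) pairs.
EDGES = {
    "Alice": [("WORKS_AT", "Acme Corp")],
    "Acme Corp": [("ISSUED", "Invoice 17")],
    "Invoice 17": [],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_candidates(query_vec, k):
    """Semantic step: return the k entities most similar to the query."""
    ranked = sorted(EMBEDDINGS, key=lambda n: cosine(query_vec, EMBEDDINGS[n]), reverse=True)
    return ranked[:k]


def expand(seeds, hops):
    """Structured step: breadth-first expansion collecting (subject, predicate, object) facts."""
    seen = set(seeds)
    frontier = list(seeds)
    facts = []
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for predicate, neighbor in EDGES.get(node, []):
                facts.append((node, predicate, neighbor))
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return facts


def hybrid_search(query_vec, k=1, hops=2):
    seeds = vector_candidates(query_vec, k)
    return seeds, expand(seeds, hops)


seeds, facts = hybrid_search([1.0, 0.0, 0.0])
```

With this toy data, the query vector is closest to "Alice", and two hops of traversal pull in the employer and the invoice it issued, context that similarity search alone would have ranked low.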

Delegation

Extract entities and relationships from unstructured text: Use the Task agent to run NER pipelines and build relationship triples
Evaluate graph database options for project requirements: Use the Explore agent to compare Neo4j, Neptune, ArangoDB, and TigerGraph against scale and query needs
Design ontology and hybrid architecture for a new domain: Use the Plan agent to define entity types, relationship schemas, and graph-vector integration strategy
For hybrid KG+RAG systems, delegate to the rag-implementer skill
For knowledge-graph-powered agent workflows, delegate to the agent-patterns skill
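
Whichever agent produces them, extracted relationship triples can be checked against the ontology before ingestion. The sketch below is one possible approach under stated assumptions: the ONTOLOGY and ENTITY_TYPES dictionaries, the validate function, and the sample triples are all hypothetical and only illustrate the idea of rejecting triples that do not fit the agreed schema.

```python
# Hypothetical ontology (assumption):
# relationship type -> (allowed subject type, allowed object type).
ONTOLOGY = {
    "WORKS_AT": ("Person", "Organization"),
    "ISSUED": ("Organization", "Invoice"),
}

# Hypothetical entity typing, e.g. the output of an NER step.
ENTITY_TYPES = {
    "Alice": "Person",
    "Acme Corp": "Organization",
    "Invoice 17": "Invoice",
}


def validate(triple):
    """Return a list of problems; an empty list means the triple fits the ontology."""
    subject, predicate, obj = triple
    if predicate not in ONTOLOGY:
        return [f"unknown relationship type: {predicate}"]
    problems = []
    expected_subject, expected_object = ONTOLOGY[predicate]
    checks = (("subject", subject, expected_subject), ("object", obj, expected_object))
    for role, entity, expected in checks:
        actual = ENTITY_TYPES.get(entity)
        if actual != expected:
            problems.append(f"{role} {entity!r} is {actual}, expected {expected}")
    return problems


# Illustrative triples: one valid, one with a mistyped subject, one off-schema.
triples = [
    ("Alice", "WORKS_AT", "Acme Corp"),
    ("Invoice 17", "WORKS_AT", "Acme Corp"),
    ("Alice", "LIKES", "Acme Corp"),
]

accepted = [t for t in triples if not validate(t)]
rejected = {t: validate(t) for t in triples if validate(t)}
```

Only the first triple is accepted. Rejected triples keep their reasons, which could be reviewed with domain experts to decide whether the extraction or the ontology needs to change.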

References