Knowledge Graph Builder
This skill provides guidance for designing knowledge graphs that capture entities, relationships, and semantic meaning for powerful querying and reasoning.
Core Competencies
- Graph Modeling: Entity-relationship design for graphs
- Query Languages: Cypher (Neo4j), SPARQL (RDF), Gremlin
- Ontology Design: Schema, taxonomies, semantic relationships
- Graph Algorithms: Pathfinding, centrality, community detection
Knowledge Graph Fundamentals
What Makes a Knowledge Graph
Knowledge Graph = Entities + Relationships + Schema + Semantics
Traditional Database:       Knowledge Graph:
┌────────────────────┐      ┌─────────────────────────────┐
│ Tables with rows   │      │ (Person)──KNOWS──▶(Person)  │
│ Foreign keys       │  vs  │     │                       │
│ JOIN operations    │      │  WORKS_AT                   │
│                    │      │     ▼                       │
└────────────────────┘      │ (Company)──IN──▶(Industry)  │
                            └─────────────────────────────┘
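To make the diagram above concrete, here is a minimal in-memory sketch in Python of the same idea: typed nodes, typed directed edges, and a traversal in place of a JOIN. The `Node` and `Edge` classes, the `follow` and `industries_of` helpers, and the sample data are all hypothetical names introduced for illustration; a real deployment would use a graph database rather than Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    labels: set[str]
    props: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Edge:
    src: str
    type: str
    dst: str


# Illustrative sample data mirroring the diagram (made up for this sketch).
nodes = {
    "alice": Node("alice", {"Person"}, {"name": "Alice"}),
    "bob": Node("bob", {"Person"}, {"name": "Bob"}),
    "acme": Node("acme", {"Company"}, {"name": "Acme"}),
    "tech": Node("tech", {"Industry"}, {"name": "Technology"}),
}
edges = [
    Edge("alice", "KNOWS", "bob"),
    Edge("alice", "WORKS_AT", "acme"),
    Edge("acme", "IN", "tech"),
]


def follow(node_id: str, rel_type: str) -> list[str]:
    """Return ids reachable from node_id over one outgoing edge of rel_type."""
    return [e.dst for e in edges if e.src == node_id and e.type == rel_type]


def industries_of(person_id: str) -> list[str]:
    """Two-hop traversal: Person -WORKS_AT-> Company -IN-> Industry."""
    return [
        nodes[industry].props["name"]
        for company in follow(person_id, "WORKS_AT")
        for industry in follow(company, "IN")
    ]
```

The two-hop question ("which industries does this person work in?") is answered by walking edges, which is the access pattern a graph store is built around.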
When to Use Knowledge Graphs
| Use Case | Why Graphs Excel |
|---|---|
| Recommendation systems | Traverse connections to find related items |
| Fraud detection | Identify suspicious relationship patterns |
| Knowledge management | Connect concepts and infer relationships |
| Master data management | Unify entities across systems |
| Root cause analysis | Follow causal chains through dependencies |
Graph Data Modeling
Entity Design
Identify core entities (nodes):
// Person entity with properties
CREATE (p:Person {
  id: 'p001',
  name: 'Alice Chen',
  email: 'alice@example.com',
  created_at: datetime()
})
// Multiple labels for categorization
CREATE (c:Organization:Company:TechCompany {
  id: 'c001',
  name: 'Acme Corp',
  founded: 2010
})
Relationship Design
Model connections with typed, directed edges:
// The patterns below are fragments; use them inside a MATCH, CREATE, or MERGE clause
// Simple relationship
(person)-[:WORKS_AT]->(company)
// Relationship with properties
(person)-[:WORKS_AT {
  role: 'Engineer',
  start_date: date('2020-01-15'),
  department: 'Engineering'
}]->(company)
// Temporal relationships (omit `to` while the relationship is still current)
(person)-[:EMPLOYED_BY {
  from: date('2018-01-01'),
  to: date('2020-12-31')
}]->(company1)
(person)-[:EMPLOYED_BY {
  from: date('2021-01-01')
}]->(company2)
Common Relationship Patterns
Hierarchical:  (Child)──IS_CHILD_OF──▶(Parent)
               (Employee)──REPORTS_TO──▶(Manager)
Associative:   (Person)──KNOWS──▶(Person)
               (Document)──REFERENCES──▶(Document)
Temporal:      (Event)──PRECEDES──▶(Event)
               (Version)──SUPERSEDES──▶(Version)
Categorical:   (Product)──BELONGS_TO──▶(Category)
               (Concept)──IS_A──▶(Category)
Spatial:       (Location)──NEAR──▶(Location)
               (Region)──CONTAINS──▶(City)
Schema Definition
// Node constraints
CREATE CONSTRAINT person_id IF NOT EXISTS
  FOR (p:Person) REQUIRE p.id IS UNIQUE;
CREATE CONSTRAINT company_id IF NOT EXISTS
  FOR (c:Company) REQUIRE c.id IS UNIQUE;
// Property existence (requires Neo4j Enterprise Edition)
CREATE CONSTRAINT person_name IF NOT EXISTS
  FOR (p:Person) REQUIRE p.name IS NOT NULL;
// Indexes for query performance
CREATE INDEX person_name_idx IF NOT EXISTS
  FOR (p:Person) ON (p.name);
CREATE INDEX company_industry_idx IF NOT EXISTS
  FOR (c:Company) ON (c.industry);
Cypher Query Patterns
Basic Traversal
// Find all colleagues (people who work at same company)
MATCH (person:Person {name: 'Alice Chen'})-[:WORKS_AT]->(company)
      <-[:WORKS_AT]-(colleague:Person)
WHERE colleague <> person
RETURN colleague.name, company.name
// Variable-length paths (1-3 hops)
MATCH path = (start:Person)-[:KNOWS*1..3]->(end:Person)
WHERE start.name = 'Alice Chen' AND end.name = 'Bob Smith'
RETURN path, length(path) as hops
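What a bounded variable-length match does can be sketched in plain Python. The function below enumerates directed paths of 1 to `max_hops` edges without reusing an edge, which approximates the `[:KNOWS*1..3]` pattern above. The `paths_within` name, the adjacency-dict representation, and the sample data are assumptions made for this sketch, not part of any Neo4j API.

```python
def paths_within(knows, start, end, max_hops=3):
    """Return directed paths from start to end that use 1..max_hops edges.

    `knows` maps a person to the list of people they KNOW (directed).
    An edge is never used twice within one path.
    """
    found = []

    def walk(node, path, used_edges):
        hops = len(path) - 1
        if hops >= 1 and node == end:
            found.append(path)
        if hops == max_hops:
            return  # the upper bound stops the search from running away
        for nxt in knows.get(node, ()):
            edge = (node, nxt)
            if edge not in used_edges:
                walk(nxt, path + [nxt], used_edges | {edge})

    walk(start, [start], frozenset())
    return found


# Hypothetical sample data.
knows = {
    "alice": ["bob", "carol"],
    "carol": ["dave"],
    "dave": ["bob"],
    "bob": [],
}
for path in paths_within(knows, "alice", "bob"):
    print(path, "hops:", len(path) - 1)
```

The number of candidate paths grows quickly with the hop limit, which is why an explicit upper bound matters.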
Aggregation
// Count relationships
MATCH (p:Person)-[:WORKS_AT]->(c:Company)
RETURN c.name, count(p) as employee_count
ORDER BY employee_count DESC
// Collect into lists
MATCH (p:Person)-[:HAS_SKILL]->(s:Skill)
RETURN p.name, collect(s.name) as skills
Recommendations
// "People you may know" - friends of friends
MATCH (me:Person {id: $userId})-[:KNOWS]-(friend)-[:KNOWS]-(suggestion)
WHERE NOT (me)-[:KNOWS]-(suggestion) AND me <> suggestion
RETURN suggestion.name, count(DISTINCT friend) as mutual_friends
ORDER BY mutual_friends DESC
LIMIT 10
// Content-based: similar interests
MATCH (me:Person {id: $userId})-[:INTERESTED_IN]->(topic)
      <-[:INTERESTED_IN]-(similar:Person)
WHERE me <> similar
WITH similar, count(topic) as shared_interests
ORDER BY shared_interests DESC
RETURN similar.name, shared_interests
LIMIT 10
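The "people you may know" logic above can be sketched in Python to show what the query computes: walk two hops, drop yourself and existing friends, and rank by the number of mutual friends. The `people_you_may_know` function and the sample data are hypothetical; `knows` is assumed to be an undirected adjacency map of sets, so each mutual friend is counted once.

```python
from collections import Counter


def people_you_may_know(knows, me, limit=10):
    """Rank friends-of-friends of `me` by number of mutual friends."""
    friends = knows.get(me, set())
    mutual = Counter()
    for friend in friends:
        for candidate in knows.get(friend, set()):
            # Skip myself and people I already know.
            if candidate != me and candidate not in friends:
                mutual[candidate] += 1
    return mutual.most_common(limit)


# Hypothetical sample data (undirected: each edge is listed both ways).
knows = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
    "erin": {"bob"},
}
print(people_you_may_know(knows, "alice"))
```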
Path Analysis
// Shortest path (bounded, in line with the performance tips)
MATCH path = shortestPath(
  (start:Person {name: 'Alice'})-[:KNOWS*..15]-(end:Person {name: 'Bob'})
)
RETURN path, length(path)
// All shortest paths
MATCH path = allShortestPaths(
  (start:Person)-[:KNOWS*..15]-(end:Person)
)
WHERE start.name = 'Alice' AND end.name = 'Bob'
RETURN path
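For unweighted relationships, a shortest path is what breadth-first search finds. The Python sketch below illustrates that idea; it does not describe how Neo4j implements `shortestPath` internally. The `shortest_path` function, the adjacency dict, and the sample data are assumptions for this sketch.

```python
from collections import deque


def shortest_path(adj, start, goal):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt in parent:
                continue  # already reached by a path that is at least as short
            parent[nxt] = node
            if nxt == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


# Hypothetical undirected KNOWS graph.
adj = {
    "alice": ["carol", "dave"],
    "carol": ["alice", "bob"],
    "dave": ["alice", "erin"],
    "erin": ["dave", "bob"],
    "bob": ["carol", "erin"],
}
path = shortest_path(adj, "alice", "bob")
print(path, "length:", len(path) - 1)
```

As with `length(path)` in Cypher, the path length is the number of relationships, one less than the number of nodes.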
Graph Algorithms
Centrality Measures
| Algorithm | Purpose | Use Case |
|---|---|---|
| Degree | Connection count | Find popular nodes |
| Betweenness | Bridge detection | Find brokers/bottlenecks |
| PageRank | Influence propagation | Rank importance |
| Closeness | Average distance | Find well-connected nodes |
// Using Neo4j Graph Data Science (assumes 'myGraph' was already projected, e.g. with gds.graph.project)
CALL gds.pageRank.stream('myGraph')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
LIMIT 10
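To show what "influence propagation" means, here is a textbook power-iteration PageRank in Python. It is a sketch of the general algorithm, not the GDS implementation: this variant normalizes scores to sum to 1, so absolute values may not match `gds.pageRank` output. The `pagerank` function, its parameters, and the sample links are assumptions for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of linked nodes."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no out-links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in out_links.items():
            for t in targets:
                # Each node passes an equal share of its rank along every out-link.
                new_rank[t] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Hypothetical link structure: three nodes point at "c", which points at "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(links)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 3))
```

A node ranks highly when it is linked from many nodes or from a few highly ranked ones, which is why "a" scores well here despite having a single inbound link.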
Community Detection
// Louvain for community detection
CALL gds.louvain.stream('myGraph')
YIELD nodeId, communityId
RETURN communityId, collect(gds.util.asNode(nodeId).name) as members
ORDER BY size(members) DESC
Knowledge Graph Patterns
Entity Resolution
// Find potential duplicates (compares every pair, so restrict the MATCH on large graphs)
MATCH (p1:Person), (p2:Person)
WHERE p1.id < p2.id
  AND (p1.email = p2.email
    OR (p1.name = p2.name AND p1.birth_date = p2.birth_date))
RETURN p1, p2
// Merge duplicates (requires the APOC plugin)
MATCH (p1:Person {id: 'keep'}), (p2:Person {id: 'duplicate'})
CALL apoc.refactor.mergeNodes([p1, p2], {
  properties: 'combine',
  mergeRels: true
})
YIELD node
RETURN node
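The duplicate-finding query above compares all pairs of people. One common way to avoid that cost is blocking: group records by a matching key first and only pair up records inside each group. The Python sketch below applies the same two matching rules (same email, or same name and birth date); the `duplicate_candidates` function, the record layout, and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def duplicate_candidates(people):
    """Return sorted (id, id) pairs sharing an email, or a name plus birth date."""
    blocks = defaultdict(list)
    for person in people:
        if person.get("email"):
            blocks[("email", person["email"])].append(person["id"])
        if person.get("name") and person.get("birth_date"):
            key = ("name_dob", person["name"], person["birth_date"])
            blocks[key].append(person["id"])
    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block are ever compared.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return sorted(pairs)


# Hypothetical sample records.
people = [
    {"id": "p001", "name": "Alice Chen", "email": "alice@example.com"},
    {"id": "p002", "name": "A. Chen", "email": "alice@example.com"},
    {"id": "p003", "name": "Bob Smith", "birth_date": "1990-01-01"},
    {"id": "p004", "name": "Bob Smith", "birth_date": "1990-01-01",
     "email": "bob@example.com"},
    {"id": "p005", "name": "Carol Diaz", "email": "carol@example.com"},
]
print(duplicate_candidates(people))
```

Like `p1.id < p2.id` in the query, sorting the ids inside each block ensures every pair is reported once.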
Semantic Layering
┌─────────────────────────────────────────────────────┐
│  Instance Layer                                     │
│    (Alice)──KNOWS──▶(Bob)                           │
│    (Alice)──WORKS_AT──▶(Acme)                       │
├─────────────────────────────────────────────────────┤
│  Schema Layer                                       │
│    (:Person)──CAN_KNOW──▶(:Person)                  │
│    (:Person)──CAN_WORK_AT──▶(:Company)              │
├─────────────────────────────────────────────────────┤
│  Ontology Layer                                     │
│    (Person)──IS_A──▶(Agent)                         │
│    (Company)──IS_A──▶(Organization)                 │
└─────────────────────────────────────────────────────┘
Temporal Modeling
// State over time (bind the existing person first; a bare CREATE would add a new, empty node)
MATCH (person:Person {id: $personId})
CREATE (person)-[:HAS_STATE {
  valid_from: date('2020-01-01'),
  valid_to: date('2020-12-31')
}]->(state:PersonState {
  status: 'employed',
  salary: 80000
});
// Query state at point in time
MATCH (p:Person {id: $personId})-[r:HAS_STATE]->(s)
WHERE r.valid_from <= date($queryDate)
  AND (r.valid_to IS NULL OR r.valid_to >= date($queryDate))
RETURN s
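The point-in-time filter above is an interval check in which a missing `valid_to` means "still current". A small Python sketch of the same predicate follows; the `state_at` function and the sample states (including the open-ended second state) are assumptions made for illustration.

```python
from datetime import date


def state_at(states, query_date):
    """Return the states whose validity interval contains query_date."""
    return [
        s for s in states
        if s["valid_from"] <= query_date
        and (s["valid_to"] is None or s["valid_to"] >= query_date)
    ]


# Hypothetical state history for one person.
states = [
    {"status": "employed", "salary": 80000,
     "valid_from": date(2020, 1, 1), "valid_to": date(2020, 12, 31)},
    {"status": "contractor",
     "valid_from": date(2021, 1, 1), "valid_to": None},  # open-ended: current
]
print(state_at(states, date(2020, 6, 15)))
```

Both bounds are inclusive, matching the `<=` and `>=` comparisons in the query.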
Best Practices
Modeling Guidelines
- Prefer relationships over properties when the connection has meaning
- Use specific relationship types (:MANAGES not :RELATED_TO)
- Model for your queries - understand access patterns first
- Keep properties atomic - no arrays for searchable data
- Version nodes, not graphs - temporal properties on relationships
Performance Tips
- Index properties used in WHERE clauses
- Use parameters ($userId) not string concatenation
- Limit variable-length paths (*1..5 not *)
- Profile queries with EXPLAIN and PROFILE
- Consider relationship direction in traversals
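The parameter tip can be illustrated from the application side. In the Python sketch below the query text is a constant and user input travels only in the parameter map, so it is never spliced into the Cypher string. The `friends_request` helper and `FRIENDS_QUERY` are hypothetical names; the commented `session.run` call assumes the official `neo4j` Python driver is installed and a session is open.

```python
FRIENDS_QUERY = """
MATCH (me:Person {id: $userId})-[:KNOWS]-(friend)
RETURN friend.name
LIMIT $limit
"""


def friends_request(user_id, limit=10):
    """Return (query, parameters); user input never touches the query text."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return FRIENDS_QUERY, {"userId": user_id, "limit": limit}


# Even hostile input stays inert because it is only ever a parameter value.
query, params = friends_request("p001' OR 1=1 //")
print(params)

# With the neo4j driver (assumed installed) this would be passed as:
# session.run(query, params)
```

Keeping the query text constant also lets the database reuse a cached execution plan across calls.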
References
- references/cypher-patterns.md - Advanced Cypher query examples
- references/graph-modeling.md - Entity and relationship design patterns
- references/graph-algorithms.md - Algorithm selection and configuration