# Memory Systems Skill

Design and implement long-term memory systems for AI agents.
## The Context-Memory Spectrum

Memory exists on a spectrum from ephemeral to permanent:

```
Ephemeral ◄────────────────────────────────────► Permanent

Context Window    Short-term    Long-term      Knowledge
(disappears)      Cache         Memory         Base
                  (session)     (weeks)        (forever)
```
### When to Use What
| Memory Type | Duration | Use Case |
|---|---|---|
| Context window | Single turn | Immediate task context |
| Short-term cache | Session | Conversation history |
| Long-term memory | Weeks/months | User preferences, learnings |
| Knowledge base | Permanent | Facts, documentation, procedures |
## Memory Architecture Options

### 1. Vector RAG (Retrieval-Augmented Generation)
Store embeddings, retrieve by semantic similarity.
Pros:
- Simple to implement
- Works well for document retrieval
- Scales to millions of documents
Cons:
- No relationships between items
- No temporal awareness (old and new memories rank the same)
- Can retrieve irrelevant but similar content
Best for: Document search, FAQ systems, code search
### 2. Knowledge Graphs
Store entities and relationships explicitly.
Pros:
- Captures relationships
- Supports reasoning
- No similarity confusion
Cons:
- Complex to build and maintain
- Requires structured data
- More expensive queries
Best for: Domain modeling, reasoning tasks, complex queries
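A minimal sketch of the triple-store idea behind knowledge graphs (the `TripleStore` class is hypothetical; `None` acts as a wildcard, loosely like a single SPARQL triple pattern):

```python
class TripleStore:
    """Toy knowledge graph: facts stored as (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, s: str, p: str, o: str):
        self.triples.add((s, p, o))

    def query(self, s=None, p=None, o=None):
        # None in any position matches anything
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]
```

Because relationships are explicit, retrieval is exact pattern matching rather than fuzzy similarity, which is where the "supports reasoning" advantage comes from.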
### 3. Temporal Knowledge Graphs
Knowledge graphs with time-based relationships.
Pros:
- Tracks how knowledge evolves
- Supports "as of" queries
- Captures causality
Cons:
- Most complex option
- Storage grows over time
- Query complexity
Best for: Historical analysis, change tracking, audit trails
### 4. Hybrid Approaches

Combine vector + graph for the best of both:

```
Query ──▶ Vector Search ──▶ Top K candidates
                                  │
                                  ▼
                           Graph Traversal ──▶ Related entities
                                  │
                                  ▼
                           Re-ranking ──▶ Final results
```
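The pipeline can be sketched as a function that composes three pluggable stages; all three callables are stand-ins for real components (an embedding search, a graph store, a cross-encoder reranker):

```python
def hybrid_retrieve(query, vector_search, graph_neighbors, rerank, k=10):
    """Hybrid retrieval: vector recall, graph expansion, then re-ranking.

    vector_search(query, k) -> list of candidate items
    graph_neighbors(item)   -> list of related items
    rerank(query, items)    -> items sorted by relevance
    """
    candidates = vector_search(query, k)        # top-K semantic matches
    expanded = list(candidates)
    for c in candidates:
        expanded.extend(graph_neighbors(c))     # pull in graph-related entities

    # Deduplicate while preserving first-seen order
    seen, unique = set(), []
    for item in expanded:
        if item not in seen:
            seen.add(item)
            unique.append(item)

    return rerank(query, unique)[:k]
```

The graph-expansion step is what lets a hybrid system surface items that are relevant by relationship but not by surface similarity.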
## Performance Benchmarks
Research benchmarks for memory systems (2024 data):
| System | Recall@10 | Latency (P50) | Cost/Query |
|---|---|---|---|
| Zep | 94.8% | 45ms | $0.0001 |
| MemGPT | 93.4% | 120ms | $0.0003 |
| LangChain Memory | 87.2% | 80ms | $0.0002 |
| Simple RAG | 78.5% | 30ms | $0.00005 |
### Key Insights
- Zep excels at conversation memory with entity extraction
- MemGPT best for complex reasoning over memory
- Simple RAG sufficient for most document retrieval
- Hybrid approaches win for complex queries
## What's Included
### Examples (examples/)
- Conversation memory - Storing and retrieving chat history
- Entity memory - Tracking entities mentioned in conversations
- Knowledge base integration - Connecting to Grey Haven KB
### Reference Guides (reference/)
- Architecture patterns - When to use each memory type
- Embedding strategies - Chunking, models, dimensions
- Grey Haven integration - Using with knowledge-base agents
### Checklists (checklists/)
- Memory system selection - Choose the right architecture
- Implementation checklist - Before deploying memory
## Grey Haven Knowledge Base Agents

This skill complements the knowledge-base agents:

| Agent | Purpose |
|---|---|
| memory-architect | Design memory storage, semantic search |
| knowledge-curator | Create and organize knowledge entries |
| ontology-builder | Map relationships between entries |
| kb-search-analyzer | Search and synthesize from KB |
| kb-entry-creator | Create structured KB entries |
| kb-validator | Validate KB integrity |
| kb-manifest-generator | Generate KB indexes |
| kb-ontology-mapper | Visualize knowledge structure |
## Implementation Patterns

### Pattern 1: Conversation Memory

```python
class ConversationMemory:
    """Combines a sliding-window buffer, a vector store, and an entity store.

    VectorStore, EntityStore, and extract_entities are project-specific helpers.
    """

    def __init__(self):
        self.short_term = []            # Last N messages
        self.long_term = VectorStore()  # Semantic search
        self.entities = EntityStore()   # Mentioned entities

    def add_message(self, message: str, role: str):
        # Short-term: sliding window of the last 20 messages
        self.short_term.append({"role": role, "content": message})
        if len(self.short_term) > 20:
            self.short_term.pop(0)

        # Long-term: embed and store
        self.long_term.add(message, metadata={"role": role})

        # Entity extraction
        entities = extract_entities(message)
        self.entities.update(entities)

    def retrieve(self, query: str, k: int = 5) -> dict:
        # Combine recent context with semantically similar history
        recent = self.short_term[-5:]
        similar = self.long_term.search(query, k=k)
        entities = self.entities.get_relevant(query)
        return {
            "recent": recent,
            "similar": similar,
            "entities": entities,
        }
```
### Pattern 2: Entity Memory

```python
from datetime import datetime

class EntityMemory:
    """Tracks entities and their relationships across a conversation.

    EntityRecord is a project-specific record type.
    """

    def __init__(self):
        self.entities = {}       # entity_name -> EntityRecord
        self.relationships = []  # (entity1, relation, entity2) triples

    def update(self, entity: str, info: dict):
        if entity not in self.entities:
            self.entities[entity] = EntityRecord(entity)
        self.entities[entity].update(info)
        self.entities[entity].last_mentioned = datetime.now()

    def get_relationships(self, entity: str) -> list:
        # All triples in which the entity appears as subject or object
        return [r for r in self.relationships if entity in (r[0], r[2])]

    def get_context(self, entity: str) -> str:
        if entity not in self.entities:
            return ""
        record = self.entities[entity]
        related = self.get_relationships(entity)
        return f"""
Entity: {entity}
Type: {record.type}
Properties: {record.properties}
Related: {related}
Last mentioned: {record.last_mentioned}
"""
```
### Pattern 3: Tiered Memory

```python
class TieredMemory:
    """Promotes entries toward faster tiers on access (cold -> warm -> hot)."""

    def __init__(self):
        self.hot = LRUCache(100)       # Frequent access
        self.warm = VectorStore()      # Semantic search
        self.cold = PersistentStore()  # Rarely accessed

    def get(self, key: str):
        # Check hot first
        if key in self.hot:
            return self.hot[key]

        # Then warm
        result = self.warm.get(key)
        if result:
            self.hot[key] = result  # Promote
            return result

        # Finally cold
        result = self.cold.get(key)
        if result:
            self.warm.add(key, result)  # Promote
            return result

        return None
```
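The `LRUCache` used as the hot tier is not defined in the pattern; a minimal stdlib sketch built on `OrderedDict` could look like this (illustrative, not production code):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache suitable as the hot tier above."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def __contains__(self, key) -> bool:
        return key in self.data

    def __getitem__(self, key):
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def __setitem__(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

`OrderedDict.move_to_end` and `popitem(last=False)` give O(1) recency tracking and eviction without any extra bookkeeping.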
## Use This Skill When
- Designing persistent memory for AI agents
- Implementing RAG systems
- Building knowledge management systems
- Choosing between vector vs graph approaches
- Optimizing memory retrieval performance
- Integrating with Grey Haven knowledge base
## Related Skills

- context-management - Managing context in workflows
- data-modeling - Designing memory data structures
- llm-project-development - Building LLM applications
## Quick Start

```bash
# Understand architecture options
cat reference/architecture-patterns.md

# See implementation examples
cat examples/conversation-memory.md

# Use selection checklist
cat checklists/memory-selection-checklist.md
```
---

**Skill Version:** 1.0 | **Key Benchmark:** Zep 94.8% recall, 45ms latency | **Related Agents:** 8 knowledge-base agents | **Last Updated:** 2025-01-15