Memory Management
Memory management provides agents with a "brain" that persists beyond the current context window. It involves storing user preferences, conversation history, and factual knowledge in a database (like a Vector DB or SQL) and retrieving relevant information when needed. Without memory, an agent is amnesic, resetting after every session.
When to Use
- Personalization: Remembering user names, preferences, and past choices.
- Long-Running Tasks: Tracking progress on a project that spans days or weeks.
- Context Awareness: Understanding references to previous conversations ("As I mentioned earlier...").
- Learning: Improving performance by recalling past mistakes or feedback.
Use Cases
- Chatbots: Maintaining conversation history for context (Short-term memory).
- User Profiles: Storing "User is a vegetarian" in a profile database (Long-term memory).
- Knowledge Base: Accumulating facts learned from searching the web (Semantic memory).
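One way to keep these memory types distinct is to tag each record with its kind at write time. The sketch below illustrates the idea; the `MemoryRecord` name and fields are illustrative, not a standard API:

```python
from dataclasses import dataclass, field
import time

@dataclass
class MemoryRecord:
    content: str
    kind: str  # "short_term", "long_term", or "episodic"
    created_at: float = field(default_factory=time.time)

records = [
    MemoryRecord("User asked about refund policy", kind="short_term"),
    MemoryRecord("User is a vegetarian", kind="long_term"),
    MemoryRecord("Searched the web; learned the library shipped v2", kind="episodic"),
]

# Filter by kind when building context, e.g. only durable user facts:
long_term = [r.content for r in records if r.kind == "long_term"]
print(long_term)  # ['User is a vegetarian']
```

Tagging by kind also makes later policies (expiry, summarization) easy to apply per type.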
Implementation Pattern
```python
class Memory:
    def add(self, content, tags=()):
        # Store in Vector DB or SQL
        pass

    def retrieve(self, query, tags=()):
        # Search for relevant memories
        return []

def memory_augmented_agent(user_input, user_id):
    # Step 1: Recall relevant history and user facts
    context = memory.retrieve(query=user_input, tags=[user_id])

    # Step 2: Augment the prompt with the retrieved memories
    prompt = f"""
    Context from memory: {context}
    User Input: {user_input}
    Answer the user, taking into account their history.
    """

    # Step 3: Generate
    response = llm.generate(prompt)

    # Step 4: Memorize the new interaction for future sessions
    memory.add(f"User: {user_input} | Agent: {response}", tags=[user_id])
    return response
```
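The stubs above can be exercised end to end with a toy in-memory store. In this sketch (all names are illustrative), keyword overlap stands in for vector similarity and there is no real LLM involved:

```python
class InMemoryStore:
    """Toy memory store: keyword-overlap scoring instead of a Vector DB."""

    def __init__(self):
        self.items = []  # list of (tags, content) pairs

    def add(self, content, tags=()):
        self.items.append((set(tags), content))

    def retrieve(self, query, tags=(), k=3):
        wanted = set(tags)
        words = set(query.lower().split())
        # Score each memory by shared words with the query, filtered by tags
        scored = [
            (len(words & set(text.lower().split())), text)
            for item_tags, text in self.items
            if wanted <= item_tags
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:k] if score > 0]

store = InMemoryStore()
store.add("User prefers metric units", tags=["user-123"])
store.add("User lives in Lisbon", tags=["user-123"])
print(store.retrieve("what units does the user prefer?", tags=["user-123"]))
# ['User prefers metric units', 'User lives in Lisbon']
```

Swapping `InMemoryStore` for a real vector database changes only the scoring, not the recall-augment-generate-memorize loop.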
Examples
Input: A customer support agent needs to remember user preferences across sessions.
```python
# Write to memory (key-value style store, keyed by user ID)
memory.store("user:123:preferences", {"language": "Spanish", "tone": "formal"})

# Retrieve on next session
prefs = memory.retrieve("user:123:preferences")
response = agent.run(prompt, context=prefs)
```
Output: The agent greets the user in Spanish using formal language, without requiring them to re-specify preferences.
Input: "My agent keeps forgetting what we discussed earlier in a long conversation."
Fix: Implement a sliding window summary: every 10 turns, summarize the conversation so far and store it as a compressed context document. Inject this summary at the start of each new prompt.
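A minimal sketch of that fix, with `summarize` as a placeholder for the LLM call that would actually compress the transcript:

```python
def summarize(turns):
    # Placeholder: a real implementation would ask the LLM for a summary.
    return f"[summary of {len(turns)} turns]"

class Conversation:
    def __init__(self, window=10):
        self.window = window
        self.summary = ""  # compressed context from older turns
        self.recent = []   # raw turns since the last summary

    def add_turn(self, turn):
        self.recent.append(turn)
        if len(self.recent) >= self.window:
            # Fold the old summary and recent turns into a new summary
            older = ([self.summary] if self.summary else []) + self.recent
            self.summary = summarize(older)
            self.recent = []

    def build_prompt(self, user_input):
        # Inject the compressed summary ahead of the raw recent turns
        parts = [self.summary] if self.summary else []
        return "\n".join(parts + self.recent + [f"User: {user_input}"])

conv = Conversation(window=3)
for turn in ["hi", "tell me about memory", "and retrieval?"]:
    conv.add_turn(turn)
print(conv.build_prompt("recap please"))
```

The prompt stays bounded: old turns survive only as the compressed summary, while the last few turns remain verbatim.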
Troubleshooting
| Problem | Cause | Fix |
|---|---|---|
| Agent retrieves wrong memories | Similarity threshold too low | Raise cosine similarity threshold to ≥0.8 for semantic retrieval |
| Memory grows unbounded | No expiry policy | Implement TTL on episodic memory; archive after 30 days |
| Context window overflow | Too much memory injected | Use summarization; only inject top-3 most relevant memories |
| Agent ignores stored memories | Memory not injected into prompt | Ensure retrieved context is passed before the user message, not after |
| Stale preferences causing errors | No invalidation on update | Add a last_modified timestamp; re-retrieve if > N days old |
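The first table row's fix, thresholded semantic retrieval, can be sketched with toy embedding vectors; a real system would use a model's embeddings rather than the hand-written vectors below:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

THRESHOLD = 0.8  # memories scoring below this are treated as noise

memories = [
    ([1.0, 0.0, 0.1], "User prefers dark mode"),
    ([0.0, 1.0, 0.0], "User asked about billing"),
]
query_vec = [0.9, 0.1, 0.1]

hits = [text for vec, text in memories if cosine(query_vec, vec) >= THRESHOLD]
print(hits)  # ['User prefers dark mode']
```

Raising the threshold trades recall for precision: fewer memories are injected, but the ones that are injected are more likely to be relevant.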