# Lakebase Memory Patterns for Stateful Agents

## When to Use
Use this skill when implementing stateful agents that need:
- **Short-term memory**: Conversation continuity within a session (`thread_id`)
- **Long-term memory**: User preferences and insights across sessions (`user_id`)
- **Graceful degradation**: Agent works without memory tables
- **Thread ID resolution**: Priority-based `thread_id` extraction from request context
## Upstream: Lakebase Provisioned Patterns

The upstream `databricks-lakebase-provisioned` skill in AI-Dev-Kit provides comprehensive Lakebase patterns:

### OAuth Token Refresh (Production)

Tokens expire after 1 hour. Production apps MUST implement token refresh:
```python
from databricks.sdk import WorkspaceClient
import uuid

w = WorkspaceClient()
cred = w.database.generate_database_credential(
    request_id=str(uuid.uuid4()),
    instance_names=["my-lakebase-instance"],
)
token = cred.token  # Use as password; expires in 1 hour
```
For long-running apps, implement a background refresh loop every 50 minutes.
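One way to sketch that refresh loop, assuming only the credential call shown above: `TokenRefresher` is a hypothetical helper, and `fetch_token` stands in for any zero-argument callable that returns a fresh token (e.g. a wrapper around `generate_database_credential`). The 50-minute default leaves a safety margin before the 1-hour expiry.

```python
import threading


class TokenRefresher:
    """Cache a short-lived credential and replace it on a timer.

    Hypothetical sketch; `fetch_token` is any zero-arg callable that
    returns a fresh token.
    """

    def __init__(self, fetch_token, refresh_seconds: float = 50 * 60):
        self._fetch_token = fetch_token
        self._refresh_seconds = refresh_seconds
        self._lock = threading.Lock()
        self._timer = None
        self.refresh()

    def refresh(self):
        new_token = self._fetch_token()
        with self._lock:
            self._token = new_token
        # Re-arm the timer so the token is replaced before it expires
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self._refresh_seconds, self.refresh)
        self._timer.daemon = True  # Do not block process shutdown
        self._timer.start()

    @property
    def token(self) -> str:
        with self._lock:
            return self._token
```

Connection pools should read `refresher.token` at checkout time rather than caching the password at startup, so new connections always use a live credential.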
### Databricks Apps Integration

Apps use environment variables for Lakebase configuration:

- `LAKEBASE_INSTANCE_NAME` / `LAKEBASE_DATABASE_NAME`: set automatically by Databricks Apps
- Use the `databricks apps add-resource` CLI to bind Lakebase to an app
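Reading those variables can look roughly like this; the fallback values are placeholders for local development, not Databricks defaults:

```python
import os

# Hypothetical sketch: prefer the app-provided settings, fall back to
# explicit placeholder values when running outside Databricks Apps.
instance_name = os.environ.get("LAKEBASE_INSTANCE_NAME", "my-lakebase-instance")
database_name = os.environ.get("LAKEBASE_DATABASE_NAME", "databricks_postgres")
```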
### MLflow Model Resources

Declare Lakebase as a model resource for automatic credential provisioning:

```python
from mlflow.models.resources import DatabricksLakebase

resources = [DatabricksLakebase(database_instance_name="my-lakebase-instance")]
```
### SDK Version Requirements

- `databricks-sdk >= 0.61.0` (0.81.0+ recommended for full API support)
- `psycopg >= 3.0` (supports `hostaddr` for the DNS workaround)
- SQLAlchemy 2.x with the `postgresql+psycopg` driver
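A connection string for that stack can be assembled as a keyword DSN; this is a hypothetical sketch where host, database, and user are placeholders, and the password is the short-lived OAuth token from `generate_database_credential`:

```python
# Hypothetical sketch: build a psycopg-style DSN for Lakebase.
def build_dsn(host: str, dbname: str, user: str, token: str) -> str:
    return (
        f"host={host} dbname={dbname} user={user} "
        f"password={token} sslmode=require"
    )


dsn = build_dsn(
    "instance.database.cloud.databricks.com",  # placeholder host
    "databricks_postgres",
    "user@example.com",
    "ephemeral-token",  # OAuth token; refresh it every ~50 minutes
)
```

With SQLAlchemy, the same fields go into `create_engine("postgresql+psycopg://", connect_args=...)` so the token can be supplied fresh on each connect.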
### Capacity Sizing

Lakebase Provisioned uses compute-unit sizing: `CU_1`, `CU_2`, `CU_4`, `CU_8`.

## Two-Layer Memory Architecture
| Layer | Component | Use Case | Lifecycle |
|---|---|---|---|
| Short-term | CheckpointSaver | Conversation continuity within session | Thread-based, session-scoped |
| Long-term | DatabricksStore | User preferences across sessions | User-based, persistent |
## Core Principles

### 1. Unity Catalog-Backed Storage

- All memory stored in Unity Catalog Delta tables
- Governed, auditable, queryable
- Automatic schema management via `.setup()`
- TTL-based cleanup for GDPR compliance

### 2. Graceful Degradation

- Memory is an optional enhancement, not a requirement
- Agent works without memory tables
- Silent fallback if tables don't exist
- No failures due to missing memory
## Quick Setup

### Short-Term Memory

```python
from agents.memory import ShortTermMemory

# Initialize
memory = ShortTermMemory(instance_name="my_lakebase")

# Setup tables (run once)
memory.setup()

# Use with LangGraph
with memory.get_checkpointer() as checkpointer:
    graph = workflow.compile(checkpointer=checkpointer)
```
### Thread ID Resolution

```python
# Priority: custom_inputs > conversation_id > new UUID
thread_id = ShortTermMemory.resolve_thread_id(
    custom_inputs=request.custom_inputs,
    conversation_id=context.conversation_id,
)

# Build LangGraph config
config = {"configurable": {"thread_id": thread_id}}
result = graph.invoke(messages, config=config)
```
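The priority chain behind `resolve_thread_id` can be sketched as a standalone function; this is an illustrative implementation, not the skill's actual code:

```python
import uuid


# Hypothetical sketch: an explicit thread_id in custom_inputs wins,
# then the platform-provided conversation_id, then a fresh UUID.
def resolve_thread_id(custom_inputs=None, conversation_id=None) -> str:
    if custom_inputs and custom_inputs.get("thread_id"):
        return custom_inputs["thread_id"]
    if conversation_id:
        return conversation_id
    return str(uuid.uuid4())
```

The new-UUID fallback means a request with no identifiers still gets a stable `thread_id` to return to the client for the next turn.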
### Long-Term Memory

```python
from agents.memory import LongTermMemory

# Initialize
memory = LongTermMemory(
    instance_name="my_lakebase",
    embedding_endpoint="databricks-gte-large-en",
    embedding_dims=1024,
)

# Setup tables (run once)
memory.setup()

# Save user preference
memory.save_memory(
    user_id="user@example.com",
    memory_key="preferred_workspace",
    memory_data={"workspace_id": "12345"},
)

# Search memories
results = memory.search_memories(
    user_id="user@example.com",
    query="What workspace does the user prefer?",
    limit=5,
)
```
## Common Mistakes to Avoid

### ❌ DON'T: Assume Tables Exist

```python
# BAD: Will fail if tables were not created
class Agent:
    def __init__(self):
        self.memory = ShortTermMemory()
        with self.memory.get_checkpointer() as checkpointer:
            self.graph = workflow.compile(checkpointer=checkpointer)
```
### ✅ DO: Graceful Degradation

```python
# GOOD: Falls back gracefully
class Agent:
    def __init__(self):
        try:
            self.memory = ShortTermMemory()
            with self.memory.get_checkpointer() as checkpointer:
                self.graph = workflow.compile(checkpointer=checkpointer)
        except Exception as e:
            print(f"⚠ Memory unavailable, using stateless mode: {e}")
            self.graph = workflow.compile()  # No checkpointer
```
### ❌ DON'T: Forget to Return thread_id

```python
# BAD: Client can't track the conversation
return {
    "choices": [{"message": {"content": response}}]
}
```
### ✅ DO: Return thread_id for Client Tracking

```python
# GOOD: Client can continue the conversation
return {
    "choices": [{"message": {"content": response}}],
    "custom_outputs": {
        "thread_id": thread_id,  # Return for the next turn
    },
}
```
### ❌ DON'T: Hardcode User IDs

```python
# BAD: Hardcoded user
memories = store.search_memories("hardcoded@example.com", query)
```
### ✅ DO: Extract from Context

```python
# GOOD: Dynamic user from request context
user_id = context.get("user_id") or custom_inputs.get("user_id") or "unknown"
memories = store.search_memories(user_id, query)
```
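That fallback chain is worth isolating in a helper so every entry point resolves the user the same way; this is a hypothetical sketch whose field names should match your request schema:

```python
# Hypothetical helper: context wins, then custom_inputs, then a
# sentinel default so memory lookups never crash on a missing user.
def resolve_user_id(context=None, custom_inputs=None, default="unknown"):
    context = context or {}
    custom_inputs = custom_inputs or {}
    return context.get("user_id") or custom_inputs.get("user_id") or default
```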
## Validation Checklist

Before deploying an agent with memory:

- Lakebase instance name configured in settings
- Setup script run once (creates tables)
- Short-term memory uses `CheckpointSaver` with a context manager
- Long-term memory uses `DatabricksStore` with embeddings
- Embedding endpoint configured (e.g., `databricks-gte-large-en`)
- Embedding dimensions match the model (1024 for GTE-large)
- Thread ID resolution: custom_inputs → conversation_id → new UUID
- User ID extracted from context/custom_inputs
- Graceful degradation if tables don't exist
- `thread_id` returned in `custom_outputs` for client tracking
- Memory tools created with `create_memory_tools()` if used autonomously
- MLflow tracing enabled for memory operations
## References

### Detailed Patterns

- Short-Term Memory - Complete `CheckpointSaver` implementation
- Long-Term Memory - Complete `DatabricksStore` implementation
- Graceful Degradation - Fallback patterns

### Setup Scripts

- Setup Lakebase - Table initialization script

### Official Documentation