---
name: model-configuration
description: Patterns for configuring LLM models on Letta agents via SDK/API. Covers model handles, settings, provider-specific configuration, and custom endpoints.
---

# Letta Model Configuration
## When to Use This Skill
Use this skill when:
- Creating agents with specific model configurations
- Adjusting model settings (temperature, max tokens, context window)
- Configuring provider-specific features (OpenAI reasoning, Anthropic thinking)
- Setting up custom OpenAI-compatible endpoints
- Changing models on existing agents
- Configuring embedding models for self-hosted deployments
**Not covered here:** model selection advice (which model to choose); see the agent-development skill's `references/model-recommendations.md`.
## Model Handles

Models use a `provider/model-name` format (a validation sketch follows the table):
| Provider | Handle Prefix | Example |
|---|---|---|
| OpenAI | `openai/` | `openai/gpt-4o`, `openai/gpt-4o-mini` |
| Anthropic | `anthropic/` | `anthropic/claude-sonnet-4-5-20250929` |
| Google AI | `google_ai/` | `google_ai/gemini-2.0-flash` |
| Azure OpenAI | `azure/` | `azure/gpt-4o` |
| AWS Bedrock | `bedrock/` | `bedrock/anthropic.claude-3-5-sonnet` |
| Groq | `groq/` | `groq/llama-3.3-70b-versatile` |
| Together | `together/` | `together/meta-llama/Llama-3-70b` |
| OpenRouter | `openrouter/` | `openrouter/anthropic/claude-3.5-sonnet` |
| Ollama (local) | `ollama/` | `ollama/llama3.2` |
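Because every handle follows the same shape, a quick sanity check before creating an agent can catch typos early. A minimal sketch in plain Python; the provider set mirrors the table above and is not an exhaustive Letta API:

```python
KNOWN_PROVIDERS = {
    "openai", "anthropic", "google_ai", "azure",
    "bedrock", "groq", "together", "openrouter", "ollama",
}

def split_handle(handle: str) -> tuple[str, str]:
    """Split 'provider/model-name' and validate the provider prefix."""
    provider, _, model_name = handle.partition("/")
    if not model_name or provider not in KNOWN_PROVIDERS:
        raise ValueError(f"Malformed model handle: {handle!r}")
    return provider, model_name

split_handle("openai/gpt-4o")                    # ("openai", "gpt-4o")
split_handle("together/meta-llama/Llama-3-70b")  # ("together", "meta-llama/Llama-3-70b")
```

Note that only the first `/` separates provider from model name; model names themselves may contain slashes, as in the Together and OpenRouter examples.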
## Basic Model Configuration

### Python

```python
from letta_client import Letta

client = Letta(api_key="your-api-key")

agent = client.agents.create(
    model="openai/gpt-4o",
    model_settings={
        "provider_type": "openai",  # Required - must match model provider
        "temperature": 0.7,
        "max_output_tokens": 4096,
    },
    context_window_limit=128000,
)
```
### TypeScript

```typescript
import Letta from "@letta-ai/letta-client";

const client = new Letta({ apiKey: "your-api-key" });

const agent = await client.agents.create({
  model: "openai/gpt-4o",
  model_settings: {
    provider_type: "openai", // Required - must match model provider
    temperature: 0.7,
    max_output_tokens: 4096,
  },
  context_window_limit: 128000,
});
```
## Common Settings

| Setting | Type | Description |
|---|---|---|
| `provider_type` | string | **Required.** Must match the model provider (`openai`, `anthropic`, `google_ai`, etc.) |
| `temperature` | float | Controls randomness (0.0-2.0). Lower = more deterministic. |
| `max_output_tokens` | int | Maximum tokens in the response. |
## Context Window Limit

Set at the agent level (not inside `model_settings`):

```python
agent = client.agents.create(
    model="anthropic/claude-sonnet-4-5-20250929",
    context_window_limit=200000,  # Use 200K of Claude's context
)
```
**Important:**

- Must be <= the model's maximum context size (see the clamp sketch after this list)
- Default: 32,000 tokens if not specified
- Larger windows increase latency and may reduce reliability
- When the context fills up, Letta automatically summarizes older messages
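Because the limit must not exceed the model's maximum, it can help to clamp a requested value before creating the agent. A minimal sketch; the maxima table below is hand-maintained and illustrative, not pulled from Letta:

```python
# Illustrative maxima; confirm real values in your provider's documentation.
MODEL_MAX_CONTEXT = {
    "openai/gpt-4o": 128_000,
    "anthropic/claude-sonnet-4-5-20250929": 200_000,
}

def clamp_context_window(model: str, requested: int, default: int = 32_000) -> int:
    """Return a context window that respects the model's maximum."""
    maximum = MODEL_MAX_CONTEXT.get(model)
    if maximum is None:
        return default  # unknown model: fall back to Letta's documented default
    return min(requested, maximum)

clamp_context_window("openai/gpt-4o", requested=200_000)  # -> 128000
```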
## Changing an Agent's Model

Update existing agents with `agents.update()`:

### Python

```python
# Change the model only
client.agents.update(
    agent_id=agent.id,
    model="anthropic/claude-sonnet-4-5-20250929",
)

# Change the model and settings
client.agents.update(
    agent_id=agent.id,
    model="openai/gpt-4o",
    model_settings={
        "provider_type": "openai",
        "temperature": 0.5,
    },
    context_window_limit=64000,
)
```
### TypeScript

```typescript
// Change the model only
await client.agents.update(agent.id, {
  model: "anthropic/claude-sonnet-4-5-20250929",
});

// Change the model and settings
await client.agents.update(agent.id, {
  model: "openai/gpt-4o",
  model_settings: {
    provider_type: "openai",
    temperature: 0.5,
  },
  context_window_limit: 64000,
});
```
**Note:** Agents retain their memory and tools when the model is changed.
## Provider-Specific Settings

For OpenAI reasoning models and Anthropic extended thinking, see `references/provider-settings.md`.
## Custom Endpoints

For OpenAI-compatible endpoints (vLLM, LM Studio, LocalAI), see `references/custom-endpoints.md`.
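As background for what "OpenAI-compatible" means in practice: such servers speak the standard OpenAI REST API, so the stock `openai` client can exercise them directly. A minimal sketch against a local vLLM server; the URL and model name are placeholders, and this is generic client usage, not Letta's endpoint configuration (see the reference above for that):

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible server.
# Both the base_url and the model name below are placeholders.
local = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = local.chat.completions.create(
    model="meta-llama/Llama-3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```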
## Embedding Models

Required for self-hosted deployments (Letta Cloud handles embeddings automatically):

```python
agent = client.agents.create(
    model="openai/gpt-4o",
    embedding="openai/text-embedding-3-small",
)
```
Common embedding models:

- `openai/text-embedding-3-small` (recommended)
- `openai/text-embedding-3-large`
- `openai/text-embedding-ada-002`
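Since only self-hosted deployments need an explicit embedding handle, one option is to make it configurable per environment. A minimal sketch; the `LETTA_EMBEDDING_HANDLE` variable name is made up for illustration:

```python
import os

# Hypothetical environment variable; defaults to the recommended model.
embedding_handle = os.environ.get(
    "LETTA_EMBEDDING_HANDLE", "openai/text-embedding-3-small"
)

agent = client.agents.create(
    model="openai/gpt-4o",
    embedding=embedding_handle,
)
```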
## Anti-Hallucination Checklist

Before configuring models, verify (a runnable sketch of these checks follows the list):

- Model handle uses the correct `provider/model-name` format
- `model_settings` includes the required `provider_type` field
- `context_window_limit` is set at the agent level, not in `model_settings`
- Provider-specific settings use the correct nested structure (see references)
- For self-hosted deployments: an embedding model is specified
- Temperature is within the valid range (0.0-2.0)
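These checks can be mechanized before a config ever reaches the API. A minimal sketch in plain Python, written against the conventions above rather than any Letta validation API; the assumption that `provider_type` equals the handle's prefix holds for every example in this skill but should be double-checked against the references for less common providers:

```python
def validate_config(model: str, model_settings: dict) -> list[str]:
    """Return a list of problems; an empty list means the config passes."""
    problems = []
    provider, _, model_name = model.partition("/")
    if not model_name:
        problems.append("model handle must use provider/model-name format")
    if "provider_type" not in model_settings:
        problems.append("model_settings is missing the required provider_type")
    elif model_name and model_settings["provider_type"] != provider:
        # Assumption: provider_type matches the handle's prefix, as in the
        # examples above.
        problems.append("provider_type does not match the handle's prefix")
    temperature = model_settings.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be within 0.0-2.0")
    if "context_window_limit" in model_settings:
        problems.append("context_window_limit belongs at the agent level")
    return problems

assert validate_config("openai/gpt-4o", {"provider_type": "openai", "temperature": 0.7}) == []
```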
## Example Scripts

See `scripts/` for runnable examples:

- `scripts/basic_config.py` - Basic model configuration
- `scripts/basic_config.ts` - TypeScript equivalent
- `scripts/change_model.py` - Changing models on existing agents
- `scripts/provider_specific.py` - OpenAI reasoning, Anthropic thinking