# AI Chat Studio
Part of Agent Skills™ by googleadsagent.ai™
## Description
AI Chat Studio provides a multi-LLM chat orchestration framework with 300+ assistant presets, intelligent model routing, and conversation management. The agent configures and manages interactions across multiple language model providers (OpenAI, Anthropic, Google, and local open-source models via Ollama), selecting the optimal model for each task based on capability, cost, and latency requirements.
Not every task needs the most powerful model. A code review benefits from a reasoning-heavy model; a translation task runs well on a mid-tier model; a simple reformatting task wastes money on anything beyond a fast, cheap model. This skill implements intelligent routing that matches task characteristics to model capabilities, reducing cost by 40-60% while maintaining quality where it matters.
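The cost gap that routing exploits can be made concrete with a back-of-the-envelope calculation using the per-1k-token prices from the model registry in the Implementation section below. The token counts here are illustrative, not benchmark figures:

```typescript
// Cost of a single request = input tokens + output tokens, each billed
// at the model's per-1k-token rate. Token counts below are illustrative.
function requestCost(inTok: number, outTok: number, in1k: number, out1k: number): number {
  return (inTok / 1000) * in1k + (outTok / 1000) * out1k;
}

// A 2,000-token-input / 500-token-output request:
const sonnet = requestCost(2000, 500, 0.003, 0.015);  // frontier model
const mini = requestCost(2000, 500, 0.00015, 0.0006); // mid-tier model

console.log(sonnet.toFixed(4)); // 0.0135
console.log(mini.toFixed(4));   // 0.0006
```

Sending a reformatting task to the mid-tier model here costs roughly 1/20th as much, which is why routing even a fraction of traffic downward moves the overall bill substantially.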
The 300+ assistant presets encode domain-specific system prompts, temperature settings, and output format constraints for common tasks: code generation, technical writing, data analysis, creative ideation, customer support, legal review, and more. Each preset is tested against a quality benchmark and tagged with the models it performs best on.
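A preset is a small structured record. The sketch below uses the `AssistantPreset` shape from the Implementation section; the id, prompt text, and values are illustrative, not one of the shipped presets:

```typescript
// Hypothetical preset entry; id, prompt, and values are illustrative.
interface AssistantPreset {
  id: string;
  name: string;
  systemPrompt: string;
  temperature: number;
  preferredModels: string[];
  tags: string[];
}

const codeReviewPreset: AssistantPreset = {
  id: "code-review",
  name: "Code Reviewer",
  systemPrompt: "You are a senior engineer reviewing a pull request. " +
    "Flag bugs, security issues, and style problems; cite line numbers.",
  temperature: 0.2, // low temperature for focused, deterministic review
  preferredModels: ["claude-sonnet-4-20250514"],
  tags: ["code", "review", "engineering"],
};
```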
## Use When
- Configuring multi-provider LLM access in an application
- Routing tasks to the optimal model by cost-quality trade-off
- Managing conversation history and context windows
- Deploying domain-specific AI assistants with curated presets
- Building chat interfaces with streaming responses
- Comparing model outputs for the same prompt across providers
## How It Works
```mermaid
graph TD
    A[User Message] --> B[Task Classifier]
    B --> C{Task Type}
    C -->|Complex Reasoning| D[Claude 4 / GPT-4o]
    C -->|Code Generation| E[Claude 4 / Codestral]
    C -->|Translation| F[GPT-4o-mini / Gemini Flash]
    C -->|Simple Format| G[Haiku / Flash]
    D --> H[Apply Preset: System Prompt + Params]
    E --> H
    F --> H
    G --> H
    H --> I[Manage Context Window]
    I --> J[Stream Response]
    J --> K[Log Usage + Cost]
```
The task classifier analyzes the incoming message to determine complexity and domain, then routes to the most cost-effective model capable of handling it. Presets provide domain-specific system prompts and parameter tuning.
## Implementation
```typescript
interface ModelConfig {
  provider: "openai" | "anthropic" | "google" | "ollama";
  model: string;
  maxTokens: number;
  costPer1kInput: number;
  costPer1kOutput: number;
  capabilities: string[];
}

const MODEL_REGISTRY: ModelConfig[] = [
  { provider: "anthropic", model: "claude-sonnet-4-20250514", maxTokens: 8192,
    costPer1kInput: 0.003, costPer1kOutput: 0.015, capabilities: ["reasoning", "code", "analysis"] },
  { provider: "openai", model: "gpt-4o-mini", maxTokens: 4096,
    costPer1kInput: 0.00015, costPer1kOutput: 0.0006, capabilities: ["general", "translation", "format"] },
  { provider: "google", model: "gemini-2.0-flash", maxTokens: 8192,
    costPer1kInput: 0.0001, costPer1kOutput: 0.0004, capabilities: ["general", "fast", "multimodal"] },
];

interface AssistantPreset {
  id: string;
  name: string;
  systemPrompt: string;
  temperature: number;
  preferredModels: string[];
  tags: string[];
}

class ChatRouter {
  constructor(private models: ModelConfig[], private presets: Map<string, AssistantPreset>) {}

  route(message: string, presetId?: string): { model: ModelConfig; preset?: AssistantPreset } {
    const preset = presetId ? this.presets.get(presetId) : undefined;
    const taskType = this.classifyTask(message);

    // Keep only models that cover at least one required capability.
    let candidates = this.models.filter(m =>
      m.capabilities.some(c => taskType.requiredCapabilities.includes(c))
    );
    // If the preset names preferred models, narrow to those when possible.
    if (preset && preset.preferredModels.length > 0) {
      const preferred = candidates.filter(m => preset.preferredModels.includes(m.model));
      if (preferred.length > 0) candidates = preferred;
    }
    // Pick the cheapest capable model; sort a copy to avoid mutating the
    // registry, and fall back to the first registered model if no candidate
    // matched rather than returning undefined.
    const selected = [...candidates].sort((a, b) => a.costPer1kInput - b.costPer1kInput)[0]
      ?? this.models[0];
    return { model: selected, preset };
  }

  private classifyTask(message: string): { type: string; requiredCapabilities: string[] } {
    const lower = message.toLowerCase();
    if (lower.includes("debug") || lower.includes("refactor") || lower.includes("architect"))
      return { type: "complex", requiredCapabilities: ["reasoning", "code"] };
    if (lower.includes("translate") || lower.includes("rewrite"))
      return { type: "simple", requiredCapabilities: ["general", "translation"] };
    return { type: "general", requiredCapabilities: ["general"] };
  }
}

class ConversationManager {
  private history: Array<{ role: string; content: string }> = [];

  constructor(private maxContextTokens: number = 100_000) {}

  addMessage(role: string, content: string): void {
    this.history.push({ role, content });
    this.trimToContextWindow();
  }

  getHistory(): Array<{ role: string; content: string }> {
    return [...this.history];
  }

  // Drop the oldest non-system messages until the conversation fits;
  // index 0 is assumed to hold the system prompt and is never evicted.
  private trimToContextWindow(): void {
    while (this.estimateTokens() > this.maxContextTokens && this.history.length > 2) {
      this.history.splice(1, 1);
    }
  }

  // Rough heuristic: ~4 characters per token for English text.
  private estimateTokens(): number {
    return this.history.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
  }
}
```
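The final step in the flow diagram, "Log Usage + Cost", is not shown above. A minimal in-memory sketch follows; the field names and storage choice are assumptions, and a production version would persist records instead:

```typescript
// Minimal usage/cost logger for the "Log Usage + Cost" step in the flow.
// Field names and the in-memory store are illustrative assumptions.
interface UsageRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  timestamp: number;
}

class UsageLogger {
  private records: UsageRecord[] = [];

  log(model: string, inputTokens: number, outputTokens: number,
      costPer1kInput: number, costPer1kOutput: number): UsageRecord {
    const record: UsageRecord = {
      model,
      inputTokens,
      outputTokens,
      costUsd: (inputTokens / 1000) * costPer1kInput +
               (outputTokens / 1000) * costPer1kOutput,
      timestamp: Date.now(),
    };
    this.records.push(record);
    return record;
  }

  // Total spend per model, useful for tuning routing rules over time.
  totalByModel(): Map<string, number> {
    const totals = new Map<string, number>();
    for (const r of this.records) {
      totals.set(r.model, (totals.get(r.model) ?? 0) + r.costUsd);
    }
    return totals;
  }
}

const logger = new UsageLogger();
logger.log("gpt-4o-mini", 2000, 500, 0.00015, 0.0006);
logger.log("gpt-4o-mini", 1000, 200, 0.00015, 0.0006);
console.log(logger.totalByModel().get("gpt-4o-mini")); // total spend in USD
```

Feeding these totals back into the routing table is what makes the cost-quality trade-off tunable rather than static.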
## Best Practices
- Route simple tasks to cheaper models—80% of queries do not need frontier models
- Implement streaming responses for all chat interactions to improve perceived latency
- Trim conversation history from the middle, preserving the system prompt and recent messages
- Log model selection decisions alongside cost to optimize routing rules over time
- Test presets against a benchmark dataset before deploying to production
- Provide fallback models for every route in case the primary provider is unavailable
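The fallback practice above can be sketched as an ordered-retry wrapper. `callModel` is a hypothetical stand-in for a real provider SDK call, not an API from this framework:

```typescript
// Ordered fallback across providers: try each model in turn and return the
// first success. `callModel` is a hypothetical stand-in for a real SDK call.
type CallFn = (model: string, prompt: string) => Promise<string>;

async function completeWithFallback(
  models: string[], // primary first, fallbacks after
  prompt: string,
  callModel: CallFn,
): Promise<{ model: string; text: string }> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return { model, text: await callModel(model, prompt) };
    } catch (err) {
      lastError = err; // provider down or rate-limited; try the next model
    }
  }
  throw lastError ?? new Error("no models configured");
}
```

Keeping the fallback list ordered by preference means the router's cost logic still applies: the cheap primary is tried first, and the pricier backup is only paid for when the primary is unavailable.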
## Platform Compatibility
| Platform | Support | Notes |
|---|---|---|
| Cursor | Full | Multi-model configuration |
| VS Code | Full | Extension-based LLM access |
| Windsurf | Full | Built-in model routing |
| Claude Code | Full | Multi-provider support |
| Cline | Full | Model selection config |
| aider | Full | Multiple model backends |
## Related Skills
## Keywords
ai-chat multi-llm model-routing assistant-presets conversation-management streaming cost-optimization chat-studio
© 2026 googleadsagent.ai™ | Agent Skills™ | MIT License