accelerate
MANDATORY PREPARATION
Invoke /agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run /teach-maestro first. Consult the context-management reference in the agent-workflow skill for window optimization and budget strategies.
Make the workflow faster and cheaper without sacrificing quality. Measure before and after.
Performance Audit
Measure current performance and record these baseline metrics before changing anything:
Latency (p50): ___ms
Latency (p95): ___ms
Cost per request: $___
Token usage (avg): ___ input / ___ output
Error rate: ___%
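The latency percentiles above can be computed from raw samples. A minimal sketch using the nearest-rank method (assumes you have already collected per-request latencies in milliseconds):

```python
def latency_percentiles(samples_ms):
    """Compute p50/p95 from a list of request latencies in milliseconds."""
    ordered = sorted(samples_ms)

    def pct(p):
        # nearest-rank percentile: pick the sample at the p-th percentile rank
        idx = max(0, int(round(p / 100 * len(ordered))) - 1)
        return ordered[idx]

    return {"p50": pct(50), "p95": pct(95)}
```

Record these numbers before and after each optimization so improvements are attributable to specific changes.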
Acceleration Strategies
Reduce Token Usage
- Shorten system prompts (remove redundant instructions)
- Compress few-shot examples to minimum viable length
- Use structured output schemas instead of verbose text
- Summarize context instead of passing raw documents
- Reduce output length requirements
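To make the "structured schemas over verbose text" point concrete, here is a hypothetical illustration: the two instruction strings and the ~4-characters-per-token heuristic are assumptions for demonstration, not measurements from a real tokenizer.

```python
# Hypothetical example: a verbose prose instruction vs. a compact
# schema instruction that elicits the same structured answer.
VERBOSE = (
    "Please respond with the sentiment of the review. Explain whether it is "
    "positive, negative, or neutral, and also give a confidence score between "
    "zero and one, describing your reasoning in full sentences."
)
COMPACT = 'Return JSON: {"sentiment": "pos|neg|neu", "confidence": 0.0-1.0}'


def rough_tokens(text):
    # crude heuristic: ~4 characters per token (an assumption, not a tokenizer)
    return len(text) // 4
```

Verify savings with your provider's real tokenizer; the heuristic only shows the direction of the change.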
Model Cascading
- Route simple tasks to cheaper/faster models
- Escalate only complex tasks to capable models
- Use classification to determine complexity
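A cascading router can be sketched as a classification step in front of model selection. The model names, keyword list, and length threshold below are placeholders; a production router would typically use a small classifier model rather than keywords.

```python
def route_model(task: str) -> str:
    """Hypothetical router: send short, simple tasks to a cheap model,
    escalate everything else. Model names are placeholders."""
    SIMPLE_KEYWORDS = ("classify", "extract", "yes/no")
    is_simple = len(task) < 200 and any(k in task.lower() for k in SIMPLE_KEYWORDS)
    return "cheap-fast-model" if is_simple else "capable-model"
```

Track the router's escalation rate: if nearly everything escalates, the cascade adds latency without saving cost.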
Caching
- Cache responses for identical or near-identical inputs
- Cache tool results with appropriate TTL
- Cache embeddings for frequently-queried documents
- Use semantic caching for similar (not identical) queries
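A minimal sketch of TTL-based caching for tool results (exact-match keys only; semantic caching would require an embedding-similarity lookup on top of this, and this sketch is not thread-safe):

```python
import time


class TTLCache:
    """Minimal TTL cache for tool results (sketch; not thread-safe)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() > expires:
            # entry is stale: evict and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```

Choose the TTL per tool: a weather lookup may tolerate minutes of staleness, a stock quote may tolerate none.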
Parallelization
- Run independent tool calls in parallel
- Run independent agent steps in parallel
- Use streaming to start processing before full response
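Independent calls can fan out concurrently instead of running in sequence. A sketch using `asyncio.gather`, where `call_tool` is a stand-in for any independent tool call:

```python
import asyncio


async def call_tool(name: str) -> str:
    # stand-in for an independent tool call (e.g. search, database lookup)
    await asyncio.sleep(0.01)
    return f"{name}:done"


async def run_parallel(names):
    # independent calls run concurrently; total wall time is roughly the
    # slowest call, not the sum of all calls
    return await asyncio.gather(*(call_tool(n) for n in names))


results = asyncio.run(run_parallel(["search", "lookup", "fetch"]))
```

Only parallelize calls that are truly independent; anything whose input depends on another call's output must stay sequential.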
Context Optimization
- Retrieve less, retrieve better (improve retrieval precision)
- Use context compression techniques
- Implement sliding window for long conversations
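The sliding-window idea can be sketched as keeping the system message plus only the most recent turns (the message shape assumes role/content dicts; adjust to your provider's format):

```python
def sliding_window(messages, max_turns=6, keep_system=True):
    """Keep the system message plus the last `max_turns` turns (sketch)."""
    system = [m for m in messages if m["role"] == "system"] if keep_system else []
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]
```

Pair this with summarization of the dropped turns if earlier context still matters to the task.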
Acceleration Report
For each optimization:
- What changed: Specific modification
- Before: Latency/cost/tokens before
- After: Latency/cost/tokens after
- Quality impact: Any quality change (verify with golden tests)
- Trade-off: What was sacrificed for the improvement
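One way to keep these report entries consistent is a small record type; the field names below mirror the report structure above and are only a suggested shape:

```python
from dataclasses import dataclass


@dataclass
class OptimizationRecord:
    change: str          # what changed
    before_ms: float     # latency before
    after_ms: float      # latency after
    quality_delta: str   # quality impact (from golden tests)
    trade_off: str       # what was sacrificed

    @property
    def latency_improvement_pct(self) -> float:
        return round(100 * (self.before_ms - self.after_ms) / self.before_ms, 1)
```

The same shape extends naturally to cost and token fields if those are your primary metrics.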
Acceleration Checklist
- Baseline metrics recorded before any changes
- Each optimization measured with before/after comparison
- Quality impact verified (golden tests still pass)
- Trade-offs documented for each change
- Cost/latency improvements quantified
Recommended Next Step
After optimization, run /evaluate to verify quality didn't degrade, or /iterate to set up continuous monitoring.
NEVER:
- Optimize without measuring first (you need a baseline)
- Sacrifice quality for speed without explicit user approval
- Cache outputs that depend on real-time data
- Skip the quality check after optimization
- Optimize prematurely (make it correct first, then make it fast)