reflect
MANDATORY PREPARATION
Invoke /agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run /teach-maestro first.
Analyze the Maestro audit trail and decision log to produce a skill-effectiveness scorecard. This tells you which commands work, which fail, and where your workflow needs attention.
Data Sources
Read these files from the project root:
- .maestro/audit.jsonl: every command invocation with duration, cost, and outcome
- .maestro/decisions.jsonl: decisions made with outcomes and next steps
If neither file exists, respond: "No audit data found. Run commands with Maestro to start tracking, then come back."
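A minimal sketch of the loading step, assuming each line of both files is one standalone JSON object. The helper names `load_jsonl` and `load_sources` are hypothetical, and skipping malformed lines is a design choice of this sketch rather than documented Maestro behavior.

```python
import json
from pathlib import Path

NO_DATA_MESSAGE = (
    "No audit data found. Run commands with Maestro to start tracking, "
    "then come back."
)


def load_jsonl(path):
    """Return parsed records from a JSONL file, or None if the file is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    records = []
    with p.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                # Assumption: skip malformed lines instead of aborting the analysis.
                continue
    return records


def load_sources(root="."):
    """Load both logs from <root>/.maestro; each entry is None when its file is absent."""
    base = Path(root) / ".maestro"
    return load_jsonl(base / "audit.jsonl"), load_jsonl(base / "decisions.jsonl")


if __name__ == "__main__":
    audit, decisions = load_sources(".")
    if audit is None and decisions is None:
        print(NO_DATA_MESSAGE)
```

Returning `None` for a missing file (as opposed to an empty list) keeps "file absent" distinguishable from "file present but empty", which matters for the no-data response above.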
Analysis Dimensions
1. Usage Frequency
- Which commands run most/least?
- Are any commands never used? (candidates for removal)
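The two questions above can be sketched as a simple tally. This assumes each audit record carries a `command` field (a hypothetical field name; the log schema is not specified here) and that the caller supplies the list of installed commands, since the log alone cannot reveal commands that never ran.

```python
from collections import Counter


def usage_frequency(records, known_commands=()):
    """Count invocations per command and list known commands that never ran.

    Assumes a hypothetical "command" field on each record.
    """
    counts = Counter(r["command"] for r in records if "command" in r)
    never_used = sorted(set(known_commands) - set(counts))
    return counts, never_used
```

`counts.most_common()` gives the ranking from most to least used; `never_used` holds the removal candidates.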
2. Completion Rate
- What % of invocations complete successfully?
- Which commands fail most often?
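One way to compute these rates, as a sketch: it assumes hypothetical `command` and `outcome` fields and treats the value `"success"` as the marker of a completed run. Adjust `success_value` to whatever the log actually records.

```python
from collections import defaultdict


def completion_rates(records, success_value="success"):
    """Return (overall %, {command: %}) of successful invocations.

    Field names "command" and "outcome" are assumptions about the log schema.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for r in records:
        cmd = r.get("command")
        if cmd is None:
            continue
        totals[cmd] += 1
        if r.get("outcome") == success_value:
            successes[cmd] += 1
    per_command = {cmd: 100.0 * successes[cmd] / totals[cmd] for cmd in totals}
    total = sum(totals.values())
    # None (not 0%) when there is nothing to measure, so no metric is invented.
    overall = 100.0 * sum(successes.values()) / total if total else None
    return overall, per_command
```

Sorting `per_command` by value in ascending order surfaces the commands that fail most often.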
3. Command Flow
- What are the most common command sequences (A → B)?
- Which commands lead to follow-ups vs. abandonment?
- Abandonment rate per command (no follow-up within 30 min)
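The flow analysis above can be sketched as a pass over time-ordered records. This assumes hypothetical `command` and `timestamp` fields, with timestamps in an ISO 8601 form that `datetime.fromisoformat` accepts. Under this sketch's definition, the final invocation in the log has no follow-up and therefore counts as abandoned; excluding it is an equally defensible choice.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # follow-up window from the abandonment definition


def command_flow(records, window=WINDOW):
    """Return (transition counts, abandonment % per command).

    Assumes hypothetical "command" and "timestamp" fields on every record.
    """
    events = sorted(
        (datetime.fromisoformat(r["timestamp"]), r["command"]) for r in records
    )
    transitions = Counter()
    runs = Counter()
    abandoned = Counter()
    for i, (ts, cmd) in enumerate(events):
        runs[cmd] += 1
        nxt = events[i + 1] if i + 1 < len(events) else None
        if nxt is not None and nxt[0] - ts <= window:
            transitions[(cmd, nxt[1])] += 1  # A -> B within the window
        else:
            abandoned[cmd] += 1  # no follow-up within the window
    rates = {cmd: 100.0 * abandoned[cmd] / runs[cmd] for cmd in runs}
    return transitions, rates
```

`transitions.most_common()` lists the strongest A → B sequences; `rates` gives the per-command abandonment percentage.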
4. Cost Distribution
- Total estimated cost across all commands
- Cost per command (average)
- Most/least expensive commands
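A rough sketch of the cost breakdown, assuming a hypothetical numeric `cost` field in dollars on each record. Records without a usable cost are left out rather than treated as zero, and the formatter always applies the approximate marker.

```python
from collections import defaultdict


def cost_distribution(records):
    """Return (total, {command: average cost}, commands ranked cheapest first).

    Assumes hypothetical "command" and numeric "cost" fields.
    """
    per_command = defaultdict(list)
    for r in records:
        if "command" in r and isinstance(r.get("cost"), (int, float)):
            per_command[r["command"]].append(float(r["cost"]))
    total = sum(sum(v) for v in per_command.values())
    averages = {cmd: sum(v) / len(v) for cmd, v in per_command.items()}
    ranked = sorted(averages, key=averages.get)
    return total, averages, ranked


def format_cost(amount):
    """Render a cost as an estimate, always prefixed with '~'."""
    return f"~${amount:.2f}"
```

The first and last entries of `ranked` are the least and most expensive commands by average cost per run.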
5. Duration Analysis
- Average duration per command
- Outliers (unusually slow invocations)
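"Unusually slow" is not defined above, so the sketch below picks one possible rule: flag any invocation that took more than three times the median duration of its own command. Both that threshold and the `duration_s` field name (duration in seconds) are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def duration_analysis(records, factor=3.0):
    """Return ({command: average duration}, [(command, duration) outliers]).

    Assumes hypothetical "command" and numeric "duration_s" fields.
    An outlier is a run slower than factor x the command's median duration.
    """
    per_command = defaultdict(list)
    for r in records:
        if "command" in r and isinstance(r.get("duration_s"), (int, float)):
            per_command[r["command"]].append(r["duration_s"])
    averages = {cmd: mean(v) for cmd, v in per_command.items()}
    outliers = [
        (cmd, d)
        for cmd, v in per_command.items()
        for d in v
        if d > factor * median(v)
    ]
    return averages, outliers
```

A median-based rule is used because a single very slow run shifts the mean far more than the median, which would otherwise hide the outlier it is meant to find.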
Output Format
╔══════════════════════════════════════════╗
║         MAESTRO EFFECTIVENESS            ║
╠══════════════════════════════════════════╣
║  Commands Run       __ (__ unique)       ║
║  Completion Rate    __%                  ║
║  Most Used          /_____ (__×)         ║
║  Most Abandoned     /_____ (__% ⚠️)      ║
║  Avg Duration       __s                  ║
║  Total Cost         ~$__.__              ║
╠══════════════════════════════════════════╣
║         STRONGEST PIPELINES              ║
╠══════════════════════════════════════════╣
║  /_____ → /_____              __×        ║
║  /_____ → /_____              __×        ║
╠══════════════════════════════════════════╣
║         COST PER COMMAND                 ║
╠══════════════════════════════════════════╣
║  /_____   $__.__/run  ████░░  avg        ║
║  /_____   $__.__/run  █░░░░░  cheap      ║
║  /_____   $__.__/run  █████░  costly     ║
╚══════════════════════════════════════════╝
INSIGHTS:
1. [Data-driven observation with recommended action]
2. [Data-driven observation with recommended action]
3. [Data-driven observation with recommended action]
Insights Rules
Every insight MUST:
- Reference specific data (e.g., "40% abandonment rate")
- Suggest a specific Maestro command to address it
- Distinguish correlation from causation
Reflection Checklist
- All 5 analysis dimensions covered
- Scorecard generated with real data
- Insights are data-driven, not speculative
- Cost estimates labeled as approximate (~)
- Recommended actions reference specific Maestro commands
Recommended Next Step
After reflecting, run /streamline to remove unused commands, or /refine on the most-abandoned command to improve its prompt quality.
NEVER:
- Require audit data to exist — degrade gracefully
- Invent metrics beyond what the logs contain
- Show cost data without the "estimate" disclaimer (~)
- Make judgments without evidence (say "100% completion rate" not "works great")
- Compare across projects — reflect is project-scoped