fortify
MANDATORY PREPARATION
Invoke /agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run /teach-maestro first. Consult the guardrails-safety reference in the agent-workflow skill for defense-in-depth patterns and error boundary design.
Make the workflow resilient. Every external call will fail eventually — model APIs, tools, databases, third-party services. Fortify ensures the workflow handles failure gracefully.
Fortification Layers
Layer 1: Input Validation
- Validate all inputs before processing
- Return clear error messages for invalid input
- Set size limits on all input fields
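The three checks above can be sketched as a single validation function. This is a minimal illustration, not part of Maestro itself: the payload shape (`prompt`, `user_id`), the `ValidationError` type, and the 8,000-character limit are all assumptions chosen for the example.

```python
MAX_PROMPT_CHARS = 8_000  # assumed limit; tune per workflow


class ValidationError(ValueError):
    """Raised with a message that is safe to show to the user."""


def validate_request(payload):
    """Validate a hypothetical {'prompt': str, 'user_id': str} payload."""
    if not isinstance(payload, dict):
        raise ValidationError("Request body must be a JSON object.")

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValidationError("Field 'prompt' is required and must be non-empty text.")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValidationError(
            f"Field 'prompt' is {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}."
        )

    user_id = payload.get("user_id")
    if not isinstance(user_id, str) or not (1 <= len(user_id) <= 64):
        raise ValidationError("Field 'user_id' must be a string of 1 to 64 characters.")

    # Return a cleaned copy so downstream steps never see unvalidated data.
    return {"prompt": prompt.strip(), "user_id": user_id}
```

Rejecting bad input before any model or tool call keeps invalid requests from consuming retries and timeouts further down the pipeline.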
Layer 2: Retry with Backoff
For transient failures (network errors, rate limits, timeouts):
Retry strategy:
  max_retries: 3
  initial_delay: 1s
  backoff_multiplier: 2
  max_delay: 30s
  retryable_errors: [429, 500, 502, 503, 504, TIMEOUT, CONNECTION_ERROR]
  non_retryable_errors: [400, 401, 403, 404]
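One way to implement the retry strategy above is sketched below. The `ServiceError` type and the `call_with_retry` helper are hypothetical names introduced for illustration; the optional `jitter` flag is an addition that the configuration above does not mention, and it is off by default so the delays match the listed values.

```python
import random
import time

RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class ServiceError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""

    def __init__(self, status, message=""):
        super().__init__(message or f"status {status}")
        self.status = status


def backoff_delays(max_retries=3, initial_delay=1.0, multiplier=2.0, max_delay=30.0):
    """Delay before each retry; with the defaults this is 1s, 2s, 4s."""
    return [min(initial_delay * multiplier ** i, max_delay) for i in range(max_retries)]


def call_with_retry(fn, *, max_retries=3, initial_delay=1.0, multiplier=2.0,
                    max_delay=30.0, jitter=False, sleep=time.sleep):
    """Call fn(); retry transient failures with exponential backoff."""
    delays = backoff_delays(max_retries, initial_delay, multiplier, max_delay)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ServiceError as exc:
            # 400/401/403/404 and other non-listed codes are re-raised at once.
            if exc.status not in RETRYABLE_STATUS or attempt == max_retries:
                raise
        except (TimeoutError, ConnectionError):
            if attempt == max_retries:
                raise
        delay = delays[attempt]
        if jitter:
            # Assumption: "full jitter" spreads out clients that retry together.
            delay = random.uniform(0, delay)
        sleep(delay)
```

Passing `sleep` as a parameter is a design choice that lets tests record delays instead of actually waiting.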
Layer 3: Fallback Responses
When retries are exhausted:
- Use a cached previous response (if applicable)
- Use a simpler/cheaper model as fallback
- Return a graceful degradation response
- Escalate to human review
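The fallback options above can be arranged as an ordered chain. The sketch below assumes each layer is a zero-argument callable; `with_fallbacks` and the layer names are hypothetical, and the returned source name is one possible way to flag responses that should be escalated to human review.

```python
import logging

logger = logging.getLogger("fortify.fallback")


def with_fallbacks(primary, fallbacks, degraded_response):
    """Try primary, then each (name, fn) fallback in order.

    Returns (result, source_name) so callers can tell which layer answered.
    """
    for name, fn in [("primary", primary), *fallbacks]:
        try:
            return fn(), name
        except Exception as exc:  # broad on purpose: any failure moves to the next layer
            logger.warning("layer %r failed: %s", name, exc)
    # Every layer failed: return the graceful degradation response.
    return degraded_response, "degraded"
```

A caller might pass `[("cache", read_cache), ("small_model", call_small_model)]` as `fallbacks` (both hypothetical functions) and queue the request for human review whenever the source comes back as `"degraded"`.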
Layer 4: Circuit Breakers
When a service is consistently failing:
Circuit breaker:
  failure_threshold: 5 consecutive failures
  state: CLOSED → OPEN (after threshold) → HALF_OPEN (after cooldown)
  cooldown: 60 seconds
  half_open_max_requests: 1
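The state machine above can be sketched as a small class. This is a single-threaded illustration under stated assumptions: the `CircuitBreaker` and `CircuitOpenError` names are hypothetical, and a concurrent workflow would need a lock around the state changes to enforce the single half-open request.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock  # injectable so tests can control time
        self.state = "CLOSED"
        self.failures = 0  # consecutive failures since the last success
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("service unavailable; circuit is open")
            self.state = "HALF_OPEN"  # cooldown elapsed: allow one trial request
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial request, or reaching the threshold, opens the circuit.
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.state = "CLOSED"
        self.failures = 0
        return result
```

While the circuit is open, callers fail fast with `CircuitOpenError` and can go straight to a fallback instead of waiting on a service that is known to be down.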
Layer 5: Timeout Controls
Every external call needs a timeout:
- Model API calls: 30-120s depending on task
- Tool executions: 10-60s depending on tool
- Database queries: 5-15s
- Third-party APIs: 10-30s
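Where a client library has no timeout parameter of its own, a generic wrapper is one option. The sketch below uses a thread pool from the standard library; the `TIMEOUTS` table takes the upper bounds listed above as defaults, and `call_with_timeout` and `CallTimedOut` are hypothetical names. Note the limitation: the caller stops waiting, but a thread that is already running is not interrupted, so a library's native timeout setting is preferable when one exists.

```python
import concurrent.futures
import time

# Default budgets in seconds, taken from the upper bounds listed above.
TIMEOUTS = {"model_api": 120, "tool": 60, "database": 15, "third_party_api": 30}

_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


class CallTimedOut(Exception):
    """Raised when an external call exceeds its time budget."""


def call_with_timeout(fn, kind, timeout=None):
    """Run fn() in a worker thread and stop waiting after the budget for `kind`."""
    budget = TIMEOUTS[kind] if timeout is None else timeout
    future = _pool.submit(fn)
    try:
        return future.result(timeout=budget)
    except concurrent.futures.TimeoutError:
        # cancel() only helps if the call has not started yet;
        # a call that is already running keeps its worker thread busy.
        future.cancel()
        raise CallTimedOut(f"{kind} call exceeded {budget}s") from None
```

For example, `call_with_timeout(lambda: run_query(sql), "database")` would give a hypothetical `run_query` a 15-second budget.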
Fortification Audit
For each component, verify:
- Input validation present
- Retry logic for transient failures
- Fallback for when retries fail
- Timeout set
- Error logged with context
- User gets a meaningful error (not a stack trace)
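The last two audit items can be handled together at an error boundary. The following is a minimal sketch, assuming Python's standard `logging` module; the `handle_failure` helper, the response shape, and the reference ID scheme are illustrative assumptions.

```python
import logging
import uuid

logger = logging.getLogger("fortify.errors")


def handle_failure(exc, *, component, request_id=None):
    """Log full detail for operators; return a short, safe message for the user."""
    ref = request_id or uuid.uuid4().hex[:8]
    # exc_info attaches the traceback to the log record, not to the user response.
    logger.error("component=%s ref=%s failed: %r", component, ref, exc, exc_info=exc)
    return {
        "ok": False,
        "message": "Something went wrong while processing your request. Please try again.",
        "reference": ref,
    }
```

The reference ID lets support staff match a user report to the logged stack trace without exposing internals to the user.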
Recommended Next Step
After fortification, run /evaluate to verify error handling works under realistic failure scenarios.
NEVER:
- Retry non-retryable errors (authentication failures, validation errors)
- Retry without backoff (you'll make the problem worse)
- Swallow errors silently (log and handle, don't ignore)
- Set infinite timeouts (a stalled call will hang the workflow indefinitely)
- Skip the fallback (retries exhausted with no fallback = user sees a raw error)