Error Recoverer
Detects, classifies, and recovers from errors during autonomous coding sessions.
Quick Start
Handle Error
from scripts.error_recoverer import ErrorRecoverer

# Run inside an async function; project_dir, error, and context come from the caller.
recoverer = ErrorRecoverer(project_dir)
result = await recoverer.handle_error(error, context)
if result.recovered:
    print(f"Recovered via: {result.strategy}")
else:
    print(f"Failed: {result.reason}")
Automatic Recovery
@recoverer.with_recovery
async def risky_operation():
    # Operation that might fail
    pass
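To make the decorator form concrete, here is a minimal sketch of how a `with_recovery` wrapper could behave. `SketchRecoverer` and `RecoveryResult` are hypothetical stand-ins written for this illustration; the real `ErrorRecoverer` may classify errors and retry differently. The sketch assumes a recovered error is followed by exactly one more attempt, and an unrecovered error is re-raised.

```python
import asyncio
import functools
from dataclasses import dataclass


@dataclass
class RecoveryResult:
    # Hypothetical result shape, mirroring the fields used in Quick Start.
    recovered: bool
    strategy: str = ""
    reason: str = ""


class SketchRecoverer:
    """Hypothetical stand-in for ErrorRecoverer, for illustration only."""

    async def handle_error(self, error, context):
        # Assumption: only timeouts are treated as recoverable here.
        if isinstance(error, TimeoutError):
            return RecoveryResult(recovered=True, strategy="retry")
        return RecoveryResult(recovered=False, reason=str(error))

    def with_recovery(self, func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except Exception as error:
                context = {"operation": func.__name__}
                result = await self.handle_error(error, context)
                if not result.recovered:
                    raise  # Recovery failed: surface the original error.
                # Recovery succeeded: run the operation one more time.
                return await func(*args, **kwargs)

        return wrapper
```

Wrapping at the call boundary keeps the recovery policy out of the operation itself, so the same function can run with or without recovery.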
Error Recovery Workflow
┌─────────────────────────────────────────────────────────────┐
│                     ERROR RECOVERY FLOW                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. DETECT                                                  │
│     ├─ Catch exception                                      │
│     ├─ Parse error message                                  │
│     └─ Extract error context                                │
│                                                             │
│  2. CLASSIFY                                                │
│     ├─ Determine error category                             │
│     ├─ Assess severity level                                │
│     └─ Check if recoverable                                 │
│                                                             │
│  3. STRATEGIZE                                              │
│     ├─ Query causal memory for similar errors               │
│     ├─ Select recovery strategy                             │
│     └─ Prepare recovery action                              │
│                                                             │
│  4. RECOVER                                                 │
│     ├─ Execute recovery strategy                            │
│     ├─ Verify recovery success                              │
│     └─ Store error→solution chain                           │
│                                                             │
│  5. ESCALATE (if recovery fails)                            │
│     ├─ Rollback to checkpoint                               │
│     ├─ Create detailed error report                         │
│     └─ Signal for human intervention                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
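The five steps above can be sketched as a single function. This is an illustration under stated assumptions, not the skill's actual implementation: `recover`, `FlowOutcome`, `fix_actions`, and `known_solutions` are hypothetical names, causal memory is reduced to a plain dict, and the rollback in step 5 is left as a comment.

```python
from dataclasses import dataclass, field


@dataclass
class FlowOutcome:
    # Hypothetical summary of one pass through the flow.
    recovered: bool
    steps: list = field(default_factory=list)
    report: str = ""


def recover(error, fix_actions, known_solutions):
    """Walk one error through the flow.

    fix_actions: hypothetical dict mapping a category name to a callable
        that returns True when the fix worked.
    known_solutions: hypothetical dict standing in for causal memory,
        keyed by exception type name.
    """
    steps = []

    # 1. DETECT: capture the error type and message.
    error_type = type(error).__name__
    message = str(error)
    steps.append("detect")

    # 2. CLASSIFY: a deliberately tiny classifier for this sketch.
    if isinstance(error, (TimeoutError, ConnectionError)):
        category = "transient"
    else:
        category = "unknown"
    recoverable = category in fix_actions
    steps.append("classify")

    if recoverable:
        # 3. STRATEGIZE: prefer a solution that worked before.
        action = known_solutions.get(error_type, fix_actions[category])
        steps.append("strategize")

        # 4. RECOVER: run the action and remember it if it worked.
        steps.append("recover")
        if action():
            known_solutions[error_type] = action
            return FlowOutcome(recovered=True, steps=steps)

    # 5. ESCALATE: a real implementation would roll back to a checkpoint here.
    steps.append("escalate")
    return FlowOutcome(False, steps, report=f"{error_type}: {message}")
```

Storing the successful action under the error type is what lets a later occurrence of the same error skip straight to a known fix.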
Error Categories
| Category | Examples | Recovery Strategy |
|---|---|---|
| Transient | Network timeout, rate limit | Retry with backoff |
| Resource | File not found, permission denied | Fix path/permissions |
| Syntax | Parse error, invalid JSON | Fix syntax errors |
| Logic | Test failure, assertion error | Debug and fix code |
| Environment | Missing dependency, version mismatch | Install/update deps |
| Unrecoverable | Disk full, OOM | Escalate immediately |
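As a rough illustration of the table, the sketch below maps built-in Python exceptions to the six categories. `ErrorCategory` and `classify` are hypothetical names, the choice of exception types per category is an assumption, and so is the conservative default of treating unknown errors as unrecoverable; `scripts/error_classifier.py` may work differently (for example, by parsing error messages).

```python
import errno
import json
from enum import Enum


class ErrorCategory(Enum):
    TRANSIENT = "transient"
    RESOURCE = "resource"
    SYNTAX = "syntax"
    LOGIC = "logic"
    ENVIRONMENT = "environment"
    UNRECOVERABLE = "unrecoverable"


# Assumed mapping from exception types to categories.
_RULES = [
    ((TimeoutError, ConnectionError), ErrorCategory.TRANSIENT),
    ((FileNotFoundError, PermissionError), ErrorCategory.RESOURCE),
    ((SyntaxError, json.JSONDecodeError), ErrorCategory.SYNTAX),
    ((AssertionError,), ErrorCategory.LOGIC),
    ((ImportError,), ErrorCategory.ENVIRONMENT),  # includes ModuleNotFoundError
    ((MemoryError,), ErrorCategory.UNRECOVERABLE),
]


def classify(error):
    """Return the category for an exception instance."""
    # Disk full surfaces as a plain OSError with errno ENOSPC.
    if isinstance(error, OSError) and error.errno == errno.ENOSPC:
        return ErrorCategory.UNRECOVERABLE
    for exception_types, category in _RULES:
        if isinstance(error, exception_types):
            return category
    # Assumption: escalate anything the rules do not recognize.
    return ErrorCategory.UNRECOVERABLE
```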
Recovery Strategies
from enum import Enum

class RecoveryStrategy(Enum):
    RETRY = "retry"              # Simple retry
    RETRY_BACKOFF = "backoff"    # Exponential backoff
    ROLLBACK = "rollback"        # Restore checkpoint
    FIX_AND_RETRY = "fix_retry"  # Apply fix, then retry
    SKIP = "skip"                # Skip and continue
    ESCALATE = "escalate"        # Human intervention
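The `RETRY_BACKOFF` strategy is the easiest to show in code. The following is a minimal sketch of retry with exponential backoff and full jitter. `retry_with_backoff` and all of its parameters and defaults are assumptions made for this example, and `scripts/retry_handler.py` may use different delays, limits, or exception types.

```python
import asyncio
import random


async def retry_with_backoff(
    operation,
    max_attempts=4,
    base_delay=0.5,
    max_delay=8.0,
    retry_on=(TimeoutError, ConnectionError),
    sleep=asyncio.sleep,
):
    """Call an async operation, retrying transient failures.

    The delay cap doubles after each failure (base_delay, 2x, 4x, ...) up to
    max_delay. The actual wait is drawn uniformly from [0, cap] so that many
    clients retrying at once do not stay in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return await operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: let the caller escalate.
            cap = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, cap))
```

The `sleep` parameter exists only so tests can substitute a fake and avoid real waiting; errors outside `retry_on` propagate immediately, which matches the idea that only transient errors are worth retrying.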
Integration Points
- memory-manager: Query/store causal chains
- checkpoint-manager: Rollback on failure
- coding-agent: Provide fixes for code errors
- progress-tracker: Log error metrics
References
- references/ERROR-CATEGORIES.md: Error classification
- references/RECOVERY-STRATEGIES.md: Strategy details
Scripts
- scripts/error_recoverer.py: Core recovery logic
- scripts/error_classifier.py: Error classification
- scripts/retry_handler.py: Retry with backoff
- scripts/recovery_strategies.py: Strategy implementations