Exception Handling & Recovery
Exception Handling ensures that an agentic system degrades gracefully rather than crashing. In the nondeterministic world of LLMs, failures are common: models hallucinate, APIs time out, and outputs are malformed. This pattern wraps critical operations in "try/catch" blocks that trigger recovery agents or fallback strategies.
When to Use
- Production Systems: Essential for any user-facing application.
- Unreliable Tools: When using 3rd-party APIs that might be down or rate-limited.
- Structured Output: When the model occasionally fails to output valid JSON.
- Safety: When a tool might return dangerous or unexpected data.
Use Cases
- API Fallback: "Primary model API failed? Switch to backup model API." or "Tool A failed? Try Tool B."
- Refusal Handling: If the model refuses to answer (due to safety filters), catch the refusal and rephrase the prompt or explain why it can't answer.
- Validation Repair: If JSON validation fails, pass the error back to the model to fix the syntax.
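The validation-repair case above can be sketched as a small loop that feeds the parse error back to a repair step. This is a minimal sketch, not a full implementation: `naive_fix` is a toy stand-in for the model call that would actually repair the syntax.

```python
import json

def repair_json(raw, fix_fn, max_attempts=3):
    """Try to parse JSON; on failure, pass the error back for repair."""
    for _ in range(max_attempts):
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            # Feed the error message back so the repairer can fix the syntax
            raw = fix_fn(raw, str(e))
    raise ValueError("Could not repair JSON after repeated attempts")

# Toy repair function standing in for a model call: strips trailing commas.
def naive_fix(raw, error):
    return raw.replace(",}", "}").replace(",]", "]")

print(repair_json('{"a": 1,}', naive_fix))  # {'a': 1}
```

In a real system, `fix_fn` would prompt the model with the malformed output and the validator's error message, which usually succeeds within one or two attempts.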
Implementation Pattern
def resilient_tool_call(tool_name, args):
    max_retries = 3
    for attempt in range(max_retries):
        try:
            # Try to execute the tool
            return execute_tool(tool_name, args)
        except RateLimitError:
            # Specific handling for known, transient errors
            backoff_sleep(attempt)
        except ValidationError as e:
            # Self-correction: ask the model to fix its input
            print(f"Validation failed: {e}. Asking model to fix...")
            args = repair_agent.fix_inputs(tool_name, args, error=e)
        except Exception as e:
            # General fallback for unexpected errors
            log_error(e)
            return fallback_strategy(tool_name)
    raise RuntimeError("Max retries exceeded")
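The `backoff_sleep` helper referenced above is assumed rather than defined. A minimal sketch, using exponential backoff with full jitter so that many agents retrying at once do not hammer the API in lockstep:

```python
import random
import time

def backoff_delay(attempt, base=0.5, cap=8.0):
    """Pick a random delay in [0, min(cap, base * 2**attempt)] seconds.

    Exponential growth spaces retries out; full jitter spreads
    concurrent retries so they don't all fire at the same moment.
    """
    return random.uniform(0, min(cap, base * 2 ** attempt))

def backoff_sleep(attempt):
    time.sleep(backoff_delay(attempt))
```

The `base` and `cap` values here are illustrative; tune them to the rate limits of the API you are calling.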