implementing-error-handling
Error Handling Patterns
Build resilient applications with robust error handling strategies that gracefully handle failures and provide excellent debugging experiences.
When to Use This Skill
- Implementing error handling in new features
- Designing error-resilient APIs
- Debugging production issues
- Improving application reliability
- Creating better error messages for users and developers
- Implementing retry and circuit breaker patterns
- Handling async/concurrent errors
- Building fault-tolerant distributed systems
Core Concepts
1. Error Handling Philosophies
Exceptions vs Result Types:
- Exceptions: Traditional try-catch, disrupts control flow
- Result Types: Explicit success/failure, functional approach
- Error Codes: C-style, requires discipline
- Option/Maybe Types: For nullable values
When to Use Each:
- Exceptions: Unexpected errors, exceptional conditions
- Result Types: Expected errors, validation failures
- Panics/Crashes: Unrecoverable errors, programming bugs
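The contrast between exceptions and result types can be sketched in Python. Python has no built-in Result type, so `Ok`, `Err`, `Result`, `parse_age`, and `describe` are hypothetical names introduced only for illustration. The sketch treats invalid input as an expected failure and returns it as a value instead of raising.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    """Successful outcome carrying a value (hypothetical helper type)."""
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    """Failed outcome carrying an error description (hypothetical helper type)."""
    error: E


# A result is either Ok or Err; callers must check which one they got.
Result = Union[Ok[T], Err[E]]


def parse_age(raw: str) -> Result[int, str]:
    """Expected failures (bad input) are returned, not raised."""
    try:
        age = int(raw)
    except ValueError:
        return Err(f"not a number: {raw!r}")
    if age < 0:
        return Err("age must be non-negative")
    return Ok(age)


def describe(result: Result[int, str]) -> str:
    """Inspect the result before touching the value."""
    if isinstance(result, Ok):
        return f"age={result.value}"
    return f"invalid: {result.error}"
```

With this shape, the failure path is visible in the function signature, which is the main argument for result types when failures are expected.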
2. Error Categories
Recoverable Errors:
- Network timeouts
- Missing files
- Invalid user input
- API rate limits
Unrecoverable Errors:
- Out of memory
- Stack overflow
- Programming bugs (null pointer, etc.)
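One way the recoverable/unrecoverable split might show up in code is a retry helper that retries only errors classified as recoverable and lets everything else propagate immediately. This is a minimal sketch: the `retry` function, the `RECOVERABLE` tuple, and the choice of `TimeoutError` and `ConnectionError` as "recoverable" are assumptions for illustration, not part of the original skill.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Assumption: these built-in exception types stand in for recoverable failures.
RECOVERABLE: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError)


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry func on recoverable errors, doubling the delay each time."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except RECOVERABLE:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** (attempt - 1))
        # Any other exception (for example a programming bug) is not caught
        # here and propagates to the caller on the first occurrence.
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting.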
Language-Specific Patterns
For detailed code examples in Python, TypeScript, Rust, and Go, see examples/language-patterns.md
Universal Patterns
Pattern 1: Circuit Breaker
Prevent cascading failures in distributed systems.
from enum import Enum
from datetime import datetime, timedelta
from typing import Callable, Optional, TypeVar

T = TypeVar('T')


class CircuitState(Enum):
    CLOSED = "closed"        # Normal operation
    OPEN = "open"            # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing if recovered


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        timeout: timedelta = timedelta(seconds=60),
        success_threshold: int = 2
    ):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.success_threshold = success_threshold
        self.failure_count = 0
        self.success_count = 0
        self.state = CircuitState.CLOSED
        self.last_failure_time: Optional[datetime] = None

    def call(self, func: Callable[[], T]) -> T:
        if self.state == CircuitState.OPEN:
            # Once the timeout has elapsed, let trial requests through
            if datetime.now() - self.last_failure_time > self.timeout:
                self.state = CircuitState.HALF_OPEN
                self.success_count = 0
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func()
            self.on_success()
            return result
        except Exception:
            self.on_failure()
            raise

    def on_success(self):
        self.failure_count = 0
        if self.state == CircuitState.HALF_OPEN:
            self.success_count += 1
            if self.success_count >= self.success_threshold:
                self.state = CircuitState.CLOSED
                self.success_count = 0

    def on_failure(self):
        self.failure_count += 1
        self.last_failure_time = datetime.now()
        # A failure while HALF_OPEN reopens the circuit immediately
        if (
            self.state == CircuitState.HALF_OPEN
            or self.failure_count >= self.failure_threshold
        ):
            self.state = CircuitState.OPEN


# Usage (external_api is a placeholder for your own client)
circuit_breaker = CircuitBreaker()


def fetch_data():
    return circuit_breaker.call(lambda: external_api.get_data())
Pattern 2: Error Aggregation
Collect multiple errors instead of failing on first error.
class ErrorCollector {
  private errors: Error[] = [];

  add(error: Error): void {
    this.errors.push(error);
  }

  hasErrors(): boolean {
    return this.errors.length > 0;
  }

  getErrors(): Error[] {
    return [...this.errors];
  }

  throw(): never {
    if (this.errors.length === 1) {
      throw this.errors[0];
    }
    throw new AggregateError(
      this.errors,
      `${this.errors.length} errors occurred`,
    );
  }
}

// Usage: Validate multiple fields
function validateUser(data: any): User {
  const errors = new ErrorCollector();

  if (!data.email) {
    errors.add(new ValidationError("Email is required"));
  } else if (!isValidEmail(data.email)) {
    errors.add(new ValidationError("Email is invalid"));
  }

  if (!data.name || data.name.length < 2) {
    errors.add(new ValidationError("Name must be at least 2 characters"));
  }

  if (!data.age || data.age < 18) {
    errors.add(new ValidationError("Age must be 18 or older"));
  }

  if (errors.hasErrors()) {
    errors.throw();
  }

  return data as User;
}
Pattern 3: Graceful Degradation
Provide fallback functionality when errors occur.
import logging
from typing import Optional, Callable, TypeVar

logger = logging.getLogger(__name__)

T = TypeVar('T')


def with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    log_error: bool = True
) -> T:
    """Try primary function, fall back to fallback on error."""
    try:
        return primary()
    except Exception as e:
        if log_error:
            logger.error(f"Primary function failed: {e}")
        return fallback()


# Usage
def get_user_profile(user_id: str) -> UserProfile:
    return with_fallback(
        primary=lambda: fetch_from_cache(user_id),
        fallback=lambda: fetch_from_database(user_id)
    )


# Multiple fallbacks
# Note: `or` also skips falsy results such as 0.0, not only None
def get_exchange_rate(currency: str) -> float:
    return (
        try_function(lambda: api_provider_1.get_rate(currency))
        or try_function(lambda: api_provider_2.get_rate(currency))
        or try_function(lambda: cache.get_rate(currency))
        or DEFAULT_RATE
    )


def try_function(func: Callable[[], Optional[T]]) -> Optional[T]:
    try:
        return func()
    except Exception:
        return None
Best Practices
- Fail Fast: Validate input early, fail quickly
- Preserve Context: Include stack traces, metadata, timestamps
- Meaningful Messages: Explain what happened and how to fix it
- Log Appropriately: Log unexpected errors; don't spam logs with expected failures
- Handle at Right Level: Catch where you can meaningfully handle
- Clean Up Resources: Use try-finally, context managers, defer
- Don't Swallow Errors: Log or re-throw, don't silently ignore
- Type-Safe Errors: Use typed errors when possible
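The "Clean Up Resources" practice can be illustrated with a context manager whose `finally` clause runs whether or not the body raises. This is a minimal sketch: `FakeConnection` and `open_connection` are hypothetical stand-ins for a real resource such as a database connection.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeConnection:
    """Hypothetical stand-in for a real resource such as a DB connection."""

    def __init__(self) -> None:
        self.closed = False
        self.queries: List[str] = []

    def execute(self, query: str) -> None:
        if not query:
            raise ValueError("query must not be empty")
        self.queries.append(query)

    def close(self) -> None:
        self.closed = True


@contextmanager
def open_connection() -> Iterator[FakeConnection]:
    """Yield a connection and always close it, even if the body raises."""
    conn = FakeConnection()
    try:
        yield conn
    finally:
        conn.close()
```

Used as `with open_connection() as conn: ...`, the connection is closed on both the success path and the error path, and any exception from the body still propagates to the caller.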
# Good error handling example
def process_order(order_id: str) -> Order:
    """Process order with comprehensive error handling."""
    try:
        # Validate input
        if not order_id:
            raise ValidationError("Order ID is required")

        # Fetch order
        order = db.get_order(order_id)
        if not order:
            raise NotFoundError("Order", order_id)

        # Process payment
        try:
            payment_result = payment_service.charge(order.total)
        except PaymentServiceError as e:
            # Log and wrap external service error
            logger.error(f"Payment failed for order {order_id}: {e}")
            raise ExternalServiceError(
                "Payment processing failed",
                service="payment_service",
                details={"order_id": order_id, "amount": order.total}
            ) from e

        # Update order
        order.status = "completed"
        order.payment_id = payment_result.id
        db.save(order)

        return order

    except ApplicationError:
        # Re-raise known application errors
        raise

    except Exception as e:
        # Log unexpected errors
        logger.exception(f"Unexpected error processing order {order_id}")
        raise ApplicationError(
            "Order processing failed",
            code="INTERNAL_ERROR"
        ) from e
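The example above relies on `ApplicationError`, `ValidationError`, `NotFoundError`, and `ExternalServiceError` without defining them here; the skill's own definitions may live in the referenced files. The following is a hedged sketch of one hierarchy that would be compatible with the calls shown. The constructor signatures, default codes, and message format are assumptions inferred from the call sites, not the skill's actual definitions.

```python
from typing import Any, Dict, Optional


class ApplicationError(Exception):
    """Assumed base class for all errors the application raises on purpose."""

    def __init__(
        self,
        message: str,
        code: str = "APPLICATION_ERROR",
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.code = code
        self.details = details or {}


class ValidationError(ApplicationError):
    """Input failed validation (an expected, user-facing failure)."""

    def __init__(self, message: str, details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message, code="VALIDATION_ERROR", details=details)


class NotFoundError(ApplicationError):
    """A requested resource does not exist."""

    def __init__(self, resource: str, resource_id: str) -> None:
        super().__init__(
            f"{resource} not found: {resource_id}",
            code="NOT_FOUND",
            details={"resource": resource, "id": resource_id},
        )


class ExternalServiceError(ApplicationError):
    """A dependency outside the application failed."""

    def __init__(
        self,
        message: str,
        service: str,
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        merged = {"service": service, **(details or {})}
        super().__init__(message, code="EXTERNAL_SERVICE_ERROR", details=merged)
        self.service = service
```

Because every class derives from `ApplicationError`, the `except ApplicationError: raise` clause in `process_order` passes all of them through untouched, while anything else is wrapped as an internal error.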
Common Pitfalls
- Catching Too Broadly: except Exception hides bugs
- Empty Catch Blocks: Silently swallowing errors
- Logging and Re-throwing: Creates duplicate log entries
- Not Cleaning Up: Forgetting to close files, connections
- Poor Error Messages: "Error occurred" is not helpful
- Returning Error Codes: Use exceptions or Result types
- Ignoring Async Errors: Unhandled promise rejections
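The "Ignoring Async Errors" pitfall is phrased in terms of promise rejections; a rough Python analogue is a task whose exception is never inspected. The sketch below uses `asyncio.gather(..., return_exceptions=True)` so that every failure is collected and returned explicitly. The `gather_results` and `fetch` names are hypothetical and exist only for this illustration.

```python
import asyncio
from typing import Awaitable, List, Tuple, TypeVar

T = TypeVar("T")


async def gather_results(
    tasks: List[Awaitable[T]],
) -> Tuple[List[T], List[BaseException]]:
    """Run all tasks and split the outcomes into values and errors."""
    # return_exceptions=True makes gather return raised exceptions as
    # results instead of propagating only the first one.
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    values = [o for o in outcomes if not isinstance(o, BaseException)]
    errors = [o for o in outcomes if isinstance(o, BaseException)]
    return values, errors


async def fetch(n: int) -> int:
    """Hypothetical task that fails for odd inputs."""
    await asyncio.sleep(0)
    if n % 2:
        raise ValueError(f"odd input: {n}")
    return n * 10
```

The caller then decides what to do with the `errors` list (log, aggregate, or re-raise), so no failure disappears silently.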
Resources
- references/exception-hierarchy-design.md: Designing error class hierarchies
- references/error-recovery-strategies.md: Recovery patterns for different scenarios
- references/async-error-handling.md: Handling errors in concurrent code
- assets/error-handling-checklist.md: Review checklist for error handling
- assets/error-message-guide.md: Writing helpful error messages
- scripts/error-analyzer.py: Analyze error patterns in logs
Repository
wesleysmits/agent-skills
GitHub Stars
2
First Seen
Jan 24, 2026