skills/wesleysmits/agent-skills/implementing-error-handling

implementing-error-handling

SKILL.md

Error Handling Patterns

Build resilient applications with robust error handling strategies that gracefully handle failures and provide excellent debugging experiences.

When to Use This Skill

  • Implementing error handling in new features
  • Designing error-resilient APIs
  • Debugging production issues
  • Improving application reliability
  • Creating better error messages for users and developers
  • Implementing retry and circuit breaker patterns
  • Handling async/concurrent errors
  • Building fault-tolerant distributed systems

Core Concepts

1. Error Handling Philosophies

Exceptions vs Result Types:

  • Exceptions: Traditional try-catch, disrupts control flow
  • Result Types: Explicit success/failure, functional approach
  • Error Codes: C-style, requires discipline
  • Option/Maybe Types: For nullable values

When to Use Each:

  • Exceptions: Unexpected errors, exceptional conditions
  • Result Types: Expected errors, validation failures
  • Panics/Crashes: Unrecoverable errors, programming bugs

2. Error Categories

Recoverable Errors:

  • Network timeouts
  • Missing files
  • Invalid user input
  • API rate limits

Unrecoverable Errors:

  • Out of memory
  • Stack overflow
  • Programming bugs (null pointer, etc.)

Language-Specific Patterns

For detailed code examples in Python, TypeScript, Rust, and Go, see: šŸ‘‰ examples/language-patterns.md

Universal Patterns

Pattern 1: Circuit Breaker

Prevent cascading failures in distributed systems.

from enum import Enum
from datetime import datetime, timedelta
from typing import Callable, TypeVar

T = TypeVar('T')

class CircuitState(Enum):
    CLOSED = "closed"       # Normal operation
    OPEN = "open"          # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing if recovered

class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        timeout: timedelta = timedelta(seconds=60),
        success_threshold: int = 2
    ):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.success_threshold = success_threshold
        self.failure_count = 0
        self.success_count = 0
        self.state = CircuitState.CLOSED
        self.last_failure_time = None

    def call(self, func: Callable[[], T]) -> T:
        if self.state == CircuitState.OPEN:
            if datetime.now() - self.last_failure_time > self.timeout:
                self.state = CircuitState.HALF_OPEN
                self.success_count = 0
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func()
            self.on_success()
            return result
        except Exception as e:
            self.on_failure()
            raise

    def on_success(self):
        self.failure_count = 0
        if self.state == CircuitState.HALF_OPEN:
            self.success_count += 1
            if self.success_count >= self.success_threshold:
                self.state = CircuitState.CLOSED
                self.success_count = 0

    def on_failure(self):
        self.failure_count += 1
        self.last_failure_time = datetime.now()
        if self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN

# Usage
circuit_breaker = CircuitBreaker()

def fetch_data():
    return circuit_breaker.call(lambda: external_api.get_data())

Pattern 2: Error Aggregation

Collect multiple errors instead of failing on first error.

class ErrorCollector {
  private errors: Error[] = [];

  add(error: Error): void {
    this.errors.push(error);
  }

  hasErrors(): boolean {
    return this.errors.length > 0;
  }

  getErrors(): Error[] {
    return [...this.errors];
  }

  throw(): never {
    if (this.errors.length === 1) {
      throw this.errors[0];
    }
    throw new AggregateError(
      this.errors,
      `${this.errors.length} errors occurred`,
    );
  }
}

// Usage: Validate multiple fields
function validateUser(data: any): User {
  const errors = new ErrorCollector();

  if (!data.email) {
    errors.add(new ValidationError("Email is required"));
  } else if (!isValidEmail(data.email)) {
    errors.add(new ValidationError("Email is invalid"));
  }

  if (!data.name || data.name.length < 2) {
    errors.add(new ValidationError("Name must be at least 2 characters"));
  }

  if (!data.age || data.age < 18) {
    errors.add(new ValidationError("Age must be 18 or older"));
  }

  if (errors.hasErrors()) {
    errors.throw();
  }

  return data as User;
}

Pattern 3: Graceful Degradation

Provide fallback functionality when errors occur.

from typing import Optional, Callable, TypeVar

T = TypeVar('T')

def with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    log_error: bool = True
) -> T:
    """Try primary function, fall back to fallback on error."""
    try:
        return primary()
    except Exception as e:
        if log_error:
            logger.error(f"Primary function failed: {e}")
        return fallback()

# Usage
def get_user_profile(user_id: str) -> UserProfile:
    return with_fallback(
        primary=lambda: fetch_from_cache(user_id),
        fallback=lambda: fetch_from_database(user_id)
    )

# Multiple fallbacks
def get_exchange_rate(currency: str) -> float:
    return (
        try_function(lambda: api_provider_1.get_rate(currency))
        or try_function(lambda: api_provider_2.get_rate(currency))
        or try_function(lambda: cache.get_rate(currency))
        or DEFAULT_RATE
    )

def try_function(func: Callable[[], Optional[T]]) -> Optional[T]:
    try:
        return func()
    except Exception:
        return None

Best Practices

  1. Fail Fast: Validate input early, fail quickly
  2. Preserve Context: Include stack traces, metadata, timestamps
  3. Meaningful Messages: Explain what happened and how to fix it
  4. Log Appropriately: Error = log, expected failure = don't spam logs
  5. Handle at Right Level: Catch where you can meaningfully handle
  6. Clean Up Resources: Use try-finally, context managers, defer
  7. Don't Swallow Errors: Log or re-throw, don't silently ignore
  8. Type-Safe Errors: Use typed errors when possible
# Good error handling example
def process_order(order_id: str) -> Order:
    """Process order with comprehensive error handling."""
    try:
        # Validate input
        if not order_id:
            raise ValidationError("Order ID is required")

        # Fetch order
        order = db.get_order(order_id)
        if not order:
            raise NotFoundError("Order", order_id)

        # Process payment
        try:
            payment_result = payment_service.charge(order.total)
        except PaymentServiceError as e:
            # Log and wrap external service error
            logger.error(f"Payment failed for order {order_id}: {e}")
            raise ExternalServiceError(
                f"Payment processing failed",
                service="payment_service",
                details={"order_id": order_id, "amount": order.total}
            ) from e

        # Update order
        order.status = "completed"
        order.payment_id = payment_result.id
        db.save(order)

        return order

    except ApplicationError:
        # Re-raise known application errors
        raise
    except Exception as e:
        # Log unexpected errors
        logger.exception(f"Unexpected error processing order {order_id}")
        raise ApplicationError(
            "Order processing failed",
            code="INTERNAL_ERROR"
        ) from e

Common Pitfalls

  • Catching Too Broadly: except Exception hides bugs
  • Empty Catch Blocks: Silently swallowing errors
  • Logging and Re-throwing: Creates duplicate log entries
  • Not Cleaning Up: Forgetting to close files, connections
  • Poor Error Messages: "Error occurred" is not helpful
  • Returning Error Codes: Use exceptions or Result types
  • Ignoring Async Errors: Unhandled promise rejections

Resources

  • references/exception-hierarchy-design.md: Designing error class hierarchies
  • references/error-recovery-strategies.md: Recovery patterns for different scenarios
  • references/async-error-handling.md: Handling errors in concurrent code
  • assets/error-handling-checklist.md: Review checklist for error handling
  • assets/error-message-guide.md: Writing helpful error messages
  • scripts/error-analyzer.py: Analyze error patterns in logs
Weekly Installs
3
GitHub Stars
2
First Seen
Jan 24, 2026
Installed on
opencode3
gemini-cli3
codex3
cursor3
codebuddy2
claude-code2