m13-domain-error
Domain Error Strategy
Layer 2: Design Choices
Core Question
Who needs to handle this error, and how should they recover?
Before designing error types:
- Is this user-facing or internal?
- Is recovery possible?
- What context is needed for debugging?
Error Categorization
| Error Type | Audience | Recovery | Example |
|---|---|---|---|
| User-facing | End users | Guide action | InvalidEmail, NotFound |
| Internal | Developers | Debug info | DatabaseError, ParseError |
| System | Ops/SRE | Monitor/alert | ConnectionTimeout, RateLimited |
| Transient | Automation | Retry | NetworkError, ServiceUnavailable |
| Permanent | Human | Investigate | ConfigInvalid, DataCorrupted |
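One way to make the table above concrete is to let each error variant report its audience and recovery path. The following is a minimal std-only sketch: `DomainError`, `Audience`, and `Recovery` are hypothetical names chosen for illustration, and the variants simply reuse entries from the Example column.

```rust
/// Who is expected to act on the error (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Audience {
    EndUser,
    Developer,
    Ops,
    Automation,
    Human,
}

/// What the handler should do next (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Recovery {
    GuideAction,
    DebugInfo,
    MonitorAlert,
    Retry,
    Investigate,
}

/// Variants borrowed from the "Example" column of the table.
#[derive(Debug)]
pub enum DomainError {
    InvalidEmail(String),
    ParseError(String),
    ConnectionTimeout,
    ServiceUnavailable,
    ConfigInvalid(String),
}

impl DomainError {
    /// Maps each variant to the row of the table it belongs to.
    pub fn audience(&self) -> Audience {
        match self {
            Self::InvalidEmail(_) => Audience::EndUser,
            Self::ParseError(_) => Audience::Developer,
            Self::ConnectionTimeout => Audience::Ops,
            Self::ServiceUnavailable => Audience::Automation,
            Self::ConfigInvalid(_) => Audience::Human,
        }
    }

    /// Derives the recovery path from the audience, as in the table.
    pub fn recovery(&self) -> Recovery {
        match self.audience() {
            Audience::EndUser => Recovery::GuideAction,
            Audience::Developer => Recovery::DebugInfo,
            Audience::Ops => Recovery::MonitorAlert,
            Audience::Automation => Recovery::Retry,
            Audience::Human => Recovery::Investigate,
        }
    }
}
```

Keeping the mapping in one `match` means the compiler flags any new variant that has not been categorized yet.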
Thinking Prompt
Before designing error types:
- Who sees this error?
  - End user → friendly message, actionable
  - Developer → detailed, debuggable
  - Ops → structured, alertable
- Can we recover?
  - Transient → retry with backoff
  - Degradable → fallback value
  - Permanent → fail fast, alert
- What context is needed?
  - Call chain → anyhow::Context
  - Request ID → structured logging
  - Input data → error payload
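The context questions above can be sketched with the standard library alone. In this hedged example, `RequestError` and `parse_quantity` are hypothetical names: the error carries a request ID, the offending input, and the underlying cause, and exposes one message for end users and another for developers. A real project would more likely reach for `anyhow::Context` and structured logging, as the list suggests.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error that carries the context from the checklist above.
#[derive(Debug)]
pub struct RequestError {
    /// Correlates the failure with structured logs.
    pub request_id: String,
    /// The offending input, kept as payload for debugging.
    pub input: String,
    /// The underlying cause, preserving the call chain.
    pub source: Box<dyn Error + Send + Sync + 'static>,
}

impl RequestError {
    /// Friendly text for end users; no internal details leak out.
    pub fn user_message(&self) -> String {
        format!(
            "We could not process your request (reference {}).",
            self.request_id
        )
    }
}

impl fmt::Display for RequestError {
    /// Detailed single-line form for developers and log pipelines.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "request_id={} input={:?} cause={}",
            self.request_id, self.input, self.source
        )
    }
}

impl Error for RequestError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Example: wrap a parse failure with request context.
pub fn parse_quantity(request_id: &str, raw: &str) -> Result<u32, RequestError> {
    raw.parse::<u32>().map_err(|e| RequestError {
        request_id: request_id.to_string(),
        input: raw.to_string(),
        source: Box::new(e),
    })
}
```

The split between `user_message()` and `Display` keeps the audience decision in the error type rather than at every call site.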
Trace Up ↑
To domain constraints (Layer 3):
"How should I handle payment failures?"
  ↑ Ask: What are the business rules for retries?
  ↑ Check: domain-fintech (transaction requirements)
  ↑ Check: SLA (availability requirements)
| Question | Trace To | Ask |
|---|---|---|
| Retry policy | domain-* | What's acceptable latency for retry? |
| User experience | domain-* | What message should users see? |
| Compliance | domain-* | What must be logged for audit? |
Trace Down ↓
To implementation (Layer 1):
"Need typed errors"
  ↓ m06-error-handling: thiserror for libraries
  ↓ m04-zero-cost: Error enum design
"Need error context"
  ↓ m06-error-handling: anyhow::Context
  ↓ Logging: tracing with fields
"Need retry logic"
  ↓ m07-concurrency: async retry patterns
  ↓ Crates: tokio-retry, backoff
Quick Reference
| Recovery Pattern | When | Implementation |
|---|---|---|
| Retry | Transient failures | exponential backoff |
| Fallback | Degraded mode | cached/default value |
| Circuit Breaker | Cascading failures | failsafe-rs |
| Timeout | Slow operations | tokio::time::timeout |
| Bulkhead | Isolation | separate thread pools |
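To illustrate the Circuit Breaker row, here is a minimal std-only sketch. It is not the failsafe-rs API: `CircuitBreaker` and `BreakerError` are hypothetical types, and the policy (open after N consecutive failures, allow a trial call after a cool-down) is an assumption chosen for brevity.

```rust
use std::time::{Duration, Instant};

/// Minimal circuit breaker sketch (hypothetical, not the failsafe-rs API).
pub struct CircuitBreaker {
    failure_threshold: u32,
    cool_down: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

/// Either the call was rejected, or the wrapped operation failed.
#[derive(Debug, PartialEq)]
pub enum BreakerError<E> {
    /// The breaker is open; the operation was not attempted.
    Open,
    /// The operation ran and returned an error.
    Inner(E),
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cool_down: Duration) -> Self {
        Self {
            failure_threshold,
            cool_down,
            consecutive_failures: 0,
            opened_at: None,
        }
    }

    /// Runs `op` unless the breaker is open and still cooling down.
    pub fn call<T, E>(
        &mut self,
        op: impl FnOnce() -> Result<T, E>,
    ) -> Result<T, BreakerError<E>> {
        if let Some(opened) = self.opened_at {
            if opened.elapsed() < self.cool_down {
                return Err(BreakerError::Open);
            }
            // Cool-down elapsed: fall through and allow one trial call.
        }
        match op() {
            Ok(value) => {
                // Success closes the breaker and resets the counter.
                self.consecutive_failures = 0;
                self.opened_at = None;
                Ok(value)
            }
            Err(e) => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.failure_threshold {
                    // Open (or re-open) the breaker from now.
                    self.opened_at = Some(Instant::now());
                }
                Err(BreakerError::Inner(e))
            }
        }
    }
}
```

Callers can treat `BreakerError::Open` as a signal to use the Fallback row (cached or default value) instead of waiting on a dependency that is known to be failing.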
Error Hierarchy
#[derive(thiserror::Error, Debug)]
pub enum AppError {
    // User-facing
    #[error("Invalid input: {0}")]
    Validation(String),

    // Transient (retryable)
    #[error("Service temporarily unavailable")]
    ServiceUnavailable(#[source] reqwest::Error),

    // Internal (log details, show generic)
    #[error("Internal error")]
    Internal(#[source] anyhow::Error),
}

impl AppError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::ServiceUnavailable(_))
    }
}
Retry Pattern
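use std::future::Future;
use std::time::Duration;
use tokio_retry::{strategy::ExponentialBackoff, Retry};

// Runs `f` once, then retries it at most 5 more times on failure.
async fn with_retry<F, Fut, T, E>(f: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    // tokio-retry raises the base to the n-th power, so a base of 100
    // yields 100 ms and then hits the 10 s cap on every later retry.
    let strategy = ExponentialBackoff::from_millis(100)
        .max_delay(Duration::from_secs(10))
        .take(5);
    Retry::spawn(strategy, f).await
}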
Common Mistakes
| Mistake | Why Wrong | Better |
|---|---|---|
| Same error for all | No actionability | Categorize by audience |
| Retry everything | Wasted resources | Only transient errors |
| Infinite retry | Self-inflicted DoS | Max attempts + backoff |
| Expose internal errors | Security risk | User-friendly messages |
| No context | Hard to debug | .context() everywhere |
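The "Retry everything" and "Infinite retry" rows can be addressed together. The sketch below is synchronous and std-only so it stays self-contained; `FetchError` and `retry_transient` are hypothetical names, and the doubling delay is one assumed backoff policy among many.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type with a transient/permanent split.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    /// Transient: worth retrying.
    Unavailable,
    /// Permanent: retrying cannot help.
    InvalidConfig,
}

impl FetchError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::Unavailable)
    }
}

/// Calls `op` at least once and at most `max_attempts` times, doubling the
/// delay after each retryable failure. Permanent errors return immediately.
pub fn retry_transient<T>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, FetchError>,
) -> Result<T, FetchError> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only transient errors, and only while attempts remain.
            Err(e) if e.is_retryable() && attempt < max_attempts => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            // Permanent error, or the attempt budget is exhausted.
            Err(e) => return Err(e),
        }
    }
}
```

Bounding both the attempt count and the error class keeps a failing dependency from being hammered indefinitely.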
Anti-Patterns
| Anti-Pattern | Why Bad | Better |
|---|---|---|
| String errors | No structure | thiserror types |
| panic! for recoverable | Bad UX | Result with context |
| Ignore errors | Silent failures | Log or propagate |
| `Box<dyn Error>` everywhere | Lost type info | thiserror |
| Error in happy path | Performance | Early validation |
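As an illustration of the "String errors" row, the following std-only sketch replaces a `Result<_, String>` with a typed error. `PortError` and `parse_port` are hypothetical; in practice `thiserror` would generate the `Display`, `Error`, and `From` implementations written out by hand here.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical typed error replacing `Result<_, String>`.
#[derive(Debug)]
pub enum PortError {
    Missing,
    NotANumber(ParseIntError),
    OutOfRange(u32),
}

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Missing => write!(f, "port is missing"),
            Self::NotANumber(_) => write!(f, "port is not a number"),
            Self::OutOfRange(v) => write!(f, "port {} is out of range (0-65535)", v),
        }
    }
}

impl Error for PortError {
    /// Exposes the underlying parse error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::NotANumber(e) => Some(e),
            _ => None,
        }
    }
}

impl From<ParseIntError> for PortError {
    fn from(e: ParseIntError) -> Self {
        Self::NotANumber(e)
    }
}

/// Parses an optional port string into a `u16`.
pub fn parse_port(raw: Option<&str>) -> Result<u16, PortError> {
    let raw = raw.ok_or(PortError::Missing)?;
    // `?` converts `ParseIntError` into `PortError` via the `From` impl.
    let value: u32 = raw.trim().parse()?;
    if value > u32::from(u16::MAX) {
        return Err(PortError::OutOfRange(value));
    }
    Ok(value as u16)
}
```

Callers can now match on the variant instead of comparing message strings, and the original cause stays reachable through `source()`.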
Related Skills
| When | See |
|---|---|
| Error handling basics | m06-error-handling |
| Retry implementation | m07-concurrency |
| Domain modeling | m09-domain |
| User-facing APIs | domain-* |
Repository
zhanghandong/rust-skills