Observability Checklist
What this skill does
This skill reviews a service or codebase against a comprehensive observability checklist covering structured logging, metrics instrumentation, distributed tracing, alerting, dashboards, and runbooks. It identifies gaps that would make it hard to diagnose incidents, detect regressions, or understand the system's health. The output is a prioritized list of observability gaps, with a recommendation for each.
Use this when building a new service, when preparing for an on-call rotation, after an incident where you couldn't figure out what happened, or as part of a production readiness review.
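As an illustration of the kind of gap the structured-logging part of the checklist targets, here is a minimal TypeScript sketch that contrasts a free-text log line with a structured one. The helper name `formatLogLine` and the field names (`service`, `requestId`, `orderId`, `durationMs`) are assumptions made for this example, not part of the skill or of any particular logging library.

```typescript
// Hypothetical helper: the name and all field names are illustrative assumptions.
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogFields {
  [key: string]: string | number | boolean | null;
}

// Builds one JSON log line so that every field stays machine-searchable.
// Caller fields are spread first so they cannot overwrite the core fields.
function formatLogLine(
  level: LogLevel,
  message: string,
  fields: LogFields = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    ...fields,
    timestamp: now.toISOString(),
    level,
    message,
  });
}

// Unstructured: hard to filter by order or by request during an incident.
// console.log("order failed for user " + userId);

// Structured: each field can be queried on its own in a log backend.
console.log(
  formatLogLine("error", "order creation failed", {
    service: "orders",
    requestId: "req-123", // hypothetical correlation ID
    orderId: "ord-456", // hypothetical order identifier
    durationMs: 182,
  }),
);
```

A review might flag the commented-out free-text form as a gap and recommend something like the structured form, using whatever logger the service already has.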
How to use
Claude Code / Cline
Copy this file to .agents/skills/observability-checklist/SKILL.md in your project root.
Then ask:
- "Use the Observability Checklist skill to review our payments service."
- "Run an observability review on server/routes/orders.ts using the Observability Checklist skill."
Provide the service description, relevant code files, and information about what observability tooling is already in place (e.g., "we use Datadog, we have some logging but no tracing").