Agent Observability
Monitor AI agent behavior with logs, traces, metrics, and cost telemetry.
Track Core Signals
- Request latency (p50/p95/p99)
- Token usage (prompt/completion/cached)
- Tool call success and failure rates
- Cost per task and per customer
- Hallucination and retry frequency
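As a rough illustration of how these signals might be derived from raw call records, the sketch below computes latency percentiles, tool failure rate, and cost per customer. The `CallRecord` fields, the `PRICE_PER_1K` rates, and the choice to count cached tokens separately from prompt tokens are all assumptions made for the example, not values from any provider.

```python
import math
from dataclasses import dataclass


@dataclass
class CallRecord:
    # Hypothetical record shape; adapt to whatever your agent actually logs.
    customer: str
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    cached_tokens: int
    tool_ok: bool


# Made-up prices in USD per 1,000 tokens; substitute your provider's real rates.
PRICE_PER_1K = {"prompt": 0.003, "completion": 0.015, "cached": 0.0003}


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def call_cost(record):
    """Cost of one call in USD, assuming cached tokens are billed separately."""
    return (
        record.prompt_tokens * PRICE_PER_1K["prompt"]
        + record.completion_tokens * PRICE_PER_1K["completion"]
        + record.cached_tokens * PRICE_PER_1K["cached"]
    ) / 1000


def summarize(records):
    """Aggregate latency percentiles, tool failure rate, and per-customer cost."""
    latencies = [r.latency_ms for r in records]
    cost_by_customer = {}
    for r in records:
        cost_by_customer[r.customer] = cost_by_customer.get(r.customer, 0.0) + call_cost(r)
    failures = sum(1 for r in records if not r.tool_ok)
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "tool_failure_rate": failures / len(records),
        "cost_by_customer": cost_by_customer,
    }
```

Nearest-rank percentiles are used here because they are easy to verify by hand; a production metrics backend would more likely use histograms or sketches.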
Implementation Pattern
- Add trace IDs to every user request.
- Capture each LLM call and tool call as child spans.
- Emit structured logs with model, temperature, and response status.
- Create SLOs for success rate and median response time.
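The pattern above can be sketched with the standard library alone. This is not the OpenTelemetry API or any particular tracing library: `span`, `log_event`, `handle_request`, `slo_met`, the in-memory `SPANS` list, and the `example-model` name are hypothetical stand-ins chosen to show the shape of the data.

```python
import json
import statistics
import time
import uuid
from contextlib import contextmanager

SPANS = []  # In-memory sink for the sketch; a real setup would export to a tracing backend.


@contextmanager
def span(name, trace_id, parent_id=None, **attrs):
    """Record one unit of work as a span tied to a trace and an optional parent."""
    record = {
        "trace_id": trace_id,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,
        "name": name,
        "attrs": attrs,
        "status": "ok",
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        record["status"] = "error"
        record["attrs"]["error"] = repr(exc)
        raise
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        SPANS.append(record)


def log_event(record):
    """Emit one structured (JSON) log line for a finished span."""
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


def handle_request(prompt):
    """Assign a trace ID to the request and wrap LLM and tool calls as child spans."""
    trace_id = uuid.uuid4().hex
    with span("user_request", trace_id) as root:
        with span("llm_call", trace_id, parent_id=root["span_id"],
                  model="example-model", temperature=0.2):
            _reply = f"echo: {prompt}"  # Stand-in for a real model call.
        with span("tool_call", trace_id, parent_id=root["span_id"], tool="search"):
            _result = "no results"  # Stand-in for a real tool call.
    for record in SPANS:
        if record["trace_id"] == trace_id:
            log_event(record)
    return trace_id


def slo_met(spans, min_success_rate, max_median_ms):
    """Check a success-rate SLO and a median-latency SLO over root request spans."""
    roots = [s for s in spans if s["parent_id"] is None]
    success_rate = sum(s["status"] == "ok" for s in roots) / len(roots)
    median_ms = statistics.median(s["duration_ms"] for s in roots)
    return success_rate >= min_success_rate and median_ms <= max_median_ms
```

Calling `handle_request("hello")` produces three log lines sharing one `trace_id`, with the two child spans pointing at the root through `parent_id`. The SLO check looks only at root spans so that a request is counted once, regardless of how many child calls it made.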
Best Practices
- Redact PII before exporting traces.
- Keep a replayable request envelope for incident review.
- Alert on abnormal token spikes and tool error bursts.
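A minimal sketch of two of these practices, redaction and alerting, follows. The regular expressions are deliberately simple and will miss many PII formats; the `SpikeDetector` thresholds and the `error_burst` window are arbitrary assumptions that would need tuning against real traffic.

```python
import re
import statistics
from collections import deque

# Simplistic patterns for illustration only; real PII detection needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace email addresses and phone-like digit runs before a trace is exported."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


class SpikeDetector:
    """Flag a token count that exceeds `factor` times the rolling median."""

    def __init__(self, window=20, factor=3.0, min_samples=5):
        self.history = deque(maxlen=window)
        self.factor = factor
        self.min_samples = min_samples

    def observe(self, tokens):
        is_spike = False
        if len(self.history) >= self.min_samples:
            baseline = statistics.median(self.history)
            is_spike = tokens > self.factor * baseline
        self.history.append(tokens)
        return is_spike


def error_burst(outcomes, window=10, threshold=0.5):
    """Return True when at least `threshold` of the last `window` tool calls failed."""
    recent = outcomes[-window:]
    if len(recent) < window:
        return False
    return recent.count(False) / window >= threshold
```

The median is used as the baseline because a single outlier barely moves it, so one spike does not immediately raise the bar for the next alert.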
Related Skills
- alerting-oncall - Alert workflows
- agent-evals - Quality verification
Repository
bagelhole/devop…t-skills
GitHub Stars
13
First Seen
Feb 21, 2026