moai-domain-monitoring
SKILL.md
Monitoring & Observability Expert
Production Monitoring Stack
Focus: Metrics, Logs, Traces (Three Pillars of Observability)
Stack: Prometheus, Grafana, Loki, OpenTelemetry, Jaeger
Overview
End-to-end observability for production systems, covering metrics, logs, traces, and alerting.
Three Pillars
- Metrics: Time-series data (Prometheus)
- Logs: Event records (Loki, ELK)
- Traces: Distributed request tracking (Jaeger, Tempo)
Quick Start
1. Metrics (Prometheus)
Use the Counter, Gauge, and Histogram metric types to instrument application code.
Key Metrics:
- Request rate (requests/sec)
- Error rate (5xx/total)
- Latency (p50, p95, p99)
See: examples.md
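The three metric types named above can be sketched in plain Python to show how they behave. This is a minimal illustration using only the standard library, not the Prometheus client API; the class and method names (`Counter`, `Gauge`, `Histogram`, `cumulative_buckets`) are assumptions chosen for readability.

```python
import bisect


class Counter:
    """Monotonically increasing value, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """Value that can go up and down, e.g. in-flight requests."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Counts observations into buckets keyed by inclusive upper bound."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One slot per bound, plus a final slot for values above every bound.
        self._counts = [0] * (len(self.upper_bounds) + 1)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # bisect_left returns the index of the first bound >= value.
        self._counts[bisect.bisect_left(self.upper_bounds, value)] += 1
        self.sum += value
        self.count += 1

    def cumulative_buckets(self):
        """Return (upper_bound, observations <= upper_bound) pairs."""
        running = 0
        result = []
        bounds = self.upper_bounds + [float("inf")]
        for bound, n in zip(bounds, self._counts):
            running += n
            result.append((bound, running))
        return result
```

A counter suits request and error totals, a gauge suits values such as queue length, and a histogram suits latency, since percentiles can later be estimated from the cumulative bucket counts.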
2. Logging (Structured)
JSON-formatted logs for easy parsing.
Fields: timestamp, level, message, context
See: examples.md
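One way to produce logs with the four fields listed above is a custom formatter on Python's standard `logging` module. This is a sketch under assumptions: the `JsonFormatter` class name and the convention of passing `context` through the `extra` argument are illustrative choices, not something the skill prescribes.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # "context" is a hypothetical attribute supplied via `extra=`.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("app")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("order created", extra={"context": {"order_id": "o-1"}})
```

Keeping one JSON object per line lets a log pipeline such as Loki or ELK parse fields without custom regular expressions.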
3. Tracing (OpenTelemetry)
Track requests across microservices.
Concepts: Span, Trace ID, Parent-Child relationships
See: examples.md
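The concepts above (span, trace ID, parent-child relationship) can be illustrated with a small data model. This is a standard-library sketch, not the OpenTelemetry SDK; the `Span` class and `start_trace` helper are hypothetical names. The ID sizes follow the W3C Trace Context convention of 16-byte trace IDs and 8-byte span IDs.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    trace_id: str  # shared by every span in the same request
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    parent_id: Optional[str] = None  # None marks the root span
    start: float = field(default_factory=time.time)
    end: Optional[float] = None

    def child(self, name):
        """Create a span in the same trace whose parent is this span."""
        return Span(name=name, trace_id=self.trace_id, parent_id=self.span_id)

    def finish(self):
        self.end = time.time()


def start_trace(name):
    """Create a root span with a fresh 128-bit trace ID."""
    return Span(name=name, trace_id=secrets.token_hex(16))


if __name__ == "__main__":
    root = start_trace("GET /checkout")
    db = root.child("db.query")
    db.finish()
    root.finish()
    print(root.trace_id == db.trace_id, db.parent_id == root.span_id)
```

When a request crosses a service boundary, the trace ID and the caller's span ID are what must be propagated (typically in request headers) so the next service can attach its spans to the same trace.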
4. Alerting
Automated alerts on threshold breaches.
Examples: Error rate >5%, CPU >80%
See: examples.md
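A threshold alert reduces to comparing a current metric value against a limit. The sketch below encodes the two example thresholds above; the `Rule` class, the rule names, and the metric keys (`error_rate`, `cpu_utilization`) are assumptions for illustration, and a real deployment would more likely express these as Prometheus alerting rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str
    threshold: float  # alert fires when the value is strictly above this


# Thresholds taken from the examples above, expressed as fractions.
RULES = [
    Rule("HighErrorRate", "error_rate", 0.05),
    Rule("HighCpu", "cpu_utilization", 0.80),
]


def firing(rules, samples):
    """Return the names of rules whose metric exceeds its threshold.

    A metric missing from `samples` is treated as not firing; whether
    absent data should itself alert is a separate design decision.
    """
    names = []
    for rule in rules:
        value = samples.get(rule.metric)
        if value is not None and value > rule.threshold:
            names.append(rule.name)
    return names


if __name__ == "__main__":
    print(firing(RULES, {"error_rate": 0.07, "cpu_utilization": 0.65}))
```

In practice a rule usually also requires the condition to hold for some duration before firing, which avoids paging on a single noisy sample.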
Monitoring Methodologies
RED Method (for Services)
- Rate: Requests per second
- Errors: Error percentage
- Duration: Latency distribution
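The three RED signals can be computed from a window of request records, as in the sketch below. The record shape (`(status_code, latency_seconds)` tuples), the `summarize` function, and the nearest-rank percentile method are assumptions made for this example; a Prometheus setup would normally derive the same numbers with PromQL over counters and histograms.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(requests, window_seconds):
    """Compute RED signals from (status_code, latency_seconds) tuples."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total = len(requests)
    errors = sum(1 for status, _ in requests if 500 <= status <= 599)
    latencies = [latency for _, latency in requests]
    return {
        "rate": total / window_seconds,                    # Rate
        "error_pct": 100.0 * errors / total if total else 0.0,  # Errors
        "p50": percentile(latencies, 50) if total else None,    # Duration
        "p95": percentile(latencies, 95) if total else None,
        "p99": percentile(latencies, 99) if total else None,
    }


if __name__ == "__main__":
    sample = [(200, i / 100) for i in range(1, 9)] + [(500, 0.09), (503, 0.10)]
    print(summarize(sample, window_seconds=5))
```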
USE Method (for Resources)
- Utilization: % busy
- Saturation: Queue depth
- Errors: Error count
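For a single resource, the USE signals can be derived from a few raw measurements. The sketch below assumes a hypothetical `ResourceSample` record holding busy time, window length, queue depth, and error count; how those values are collected (for example from an exporter) is outside its scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSample:
    busy_seconds: float    # time the resource spent doing work
    window_seconds: float  # length of the observation window
    queue_depth: int       # work waiting for the resource
    error_count: int       # errors reported by the resource


def use_report(sample):
    """Return Utilization, Saturation, and Errors for one resource."""
    if sample.window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return {
        "utilization_pct": 100.0 * sample.busy_seconds / sample.window_seconds,
        "saturation": sample.queue_depth,
        "errors": sample.error_count,
    }


if __name__ == "__main__":
    print(use_report(ResourceSample(45.0, 60.0, 3, 1)))
```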
Best Practices
- Cardinality: Avoid high-cardinality labels (e.g., user_id)
- Retention: 15 days (metrics), 7 days (logs)
- Sampling: 1% trace sampling for high-traffic services
- Dashboards: One dashboard per service
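The 1% sampling guideline above is often implemented as a deterministic decision on the trace ID, so that every service in a request path reaches the same verdict. The sketch below compares the low 64 bits of a hex trace ID against a ratio-derived bound; `should_sample` is a hypothetical helper, and the exact bit selection is an assumption rather than a description of any particular SDK.

```python
_MASK_64 = (1 << 64) - 1


def should_sample(trace_id_hex, ratio=0.01):
    """Decide deterministically whether to keep a trace.

    The same trace ID always yields the same decision, so spans from
    different services in one trace are kept or dropped together.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    bound = int(ratio * (1 << 64))
    return (int(trace_id_hex, 16) & _MASK_64) < bound


if __name__ == "__main__":
    import secrets

    ids = [secrets.token_hex(16) for _ in range(100_000)]
    kept = sum(should_sample(t) for t in ids)
    print(f"kept {kept} of {len(ids)} traces")
```

This relies on trace IDs being uniformly random; with IDs that are not, the effective sampling rate can drift from the configured ratio.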
Validation Checklist
- Metrics: Prometheus scraping configured?
- Logs: Structured (JSON) format?
- Traces: OpenTelemetry instrumented?
- Alerts: Critical alerts defined?
- Dashboards: Grafana dashboards created?
Related Skills
- moai-essentials-perf: Performance profiling
- moai-devops-docker: Container monitoring
- moai-cloud-aws-advanced: CloudWatch
Additional Resources
- examples.md: Implementation code
- reference.md: Prometheus query language (PromQL)
Last Updated: 2025-11-20
Repository: jg-chalk-io/nora-livekit
First Seen: Mar 2, 2026