gamma-observability

SKILL.md

Gamma Observability

Contents

Overview

Implement the three pillars of observability (metrics, logging, tracing) for Gamma integrations with alerting and dashboards.

Prerequisites

  • Observability stack (Prometheus, Grafana, or cloud equivalent)
  • Log aggregation (ELK, CloudWatch, or similar)
  • APM tool (Datadog, New Relic, or OpenTelemetry)

Instructions

Step 1: Instrument Metrics

Add Prometheus counters (request total, presentations created), histograms (request duration), and gauges (rate limit remaining) to the Gamma client via interceptors.

Step 2: Add Structured Logging

Configure Winston with JSON format, timestamps, and service metadata. Log requests, responses, and errors with sanitized parameters (no API keys).

Step 3: Enable Distributed Tracing

Use OpenTelemetry to create spans for Gamma API calls with operation name, success status, and error recording.

Step 4: Build Dashboard

Create Grafana dashboard with request rate, latency p95, error rate, and rate limit remaining panels.

Step 5: Configure Alerts

  • High error rate (>5% for 5 min)
  • Rate limit critically low (<10 remaining)
  • High latency (p95 > 5s for 5 min)

Step 6: Add Health Check

Endpoint that pings Gamma API and reports status (healthy/degraded/unhealthy) with latency and rate limit info.

See detailed implementation for Prometheus metrics, Winston logger, OpenTelemetry tracing, Grafana dashboard JSON, alert rules, and health check endpoint.

Output

  • Prometheus metrics on all API calls
  • Structured JSON logging with sanitized params
  • Distributed tracing with OpenTelemetry spans
  • Grafana dashboard with key metrics
  • Alert rules for errors, rate limits, and latency

Error Handling

Error Cause Solution
Metrics not appearing Registry not exposed Add /metrics endpoint serving registry
Missing traces Tracer not initialized Initialize OpenTelemetry SDK at startup
Log volume too high Debug level in prod Set LOG_LEVEL to 'info' or 'warn'
False alerts Thresholds too sensitive Tune alert thresholds to traffic patterns

Examples

Quick Health Check

set -euo pipefail
curl http://localhost:3000/health/gamma | jq  # 3000: 3 seconds in ms
# { "status": "healthy", "latency": 150, "rateLimit": { "remaining": 95, "limit": 100 } }

Resources

Next Steps

Proceed to gamma-incident-runbook for incident response.

Weekly Installs
15
GitHub Stars
1.6K
First Seen
Feb 18, 2026
Installed on
mcpjam15
claude-code15
replit15
junie15
windsurf15
zencoder15