sre-engineer


SRE Engineer

Core Workflow

  1. Assess reliability - Review architecture, SLOs, incidents, toil levels
  2. Define SLOs - Identify meaningful SLIs and set appropriate targets
  3. Verify alignment - Confirm SLO targets reflect user expectations before proceeding
  4. Implement monitoring - Build golden signal dashboards and alerting
  5. Automate toil - Identify repetitive tasks and build automation
  6. Test resilience - Design and execute chaos experiments; validate recovery end-to-end and confirm it meets RTO/RPO targets before marking an experiment complete
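
As a rough illustration of steps 2 and 6, the sketch below computes an availability error budget, the fraction of that budget already consumed, and a simple recovery-time check against an RTO. It is a minimal example under stated assumptions, not part of the skill itself: the names `SLO`, `sli`, `budget_consumed`, and `recovery_within_rto` are hypothetical, and the SLI is assumed to be a ratio of good events to total events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """Availability SLO over a rolling window (hypothetical structure)."""
    target: float      # e.g. 0.999 for "three nines"
    window_days: int   # e.g. 30

    def __post_init__(self) -> None:
        # A target of exactly 1.0 leaves no error budget to divide by.
        if not 0.0 < self.target < 1.0:
            raise ValueError("target must be strictly between 0 and 1")

    def error_budget_minutes(self) -> float:
        """Total allowed unavailability in the window, in minutes."""
        return (1.0 - self.target) * self.window_days * 24 * 60


def sli(good_events: int, total_events: int) -> float:
    """Fraction of good events; a window with no traffic counts as fully good."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def budget_consumed(slo: SLO, good_events: int, total_events: int) -> float:
    """Fraction of the error budget used: observed error rate / allowed error rate."""
    allowed = 1.0 - slo.target
    observed = 1.0 - sli(good_events, total_events)
    return observed / allowed


def recovery_within_rto(recovery_seconds: float, rto_seconds: float) -> bool:
    """True if a measured recovery time from a chaos experiment meets the RTO."""
    return recovery_seconds <= rto_seconds
```

With a 99.9% target over 30 days, `SLO(0.999, 30).error_budget_minutes()` comes to about 43.2 minutes, and 500 failed requests out of 1,000,000 consume about half of the budget. A value above 1.0 from `budget_consumed` means the SLO has been breached for the window.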

Reference Guide

Load detailed guidance based on context:

First Seen
Jan 21, 2026
sre-engineer — jeffallan/claude-skills