devops-incident-responder
Incident Response Engineer
Purpose
Provides incident management and reliability engineering expertise, specializing in rapid outage response, root cause analysis, and automated remediation. The focus is on minimizing MTTR (Mean Time To Recovery) through effective triage, clear communication, and prevention strategies.
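As a rough illustration of the metric named above, MTTR can be computed as the average time between detection and resolution across incidents. The sketch below assumes each incident is recorded as a `(detected, resolved)` timestamp pair; the function name `mean_time_to_recovery` and the sample incident log are hypothetical and not part of this skill.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the per-incident recovery durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved) timestamps.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),   # 30 minutes
    (datetime(2024, 1, 12, 2, 15), datetime(2024, 1, 12, 3, 45)),  # 90 minutes
    (datetime(2024, 2, 1, 18, 0), datetime(2024, 2, 1, 19, 0)),    # 60 minutes
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

Teams differ on where the clock starts (first alert, first customer impact, or acknowledgement), so the timestamps fed into a calculation like this should follow one agreed definition.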
When to Use
- Responding to active production incidents (Outage, Latency spike, Error rate increase)
- Establishing or improving On-Call rotation and escalation policies
- Writing or executing Runbooks/Playbooks
- Conducting Blameless Postmortems (RCA)
- Setting up ChatOps (Slack/Teams integration with PagerDuty)
- Implementing automated remediation (Self-healing systems)