incident

Installation
SKILL.md

Incident Response Assistant | 事故回應助手

Language: English | 繁體中文

Guide structured incident response from detection through post-mortem.

引導結構化的事故回應流程,從偵測到事後檢討。

Severity Classification | 嚴重程度分類

Level Name Criteria 標準 Response Time
SEV-1 Critical 重大 Full service outage, data loss 全面服務中斷、資料遺失 Immediate (< 15 min)
SEV-2 High 高 Major feature degraded, partial outage 主要功能降級、部分中斷 < 30 min
SEV-3 Medium 中 Minor feature affected, workaround exists 次要功能受影響、有替代方案 < 4 hours
SEV-4 Low 低 Cosmetic issue, minimal user impact 外觀問題、最小用戶影響 Next business day

Response Workflow | 回應工作流程

DETECT ──► TRIAGE ──► MITIGATE ──► RESOLVE ──► POST-MORTEM ──► IMPROVE
偵測         分級        緩解          解決         事後檢討       持續改善

1. Detect — Identify the Incident | 偵測事故

  • Monitor alerts, user reports, error spikes
  • 監控告警、使用者回報、錯誤量飆升

2. Triage — Classify Severity | 分級嚴重程度

  • Assign SEV level, identify Incident Commander (IC)
  • 指定嚴重等級、識別事故指揮官(IC)

3. Mitigate — Reduce Impact | 緩解影響

  • Apply temporary fix: rollback, feature flag, traffic shift
  • 套用暫時修復:回滾、功能開關、流量切換

4. Resolve — Permanent Fix | 永久修復

  • Root cause analysis, implement proper fix, deploy
  • 根因分析、實作正確修復、部署

5. Post-Mortem — Document & Analyze | 事後檢討

  • Document timeline, impact, root cause, action items
  • 記錄時間軸、影響範圍、根因、行動項目

6. Improve — Track & Prevent | 持續改善

  • Track action item completion, analyze incident trends, prevent recurrence
  • 追蹤行動項目完成度、分析事故趨勢、防止再發

Post-Mortem Template | 事後檢討模板

## Post-Mortem: [Incident Title]
**Date**: YYYY-MM-DD  |  **Severity**: SEV-N  |  **Duration**: Xh Ym

### Timeline
| Time | Event |
|------|-------|
| HH:MM | Alert triggered / 告警觸發 |
| HH:MM | IC assigned / 指派事故指揮官 |
| HH:MM | Mitigation applied / 套用緩解措施 |
| HH:MM | Resolved / 解決 |

### Impact
- Users affected / 受影響用戶數: N
- Revenue impact / 營收影響: $N
- SLA breach / SLA 違反: Yes/No

### Root Cause
[Description of the underlying cause / 根本原因描述]

### Action Items
| Action | Owner | Due Date | Priority | Status |
|--------|-------|----------|----------|--------|
| [Fix] | @name | YYYY-MM-DD | P0 | Open |

### Incident Metrics
- MTTR (Mean Time To Recovery): Xh Ym
- Detection Time: Xm (alert to IC assigned)
- Recurrence: First / Repeat (link to previous)

Communication Template | 溝通模板

[SEV-N] [Service Name] — [Brief Description]
Status: Investigating / Mitigating / Resolved
Impact: [Who is affected and how]
Next update: [Time]

Usage | 使用方式

/incident                       # Show full incident response guide | 顯示完整事故回應指南
/incident "API 500 errors"      # Guided response for specific incident | 特定事故引導回應
/incident --post-mortem         # Generate post-mortem template | 產生事後檢討模板
/incident --sev1                # SEV-1 rapid response checklist | SEV-1 快速回應清單
/incident --actions             # List open action items | 列出未完成行動項目
/incident --metrics             # Show incident trends | 顯示事故趨勢指標

Improvement Tracking | 改善追蹤

Action Item Lifecycle | 行動項目生命週期

Open ──► In Progress ──► Done ──► Verified
Status Description 說明
Open Identified, not started 已識別,未開始
In Progress Being worked on 進行中
Done Fix implemented 已實作修復
Verified Confirmed effective 已驗證有效

Incident Storage | 事故存放

docs/incidents/
├── INC-2026-03-15-api-outage.md
├── INC-2026-03-20-db-pool-exhaustion.md
└── README.md    # Index (optional)

Metrics Tracked | 追蹤指標

Metric Description 說明
MTTR Mean Time To Recovery 平均恢復時間
MTTD Mean Time To Detection 平均偵測時間
Frequency Incidents per period 每期事故數
Recurrence Repeated root causes 重複根因比例
Action Completion % of actions done 行動項目完成率

Next Steps Guidance | 下一步引導

After /incident completes, the AI assistant should suggest:

事故回應指引已提供。建議下一步 / Incident response guidance provided. Suggested next steps:

  • 提交修復 → 執行 /commit 建立修復提交 ⭐ Recommended / 推薦 — Create fix commit
  • 程式碼審查 → 執行 /review 審查修復變更 — Review fix changes
  • 記錄學習 → 執行 /docs 更新文件 — Document learnings
  • 安全審查 → 執行 /security 檢查安全影響 — Check security impact
  • 團隊回顧 → 執行 /retrospective(SEV-1/SEV-2 建議)— Team retrospective (recommended for SEV-1/2)
  • 查看行動項目 → 執行 /incident --actions — View open action items

Reference | 參考

Version History | 版本歷史

Version Date Changes 變更說明
1.1.0 2026-03-26 Add IMPROVE phase, action tracking, metrics 新增改善階段、行動追蹤、指標
1.0.0 2026-03-24 Initial release 初始版本

AI Agent Behavior | AI 代理行為

完整的 AI 行為定義請參閱對應的命令文件:/incident

For complete AI agent behavior definition, see the corresponding command file: /incident

License | 授權

CC BY 4.0 — Documentation content

Related skills
Installs
3
GitHub Stars
64
First Seen
Mar 28, 2026