incident-postmortem
Incident Postmortem Skill
This skill produces a complete, blameless incident postmortem document following industry-standard format. Output is ready to share with engineering teams, leadership, and affected stakeholders.
Required Inputs
Ask the user for these if not provided:
- Incident title / ID
- Severity (P1 / P2 / P3 or SEV1 / SEV2 / SEV3)
- Date and duration of the incident
- What happened (rough notes are fine — the skill will structure them)
- Services or systems affected
- Customer impact (how many users, what was degraded)
- How it was detected
- How it was resolved
- Initial thoughts on root cause
- Action items already identified (optional)
Output Structure
Incident Postmortem: [Incident Title]
Incident ID: [ID] Severity: [P1/P2/P3] Date: [Date] Duration: [Start time → Resolution time — total duration] Status: [Resolved / Monitoring / Ongoing] Author: [Leave blank for user to fill] Last updated: [Date]
Executive Summary
[3–5 sentences. Describe what happened, who was affected, and what was done to resolve it. Written for a non-technical stakeholder. No jargon. No blame.]
Impact
| Dimension | Details |
|---|---|
| Users affected | [Number or percentage] |
| Services degraded | [List affected services] |
| Business impact | [Revenue, SLA breach, support tickets, etc. if known] |
| Duration | [Total time from first detection to full resolution] |
Timeline
List events in chronological order. Each entry: [HH:MM UTC] — [What happened. Who did what. What changed.]
Rules for timeline entries:
- Use passive or system-focused language — avoid "X made a mistake"
- Include: first symptom, detection, escalation, hypothesis tested, fix applied, confirmation of resolution
- Note time between key events (e.g. "22 minutes between detection and escalation")
Root Cause
Primary root cause: [One clear sentence. Technical but plain. "A misconfigured deployment config caused..."]
Contributing factors:
- [Factor 1 — e.g. lack of canary deployment meant change hit 100% of traffic immediately]
- [Factor 2 — e.g. alert threshold was set too high to catch the initial degradation]
- [Factor 3 — add as many as are relevant]
Why did our existing safeguards not prevent this? [Honest paragraph explaining why monitoring, tests, or processes didn't catch this earlier. This is where blameless analysis matters most — focus on system gaps, not individual failures.]
Detection
- How was it first detected? [Customer report / automated alert / internal monitoring / manual observation]
- Time from incident start to detection: [X minutes]
- Should we have detected this faster? [Yes / No — and why]
Resolution
What fixed it? [Clear description of the actual fix — one paragraph] Why did this work? [Brief technical explanation] Was there a temporary mitigation before full resolution? [Yes/No — describe if yes]
Action Items
| # | Action | Owner | Due Date | Priority |
|---|---|---|---|---|
| 1 | [Specific, testable action] | [Team or person] | [Date] | P1/P2/P3 |
Rules for action items:
- Each action must be specific enough to close as "done" or "not done" — no vague items like "improve monitoring"
- Distinguish between: Prevent recurrence (fix the root cause), Improve detection (catch it faster next time), Improve response (resolve it faster next time)
- Assign a real owner — not "team" or "TBD" if avoidable
- Flag P1 actions as items that block the incident from being marked fully closed
What Went Well
[3–5 honest observations about the response. Include: fast collaboration, good runbooks used, effective escalation, clear communication. This section builds team confidence and reinforces good habits.]
Lessons Learned
[3–5 key insights from this incident that are worth sharing beyond this team. Write these as transferable lessons — e.g. "Our runbook for database failover didn't account for read-replica lag. All runbooks involving database failover should be reviewed."]
Communication Log
[Optional — list external communications sent: status page updates, customer emails, support responses. Include timestamps.]
Quality Checks
- Timeline has no blame-focused language
- Root cause is specific (not "human error")
- Contributing factors explain the systemic gaps
- Every action item has an owner and due date
- "What went well" section is genuine, not token
- Executive summary is readable by non-technical leadership
Example Trigger Phrases
- "Write a postmortem for the [incident name] outage"
- "Help me write a P1 incident report"
- "Generate an RCA document for [service] going down on [date]"
- "Draft a blameless postmortem from these notes: [paste notes]"
More from mohitagw15856/pm-claude-skills
user-research-synthesis
Analyze and synthesize user research findings into structured, actionable insights. Use when given user research data, interview transcripts, survey results, or user feedback that needs to be analyzed and summarised. Produces a themed synthesis with prevalence data, supporting quotes, pain points analysis, feature request prioritisation, and recommended next steps.
26prd-template
Create a Product Requirements Document following proven PM template structure. Use when asked to write a PRD, product spec, feature specification, or requirements document for a new feature or product. Produces a complete PRD with problem statement, user stories, functional requirements, technical considerations, and success metrics.
20stakeholder-update
Create executive stakeholder updates following proven communication frameworks. Use when the user needs to create a status update, progress report, executive summary, or communication for leadership, stakeholders, or executives.
19competitive-analysis
Analyze competitors and create competitive landscape documentation with feature matrices, positioning maps, and strategic recommendations. Use when asked to analyze competitors, create competitive analysis, compare features with competitors, build a competitive landscape, track competitive positioning, or prepare sales battlecard inputs. Produces structured competitor profiles, feature comparison matrix, win/loss analysis, and prioritised strategic recommendations.
18meeting-notes
Structure and format meeting notes following PM best practices. Use when asked to create meeting notes, format discussion notes, capture action items, or document decisions from any meeting type. Produces structured notes with decisions, action items (owner + deadline), open questions, and next steps.
17executive-summary
Write an executive summary for any document, report, or proposal. Use when asked to write an executive summary, management summary, briefing paper, or one-pager for senior stakeholders. Produces a structured summary that busy executives can read in under 3 minutes and act on.
15