Lessons Learned

Structured retrospective analysis for incidents, mistakes, and near-misses. Transforms problems into systematic improvements.

Objective

Analyze incidents using a structured framework, identify root causes, and encode preventive measures directly into skills, guards, or documentation. The goal is systematic improvement, not blame.

Process

Phase 1: Incident Definition

Capture the facts first, analysis later.

## Incident Summary

**What happened:** [Factual description of the event]
**When:** [Date/time]
**Impact:** [What was affected, scope of damage]
**Resolution:** [How it was fixed/rolled back]
**Time to resolution:** [How long to fix]

Phase 2: Timeline Reconstruction

Build a chronological sequence of events:

## Timeline

| Time | Action | Actor | Outcome |
|------|--------|-------|---------|
| HH:MM | [What was done] | [Claude/User] | [Result] |
| HH:MM | [Next action] | [Claude/User] | [Result] |

Key questions:

What was the trigger?
Where did the sequence diverge from expected?
What was the point of no return?

Phase 3: Root Cause Analysis

Use the 5 Whys technique:

## Root Cause Analysis

1. Why did [incident] happen?
   → Because [immediate cause]

2. Why did [immediate cause] happen?
   → Because [deeper cause]

3. Why did [deeper cause] happen?
   → Because [systemic issue]

4. Why did [systemic issue] exist?
   → Because [process gap]

5. Why did [process gap] exist?
   → Because [root cause]

**Root Cause:** [The fundamental issue to address]

Phase 4: Contributing Factors

Identify all factors that contributed (not just the root cause):

Category	Factor	Contribution
Process	Missing checkpoint, unclear workflow	[How it contributed]
Communication	Ambiguous instructions, assumed consent	[How it contributed]
Technical	Missing guard, no validation	[How it contributed]
Context	Session continuation, prior assumptions	[How it contributed]
Human	Fatigue, time pressure, overconfidence	[How it contributed]

Phase 5: Fix Classification

Classify the fix by type and encode it appropriately:

Fix Type	When to Use	How to Encode
Skill	Recurring workflow needs structure	Create SKILL.md in ~/.claude/skills/
Guard	Action requires mandatory checkpoint	Add to skill with explicit approval gate
Documentation	Knowledge gap caused the issue	Update CLAUDE.md or relevant docs
Automation	Manual step was forgotten	Create hook or script
Checklist	Multiple steps need verification	Add to existing skill or create new one

Phase 6: Fix Implementation

Don't just recommend fixes -- implement them.

## Fixes Implemented

| Fix | Type | Location | Status |
|-----|------|----------|--------|
| [Description] | Skill/Guard/Doc | [File path] | Created |
| [Description] | Skill/Guard/Doc | [File path] | Updated |

Phase 7: Verification

How will we know the fix works?

## Verification

**Test scenario:** [How to test the fix]
**Success criteria:** [What "fixed" looks like]
**Review date:** [When to check if fix is working]

Output Template

# Lessons Learned: [Incident Title]

**Date:** YYYY-MM-DD
**Severity:** [Low/Medium/High/Critical]
**Status:** [Resolved/Monitoring/Open]

## Incident Summary

[Brief description]

## Timeline

| Time | Action | Actor | Outcome |
|------|--------|-------|---------|

## Root Cause

[The fundamental issue]

## Contributing Factors

- [Factor 1]
- [Factor 2]

## Fixes Implemented

| Fix | Type | Location | Status |
|-----|------|----------|--------|

## Prevention

[How this prevents recurrence]

## Lessons

1. [Key takeaway 1]
2. [Key takeaway 2]

Common Incident Patterns

Pattern: Premature Action

Symptom: Action taken before user approval Root cause: Implied consent interpreted as explicit consent Fix: Add explicit approval gate to relevant skill

## Approval Gate Template

Before [ACTION]:
1. Show user exactly what will happen
2. Ask: "Ready to [action]? (yes/no)"
3. Wait for explicit "yes" or "proceed"
4. Only then execute

Pattern: Sequence Error

Symptom: Steps executed in wrong order Root cause: Missing dependency chain in workflow Fix: Encode sequence in skill with numbered steps

Pattern: Missing Validation

Symptom: Bad data or invalid state passed through Root cause: No validation checkpoint Fix: Add validation step to skill or create pre-flight check

Pattern: Context Carryover

Symptom: Assumptions from prior session caused issue Root cause: Session state incorrectly assumed to persist Fix: Add explicit context verification at task start

Pattern: Scope Creep

Symptom: Did more than requested, caused unintended effects Root cause: Interpreted task scope too broadly Fix: Ask clarifying questions before expanding scope

Anti-Patterns

Don't do these:

Anti-Pattern	Problem	Instead
Blame assignment	Creates defensiveness, misses systemic issues	Focus on process, not people
Single-cause thinking	Oversimplifies, misses contributing factors	Use 5 Whys, identify multiple factors
Recommendation without action	Lessons forgotten, issue recurs	Implement fixes during retrospective
Vague fixes	"Be more careful" doesn't prevent recurrence	Encode specific, verifiable changes
Skip verification	No way to know if fix worked	Define success criteria and review date

Integration (Optional)

After completing a lessons learned session:

Log key findings to your daily note or project log
If a skill was created/modified, the skill itself is the durable artifact
Consider adding high-severity incidents to a dedicated incident log

Success Criteria

The retrospective is complete when:

Incident clearly defined with timeline
Root cause identified (not just symptoms)
Contributing factors documented
At least one fix implemented (not just recommended)
Fix is encoded in appropriate location (skill, doc, hook)
Verification criteria defined
Key lessons summarized

lessons-learned