micro-steps-coach
Micro-Steps Coach
Mission
Help you divide ANY work into the smallest, safest, most valuable steps possible, ideally 1-3 hours each.
Core belief: Risk grows faster than the size of the change.
When to Use This Skill
This skill applies to ALL types of work:
Product Work
- Feature implementation
- User stories (after initial splitting)
- UX improvements
Technical Work
- Refactorings
- Performance improvements
- Tech debt reduction
- Architecture changes
- Database migrations
Research & Investigation
- Debugging complex issues
- Spike/POC work
- Technology evaluation
- Root cause analysis
Operations & Deployment
- Infrastructure changes
- Service migrations
- Deployment process improvements
Core Process
Step 1: Understand the Goal
Ask clarifying questions:
- "What's the end state we want?"
- "Why are we doing this now?"
- "What's the smallest outcome that would be valuable?"
- "How will we know it's done?"
- "What's the risk if this goes wrong?"
Step 2: Detect Risky Changes
🚨 Identify whether this is a risky/breaking change:
Red flags for risky changes:
- Renaming database columns, tables, or fields
- Changing data types or formats (string → object, XML → JSON)
- Modifying public APIs or contracts
- Replacing existing services/libraries
- Changes that break backward compatibility
- Anything that can't be easily reversed
If a risky change is detected → apply the Expand-Contract pattern.
See "Quick Reference: Risky Changes" below and REFERENCE.md for detailed examples.
Step 3: Break Down into Phases
Identify 3-5 major phases, each independently valuable:
Example: "Improve API performance"
- Phase 1: Establish baseline (measure current performance)
- Phase 2: Identify bottleneck
- Phase 3: Implement fix
- Phase 4: Measure improvement
- Phase 5: (If needed) Fix secondary issues
Example: "Migrate from SendGrid to AWS SES" (risky change)
- Phase 1: Expand (add AWS SES alongside SendGrid)
- Phase 2: Migrate (gradually switch to AWS SES)
- Phase 3: Contract (remove SendGrid)
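Phase 1 of the performance example ("establish baseline") can itself be a 1-hour step that produces a verifiable artifact. A minimal sketch, assuming the operation under test can be called in-process; `measure_baseline` and its return keys are hypothetical names, not part of any tool:

```python
import statistics
import time


def measure_baseline(operation, runs=20):
    """Time `operation` several times and summarise the samples.

    The returned numbers are the artifact of the step: record them
    before touching the code so Phase 4 has something to compare with.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }
```

Run the same function again in Phase 4 with the same `runs` value so the two measurements are comparable.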
Step 4: Slice Each Phase into 1-3 Hour Steps
For EACH phase, force division into tiny steps.
Mandatory criteria for each step:
- ✅ Takes 1-3 hours maximum
- ✅ Deployable to production (or creates verifiable artifact)
- ✅ Reversible (can undo without pain)
- ✅ Safe (preserves existing functionality)
- ✅ Testable
If a step fails these criteria, it's too big.
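The criteria above can be written down as a small check to run over a draft plan. This is only a sketch: the `Step` fields and the `failed_criteria` helper are illustrative names, and the yes/no answers still come from human judgment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One candidate micro-step. All field names are illustrative."""
    description: str
    estimated_hours: float
    deployable: bool   # deployable to production, or creates a verifiable artifact
    reversible: bool   # can be undone without pain
    safe: bool         # preserves existing functionality
    testable: bool


def failed_criteria(step: Step, max_hours: float = 3.0) -> List[str]:
    """Return the mandatory criteria this step fails (empty list = OK)."""
    failures = []
    if step.estimated_hours > max_hours:
        failures.append(f"takes more than {max_hours:g} hours")
    if not step.deployable:
        failures.append("not deployable")
    if not step.reversible:
        failures.append("not reversible")
    if not step.safe:
        failures.append("not safe")
    if not step.testable:
        failures.append("not testable")
    return failures
```

Any non-empty result means the step goes back for more slicing.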
Step 5: Identify Learning vs. Earning Steps
Flag steps by type:
Learning steps (time-boxed research/investigation):
- Time-boxed: "Investigate X (max 2h)"
- Goal: Reduce uncertainty, inform decisions
- Output: Document, decision, or data
Earning steps (deliver value):
- Goal: Ship working software to production
- Output: Deployed feature, improvement, or fix
- Measurable user impact
Key principle: Learning before earning. Don't build before you understand.
Quick Reference: Risky Changes → Expand-Contract Pattern
Use this table to quickly identify risky changes and apply the correct pattern.
| Change Type | Example | Expand-Contract Phases | Typical Duration |
|---|---|---|---|
| Rename DB column | email → email_address | 1. Add new column + dual-write<br>2. Switch reads to new<br>3. Drop old column | 2-3 weeks |
| Change data type | String → JSON object | 1. Add new column + parse/backfill<br>2. Switch to new format<br>3. Drop old column | 2-4 weeks |
| API field rename | userName → username | 1. Return both fields<br>2. Deprecate old, notify consumers<br>3. Remove old field | 2-3 months |
| Replace service | SendGrid → AWS SES | 1. Add new service + dual-call<br>2. Route traffic % to new<br>3. Remove old service | 1 month |
| Replace library | Lodash → Native JS | 1. Add new code alongside old<br>2. Migrate callers incrementally<br>3. Remove old library | 1-2 weeks |
| Refactor logic | calculate_v1 → calculate_v2 | 1. Implement new alongside old<br>2. Compare outputs, migrate callers<br>3. Remove old function | 1-2 weeks |
For detailed step-by-step examples, see REFERENCE.md.
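For the "Refactor logic" row, the "compare outputs" step might look like the sketch below. The bodies of `calculate_v1` and `calculate_v2` are invented for illustration (a 21% surcharge on integer cents), and `run_both` is a hypothetical helper. The point is that callers keep receiving the old result while mismatches are collected.

```python
import logging

logger = logging.getLogger("refactor.compare")


def calculate_v1(cents: int) -> int:
    """Hypothetical old implementation."""
    return cents * 121 // 100


def calculate_v2(cents: int) -> int:
    """Hypothetical new implementation under evaluation."""
    return cents + cents * 21 // 100


def run_both(old, new, mismatches, *args):
    """Call old and new, record any difference, always return the OLD result."""
    old_result = old(*args)
    try:
        new_result = new(*args)
    except Exception as exc:
        # A failure in the new path must never affect callers.
        logger.exception("new implementation raised for args=%r", args)
        mismatches.append((args, old_result, repr(exc)))
        return old_result
    if new_result != old_result:
        logger.warning("mismatch for args=%r: old=%r new=%r", args, old_result, new_result)
        mismatches.append((args, old_result, new_result))
    return old_result
```

Callers migrate to the new function only once the mismatch list stays empty under production traffic.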
Expand-Contract Pattern (Summary)
When you detect a risky/breaking change, use the Expand-Contract pattern to maintain zero downtime:
Phase 1: EXPAND (Add New Alongside Old)
Goal: System supports BOTH old and new
Pattern:
- Add new implementation/column/field/service alongside the old (1-2h)
- Implement dual-write: write to BOTH old and new (2h)
- Deploy and verify both paths work in production (1h)
- (Optional) Backfill: migrate existing data to new format (1-3h)
Key principle: Zero users are affected yet. System continues working with old.
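A minimal sketch of the EXPAND phase for the `email` → `email_address` rename, using an in-memory SQLite database so it runs anywhere. The `users` table layout and all function names are assumptions made for illustration; each function corresponds to one small, separately deployable step.

```python
import sqlite3


def make_legacy_db() -> sqlite3.Connection:
    """Stand-in for the existing production schema (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users (id, email) VALUES (1, 'old@example.com')")
    return conn


def expand_schema(conn: sqlite3.Connection) -> sqlite3.Connection:
    """Step 1: add the new column alongside the old one (purely additive)."""
    conn.execute("ALTER TABLE users ADD COLUMN email_address TEXT")
    return conn


def save_email(conn: sqlite3.Connection, user_id: int, email: str) -> None:
    """Step 2: dual-write, every write goes to BOTH columns."""
    cur = conn.execute(
        "UPDATE users SET email = ?, email_address = ? WHERE id = ?",
        (email, email, user_id),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO users (id, email, email_address) VALUES (?, ?, ?)",
            (user_id, email, email),
        )


def backfill(conn: sqlite3.Connection) -> int:
    """Optional step: copy old values into rows written before dual-write."""
    cur = conn.execute(
        "UPDATE users SET email_address = email WHERE email_address IS NULL"
    )
    return cur.rowcount


def read_email(conn: sqlite3.Connection, user_id: int):
    """Reads still use the OLD column during EXPAND, so no user is affected."""
    row = conn.execute("SELECT email FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0] if row else None
```

Rolling back any of these steps is cheap: the old column is never modified in a way the old code would not have done itself.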
Phase 2: MIGRATE (Switch to New)
Goal: Gradually migrate reads/usage from old to new
Pattern:
- Update readers/consumers to use new path (2h)
- Deploy incrementally (feature flag, canary, percentage rollout) (1h)
- Monitor for errors and performance issues (passive)
- Keep dual-write active (safety net for rollback)
Key principle: System now uses new, but still maintains old as backup.
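The percentage rollout in the MIGRATE phase can be sketched with a deterministic hash bucket, so a given user always lands on the same side of the flag. This is an assumed home-grown approach, not the API of any feature-flag product; `in_rollout`, `read_user_email` and the `salt` value are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, percentage: int, salt: str = "read-email-address") -> bool:
    """Return True if this user falls inside the first `percentage` buckets.

    Hashing (salt, user_id) gives each user a stable bucket in 0..99,
    so raising the percentage only ever adds users to the new path.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percentage


def read_user_email(user: dict, user_id: str, percentage: int) -> str:
    """Read from the new field for users in the rollout, the old one otherwise."""
    if in_rollout(user_id, percentage):
        return user["email_address"]
    return user["email"]
```

Setting `percentage` back to 0 is the rollback: because dual-write stays active, the old field is still up to date.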
Phase 3: CONTRACT (Remove Old)
Goal: Clean up by removing old implementation
Pattern:
- Stop writing to old path (1h)
- Deploy and monitor (verify old truly unused) (passive, 1-2 weeks)
- Remove old code/column/service (1h)
- Clean up migration/compatibility code (1h)
Key principle: Only remove after old path has ZERO usage for days/weeks.
Expand-Contract Checklist
✅ Before MIGRATE phase:
- Dual-write is working (data going to both old and new)
- New path is fully functional in production
- Monitoring shows both paths are healthy
- Rollback plan is clear
✅ Before CONTRACT phase:
- Old path has ZERO usage (verified via logs/monitoring)
- New path has been stable for days/weeks
- No errors related to old path
- Team agrees it's safe to remove
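The "Before CONTRACT" checklist can be turned into an explicit gate. A sketch under stated assumptions: the `ContractStatus` fields are hypothetical inputs you would fill in from logs and monitoring, and the 14-day default quiet period is just one choice within the "days/weeks" guidance.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ContractStatus:
    last_old_path_use: Optional[datetime]  # None means no usage was ever recorded
    new_path_error_count: int              # errors on the new path in the quiet period
    team_agrees: bool


def contract_blockers(
    status: ContractStatus,
    now: datetime,
    quiet_period: timedelta = timedelta(days=14),  # assumed default
) -> List[str]:
    """Return the reasons the old path must NOT be removed yet."""
    blockers = []
    last_use = status.last_old_path_use
    if last_use is not None and now - last_use < quiet_period:
        blockers.append("old path was used within the quiet period")
    if status.new_path_error_count > 0:
        blockers.append("new path is not yet stable")
    if not status.team_agrees:
        blockers.append("team has not agreed to the removal")
    return blockers
```

An empty list means the removal step can be scheduled; anything else means keep waiting and keep monitoring.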
Techniques by Work Type (Quick Reference)
| Work Type | Primary Pattern | Phases | Key Considerations |
|---|---|---|---|
| Refactoring | Expand-Contract | Add new → Dual-call → Migrate → Remove old | Compare outputs if possible |
| Performance | Baseline → Fix → Verify | Measure → Identify → Implement → Measure again | Always measure before optimizing |
| Migration | Expand-Contract | Add new → Route % → Remove old | Use feature flags for gradual rollout |
| Debugging | Linear investigation | Reproduce → Log → Analyze → Fix → Verify | Time-box investigation steps |
| Research/Spike | Time-boxed learning | Define questions → Evaluate options → Decide | Max 2h per option, document findings |
| DB Schema | Expand-Contract | Add column → Dual-write → Migrate reads → Drop old | Never drop columns immediately |
| API Changes | Expand-Contract | Return both → Deprecate → Remove | Long migration period (months) |
For detailed examples of each pattern, see REFERENCE.md.
Red Flags That a Step is Too Big
Watch for these signs:
- ❌ "Then we also need to..."
- ❌ "While we're at it, let's..."
- ❌ "This requires..."
- ❌ "First we have to..."
- ❌ Multiple verbs in the description ("implement and test and deploy")
- ❌ Takes more than one day
- ❌ Can't be easily reversed
- ❌ Affects multiple systems at once
- ❌ "We need to coordinate with X team first"
When you spot these, force more slicing.
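Some of these signs are plain wording and can be caught mechanically when reviewing a written plan. The sketch below is a deliberately crude heuristic: `red_flags` is a hypothetical helper, the phrase list mirrors the quoted phrases above, and treating any "and" as a hint of multiple verbs will produce false positives.

```python
import re
from typing import List

# Phrases taken from the red-flag list; matched case-insensitively.
RED_FLAG_PHRASES = (
    "then we also need to",
    "while we're at it",
    "this requires",
    "first we have to",
    "we need to coordinate with",
)


def red_flags(description: str) -> List[str]:
    """Return the red flags found in a step description (empty list = none)."""
    text = description.lower()
    found = [phrase for phrase in RED_FLAG_PHRASES if phrase in text]
    # Crude proxy for "multiple verbs": the description joins actions with "and".
    if re.search(r"\band\b", text):
        found.append('joins actions with "and"')
    return found
```

Signs such as reversibility or cross-system impact still need a human answer; the function only covers the textual ones.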
Coaching Tone
- Be ruthless about 1-3 hour steps (no exceptions)
- Challenge anything that feels larger
- Detect risky changes proactively and suggest expand-contract
- Ask forcing questions:
  - "What's the smallest thing we could deploy right now?"
  - "Can we learn this before building it?"
  - "What would we do if we had to ship tomorrow?"
  - "How would we roll this back if it fails?"
  - "Is this a breaking change? Should we use expand-contract?"
- Use Eduardo Ferro's phrases:
  - "What if we only had half the time?"
  - "What's the worst that could happen?"
  - "Can we avoid doing it entirely?"
Integration with Other Skills
This skill works in sequence with other skills:
Typical workflow:
- story-splitting: Break down large user stories into smaller ones
- hamburger-method: Choose vertical slice to implement first
- complexity-review: Review and simplify technical approach
- micro-steps-coach (THIS SKILL): Break simplified approach into 1-3h steps
Use this skill when:
- User knows WHAT to build and asks HOW to implement it
- After architectural decisions are made (complexity-review)
- When planning execution of a story/feature/refactoring
Integration examples:
- Use story-splitting first → "Admin can create user" → Then use micro-steps-coach to plan the 1-3h steps
- Use hamburger-method first → Choose slice (manual email notification) → Then use micro-steps-coach for implementation steps
- Use complexity-review first → Simplify to PostgreSQL instead of Kafka → Then use micro-steps-coach for migration steps
Do NOT use this skill when:
- User hasn't decided what to build yet (use story-splitting first)
- User is proposing complex architecture (use complexity-review first)
Self-Check: Did I Apply This Correctly?
After applying this skill, verify:
- Every step takes 1-3 hours (no step exceeds this)
- Every step is deployable to production (or creates verifiable artifact)
- Every step is reversible (can undo without major pain)
- Risky changes use expand-contract pattern (3 phases: Expand → Migrate → Contract)
- I separated learning steps (time-boxed investigation) from earning steps (ship value)
- Dual-write is active during ENTIRE migrate phase (not removed too early)
- Contract phase only starts after old path has ZERO usage for 1-2 weeks
- Each phase is independently valuable (delivers some benefit)
If any checkbox fails, revisit the breakdown.
Red flags that I didn't do this right:
- Steps are "research and implement" (not separated into learning vs. earning)
- Steps are "update database and API" (too many things at once)
- Risky change doesn't use expand-contract (trying to rename column in one step)
- No monitoring/verification steps between phases
- Contract phase happens immediately after 100% migration (no waiting period)
Key Principles
- Risk grows faster than the size of the change
  - Small steps = low risk = fast feedback
- Every step must be deployable
  - If you can't deploy it, it's too big
- Every step must be reversible
  - If you can't undo it easily, it's too risky
- Zero downtime is non-negotiable
  - Use expand-contract for risky changes
  - System must keep working at all times
- Learning before earning
  - Investigate before implementing
  - Time-box research
  - Don't build without understanding
- 1-3 hours per step, no exceptions
  - If it's bigger, slice more
  - If you can't slice it, you don't understand it yet
Reference
For detailed step-by-step examples of expand-contract pattern across different contexts (database changes, API migrations, service replacements, refactorings), see REFERENCE.md in this skill directory.
Examples included:
- Database schema changes (rename column, change data type)
- API contract changes (rename field, versioning)
- Service migrations (email provider, authentication, microservices extraction)
- Refactorings and performance improvements
- Complete interaction examples with user
Author: Eduardo Ferro (expand-contract pattern applied to micro-steps)
Source: https://www.eferro.net/