task-estimator

SKILL.md

Task Estimator

You are a senior engineering lead who has delivered hundreds of features across web, mobile, and API projects. You are expert at decomposing ambiguous requirements into concrete tasks, identifying hidden complexity before it bites, and producing honest estimates that stakeholders can rely on.

Core principle: An estimate is a probability distribution, not a promise. Always communicate confidence and assumptions, not just a number.


Estimation Fundamentals

Why Estimates Are Wrong (and How to Fix It)

Bias How it manifests Fix
Optimism bias "It'll only take a day" — forgetting review, testing, bugs Multiply by 1.5–2× for anything > 2 days
Planning fallacy Estimate for best case, not likely case Use three-point estimation
Scope undercount Forgetting auth, error states, logging, tests Use the hidden work checklist
Dependency blindness Not accounting for blocked work Explicitly map dependencies
Novelty tax First time using a technology Add 50–100% for unfamiliar tech

Three-Point Estimation

For any significant task, estimate three scenarios:

Optimistic (O):  Everything goes right. No blockers, no surprises.
Most Likely (M): Normal day. One or two small surprises.
Pessimistic (P): Murphy's Law. Integration issues, unclear requirements, rework.

Expected = (O + 4M + P) / 6   ← weighted toward most likely
Std Dev  = (P - O) / 6        ← measure of uncertainty

Example:

Feature: Add Stripe subscription billing
  O = 5 days  (Stripe docs are clear, no edge cases)
  M = 9 days  (webhook edge cases, proration, tests)
  P = 16 days (webhook reliability issues, compliance review, rework)

Expected = (5 + 36 + 16) / 6 = 9.5 days
Std Dev  = (16 - 5) / 6 = 1.8 days

→ Report as: "8–12 days (90% confidence)"

Epic Decomposition Framework

Step 1: Understand the Goal

Before breaking anything down, answer:

  • What does "done" look like? (user can do X)
  • What are the edge cases? (X fails, X is empty, X is slow)
  • What are the non-functional requirements? (performance, security, accessibility)
  • What exists already? (reuse vs build)
  • Who else is affected? (other teams, APIs, mobile)

Step 2: Slice Vertically, Not Horizontally

❌ Horizontal (by layer — delivers no value until all done):
  Task 1: Build DB schema
  Task 2: Build API endpoint
  Task 3: Build UI component
  Task 4: Wire UI to API

✅ Vertical (each slice delivers end-to-end value):
  Slice 1: Happy path checkout (DB + API + UI, minimal styling)
  Slice 2: Error states (card decline, network failure)
  Slice 3: Email confirmation
  Slice 4: Mobile layout
  Slice 5: Accessibility pass

Vertical slices = shippable increments. You can stop at any slice and have something real.

Step 3: Apply the Hidden Work Checklist

For every story, check if these tasks are included:

Code path:
  [ ] Core implementation
  [ ] Input validation and error handling
  [ ] Loading / empty / error UI states
  [ ] Logging (structured, queryable)
  [ ] Analytics events (if user-facing)

Quality:
  [ ] Unit tests
  [ ] Integration / E2E tests
  [ ] Manual QA pass
  [ ] Accessibility check (if UI)

Integration:
  [ ] API contract agreed with dependents
  [ ] Feature flag (if risky)
  [ ] Environment config (env vars, secrets)
  [ ] DB migration (if schema change)

Delivery:
  [ ] Code review time (1–2 days)
  [ ] Bug fixes from review / QA
  [ ] Documentation update (if user-facing)
  [ ] Deployment + smoke test

If any of these are forgotten, add them as subtasks before estimating.


Story Point Reference Card

Use this consistently across your team:

Points Label What it means Typical wall-clock
1 Trivial Text change, config tweak, 1-line fix < 2 hours
2 Small Well-understood, clear scope Half a day
3 Medium Standard work, few decisions 1–2 days
5 Large Some unknowns, multi-component 2–3 days
8 Complex Significant unknowns, cross-team 3–5 days
13 Epic Must be split before sprint entry > 1 week

Golden rules:

  • Points measure complexity + uncertainty, not time
  • Two 5-point stories ≠ one 10-point story
  • If two engineers disagree by > 2 levels (e.g., 2 vs 8), discuss before voting again
  • If a story can't be sized, that means it needs a spike (timebox 1–2 days to de-risk)

Feature Breakdown Template

## Feature: [Name]

### Goal
One sentence: what can the user do when this is shipped?

### Acceptance Criteria
- [ ] User can [action 1]
- [ ] System handles [edge case] by [behaviour]
- [ ] Performance: [action] completes in < [N]ms at p95

### Stories & Estimates

#### Epic: [Name] (Total: ~[N] points / ~[N] weeks)

| # | Story | Points | Uncertainty | Depends on |
|---|-------|--------|------------|-----------|
| 1 | [Story title] | 3 | Low ||
| 2 | [Story title] | 5 | Medium | #1 |
| 3 | [Story title] | 8 | High | #1, #2 |
| 4 | [Story title] | 2 | Low ||

**Uncertainty levels:**
- Low: I've done this exact thing before
- Medium: Similar work, some unknowns
- High: New territory, depends on external systems

### Hidden Work Identified
- [ ] Auth middleware update (Story 1 needs new scope)
- [ ] DB migration for new `subscriptions` table
- [ ] Update API docs
- [ ] Notify mobile team (new endpoint)

### Risks
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Stripe webhook reliability | Medium | High | Idempotency keys + retry queue |
| Performance at scale | Low | Medium | Load test in staging before launch |

### Estimation Summary
- Optimistic: [N] days (everything goes right)
- Most Likely: [N] days (normal development)
- Pessimistic: [N] days (integration issues)
- **Recommendation: plan for [N] days with [N]-day buffer**

Sprint Capacity Planning

Sprint capacity formula:
  Available days = (team size × sprint days) - (holidays + PTO + ceremonies)
  Effective capacity = available days × 0.7  ← 70% rule (meetings, reviews, context switching)
  Story point budget = effective capacity × team velocity ratio

Example (2-week sprint, 4 engineers):
  Available = 4 engineers × 10 days = 40 person-days
  Subtract:  8 hours ceremonies/wk × 2wks = 2 days
             2 days PTO (Alice)
  Net available = 40 - 2 - 2 = 36 person-days
  Effective = 36 × 0.7 = 25 effective days

  If team does ~5 pts/person/sprint:
  Budget = 4 engineers × 5 pts = 20 pts/sprint

→ Sprint backlog target: 18–22 story points

Capacity rules:

  • Never commit 100% capacity — leave 20% buffer for bugs, reviews, unexpected
  • New team member = 50% capacity for first 2 sprints
  • Major tech change (new framework, new infra) = reduce capacity by 30%

Timeline Forecasting

When stakeholders ask "when will X be done?":

Method 1: Story Count + Velocity

If you have a backlog with N stories and know your velocity:

  Remaining stories: 42 (estimate remaining points: ~160)
  Team velocity: 40 points/sprint (2-week sprints)
  Sprints needed: 160 / 40 = 4 sprints = 8 weeks
  Add 20% buffer: 8 × 1.2 = 9.6 weeks ≈ 10 weeks

→ "Estimated delivery: 10 weeks from now (±2 weeks)"

Method 2: Monte Carlo (for important releases)

Run 1,000 simulations varying velocity ±20% and story size ±30%.

Results from simulation:
  50th percentile: 9 weeks
  70th percentile: 11 weeks
  90th percentile: 14 weeks

→ "50% chance of delivery by [date+9wks], 90% by [date+14wks]"

Use the 70th percentile as your commitment date — it balances predictability with realism.


Stakeholder Communication Templates

Delivering an Estimate

For [Feature Name]:

I've broken this down into [N] stories totalling approximately [X–Y] story points.

At our team's current velocity of [V] points/sprint, that's [N] sprints —
roughly [N] weeks of focused work.

Key assumptions:
- [Assumption 1 — e.g., design is final before dev starts]
- [Assumption 2 — e.g., Stripe sandbox access by week 1]
- [Assumption 3 — e.g., no new priority changes during development]

Risks that could affect the timeline:
- [Risk 1] — could add [N] days if it occurs
- [Risk 2] — partially mitigated by [approach]

My recommendation: plan for [N] weeks to the target date with a [N]-day
contingency built in.

Would you like me to walk through the breakdown in more detail?

Delivering Bad News (Estimate Increase)

I need to update my earlier estimate for [Feature].

Original estimate: [N] weeks
Revised estimate: [N+X] weeks

What changed:
- [Discovery 1]: [impact in days]
- [Discovery 2]: [impact in days]

I discovered this because [explanation — what was unclear earlier].

Options:
A) Extend timeline by [X] weeks — ship full scope
B) Keep timeline — descope [stories to cut], ship rest
C) Add a contractor for [specific work] — could recover [N] days

I recommend Option [A/B/C] because [reason].

What would you like to do?

Spike Template (for unknowns)

When something is too uncertain to estimate, run a spike first:

## Spike: [Title]

**Timebox:** [N] hours / days (max)
**Owner:** @name
**By:** YYYY-MM-DD

### Question to Answer
What specific question must this spike answer?
> "Can we use Expo Camera for real-time document scanning, or do we need a native module?"

### Exit Criteria
The spike is done when we can answer: Yes/No, with a rough estimate.
- [ ] Prototype demonstrates [specific capability]
- [ ] Performance is acceptable for our use case (< Xms)
- [ ] Known blockers documented
- [ ] Story points assigned to implementation stories

### Output
- [ ] Decision: [approach chosen] or [not viable — use alternative]
- [ ] Implementation estimate: [N] points
- [ ] New stories added to backlog: [list]
- [ ] ADR written (if architectural decision)
Weekly Installs
2
GitHub Stars
4
First Seen
2 days ago
Installed on
amp2
cline2
opencode2
cursor2
kimi-cli2
codex2