Agent Spec Estimate

Version: 1.0.0 | Last Updated: 2026-03-09

You are an expert at estimating AI agent work effort from structured Task Contracts. Help users by:

  • Estimating specs: Read a .spec file and produce a round-based effort estimate
  • Comparing tasks: Rank multiple specs by effort for sprint planning
  • Risk assessment: Identify which Contract elements drive uncertainty
  • Calibrating: Adjust estimates based on actual lifecycle retry counts

IMPORTANT: CLI Prerequisite Check

Before running any agent-spec command, Claude MUST check:

command -v agent-spec || cargo install agent-spec

If agent-spec is not installed, inform the user:

agent-spec CLI not found. Install with: cargo install agent-spec
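
The same check can be sketched in Python for scripted use. This is a minimal illustration, not part of the agent-spec tooling: the function name `agent_spec_install_hint` is hypothetical, and it only looks the binary up on `PATH` without installing anything.

```python
import shutil
from typing import Optional


def agent_spec_install_hint(binary: str = "agent-spec") -> Optional[str]:
    """Return the install hint if the CLI is missing from PATH, else None."""
    if shutil.which(binary) is None:
        return f"{binary} CLI not found. Install with: cargo install {binary}"
    return None
```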

Quick Reference

| Action | Command | Output |
|--------|---------|--------|
| Estimate a spec | `agent-spec contract <spec>`, then apply estimation | Round-based breakdown table |
| Batch estimate | Run on all specs in `specs/` | Sorted effort ranking |
| Calibrate from history | `agent-spec explain <spec> --history` | Compare predicted vs actual rounds |

Core Method

Contract → Rounds Mapping

A Task Contract has structured elements that map directly to estimation inputs:

| Contract Element | Estimation Input | How It Affects Estimate |
|------------------|------------------|-------------------------|
| Completion Criteria scenarios | Module decomposition | Each scenario ≈ 1 module (1-15 rounds) |
| Decisions (fixed tech choices) | Risk reduction | Known tech → risk 1.0; new tech → risk 1.3-1.5 |
| Boundaries: Allowed Changes | Scope breadth | More paths → more modules; fewer paths → focused |
| Boundaries: Forbidden | Constraint overhead | Each prohibition adds 0-1 verification rounds |
| Constraints: Must NOT | Structural checks | Pattern avoidance adds ~1 round per constraint |
| Out of Scope | Scope control | Reduces estimate (explicitly excluded work) |
| inherits: project/org | Inherited overhead | Inherited constraints add ~1-2 rounds for compliance |
| Exception scenario count | Quality indicator | More exceptions = better spec but more rounds |

Scenario Complexity Tiers

| Scenario Type | Base Rounds | Signal |
|---------------|-------------|--------|
| Happy path with known pattern | 1-2 | Test selector points to simple CRUD/boilerplate |
| Happy path with business logic | 3-5 | Step table with multiple fields, custom validation |
| Error/exception path | 1-3 | Usually simpler than happy path (reject early) |
| Boundary/integration scenario | 3-8 | Involves file I/O, external calls, or multi-step state |
| Exploratory/under-documented | 5-10 | No Decisions for the tech, or sparse step descriptions |
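
As a rough illustration, the tiers can be held in a lookup table and summed per module. The dictionary keys and the function name `base_round_range` are hypothetical labels invented for this sketch. Only the numeric ranges come from the table.

```python
# Base-round ranges (low, high) copied from the Scenario Complexity Tiers table.
# The keys are hypothetical labels chosen for this sketch.
SCENARIO_TIERS = {
    "happy_known_pattern": (1, 2),
    "happy_business_logic": (3, 5),
    "error_path": (1, 3),
    "boundary_integration": (3, 8),
    "exploratory": (5, 10),
}


def base_round_range(scenario_types):
    """Sum the low and high base rounds over a list of tier labels."""
    low = sum(SCENARIO_TIERS[t][0] for t in scenario_types)
    high = sum(SCENARIO_TIERS[t][1] for t in scenario_types)
    return low, high
```

For example, a module with one business-logic happy path and two error paths lands at 5 to 11 base rounds.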

Risk Coefficient from Contract Signals

| Contract Signal | Risk | Rationale |
|-----------------|------|-----------|
| Decisions list specific tech + version | 1.0 | No technology shopping |
| Decisions exist but are vague | 1.3 | Agent may need to explore |
| No Decisions section | 1.5 | Agent must choose, retry likely |
| Boundaries are tight (2-3 paths) | 1.0 | Clear scope |
| Boundaries are broad (10+ paths) | 1.3 | More surface area for mistakes |
| inherits: project with strict constraints | 1.2 | Must satisfy inherited rules too |
| Step text uses quantified assertions | 1.0 | Deterministic test expected |
| Step text uses vague language | 1.5 | Test may not match intent |
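
The table does not say how to combine several signals that apply to the same module. The sketch below assumes the largest applicable coefficient wins; multiplying the coefficients would be another plausible reading. The keys and the function name `risk_coefficient` are hypothetical.

```python
# Coefficients copied from the Risk Coefficient table; keys are hypothetical labels.
RISK_COEFFICIENTS = {
    "decisions_specific": 1.0,
    "decisions_vague": 1.3,
    "decisions_missing": 1.5,
    "boundaries_tight": 1.0,
    "boundaries_broad": 1.3,
    "inherits_strict": 1.2,
    "steps_quantified": 1.0,
    "steps_vague": 1.5,
}


def risk_coefficient(signals):
    """Pick one coefficient for a module from its observed signals.

    Assumption: the worst (largest) signal dominates. Falls back to 1.0
    when no signal is given.
    """
    return max((RISK_COEFFICIENTS[s] for s in signals), default=1.0)
```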

Estimation Procedure

Step 1: Read the Contract

agent-spec contract specs/task.spec

Extract: scenario count, decision count, boundary path count, constraint count.

Step 2: Decompose Scenarios into Modules

Each scenario is a potential module. Group related scenarios:

  • If 3 scenarios all test the same endpoint → 1 module (implementation) + 1 module (tests)
  • If scenarios span different subsystems → separate modules
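
One way to sketch the grouping rule: tag each scenario with the endpoint or subsystem it exercises, then bucket by that tag. The tagging is an assumption of this example (done by the estimator, not read from the spec), and `group_into_modules` is a hypothetical helper name.

```python
from collections import defaultdict


def group_into_modules(scenarios):
    """Group (scenario_id, subsystem) pairs into modules keyed by subsystem.

    Assumption: the caller has already decided which endpoint or
    subsystem each scenario belongs to.
    """
    modules = defaultdict(list)
    for scenario_id, subsystem in scenarios:
        modules[subsystem].append(scenario_id)
    return dict(modules)


modules = group_into_modules([
    ("S1", "POST /users"),
    ("S2", "POST /users"),
    ("S3", "POST /users"),
    ("S4", "billing"),
])
```

Here the three scenarios on the same endpoint collapse into one group, which can then be split into an implementation module and a tests module as described above.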

Step 3: Estimate Rounds per Module

Apply the Scenario Complexity Tiers table. For each module:

base_rounds = sum of scenario base rounds in this module

Step 4: Apply Risk Coefficients

Read the Contract's Decisions and Boundaries. Apply the Risk Coefficient table:

effective_rounds = base_rounds × risk_coefficient

Step 5: Add Integration + Verification Overhead

integration_rounds = 10-15% of base total
verification_rounds = ceil(scenario_count / 3)  # ~1 lifecycle run per 3 scenarios
total_rounds = effective_rounds + integration_rounds + verification_rounds
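
A minimal sketch of the Step 5 arithmetic follows. Rounding the integration share up to a whole round and passing the rate as an integer percentage are choices made for this example, not something the procedure prescribes.

```python
import math


def total_rounds(base_rounds, effective_rounds, scenario_count, integration_percent=10):
    """Combine effective rounds with integration and verification overhead.

    integration_percent is expected to sit between 10 and 15 (Step 5).
    Assumption: the integration share is rounded up to a whole round.
    """
    integration = math.ceil(base_rounds * integration_percent / 100)
    verification = math.ceil(scenario_count / 3)  # ~1 lifecycle run per 3 scenarios
    return effective_rounds + integration + verification
```

With 20 base rounds, 26 effective rounds, and 7 scenarios, this gives 2 integration rounds and 3 verification rounds, for a total of 31.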

Step 6: Convert to Wallclock Time

wallclock_minutes = total_rounds × minutes_per_round  # default: 3 min/round

Adjust minutes_per_round:

  • Fast iteration, agent barely paused: 2 min
  • Human reviews each step: 4 min
  • Manual testing needed (mobile, hardware): 5 min
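
Because round counts are usually ranges, the conversion can be applied to both ends. The helper name `wallclock_range` is hypothetical.

```python
def wallclock_range(total_rounds_low, total_rounds_high, minutes_per_round=3):
    """Return (low, high) wallclock minutes for a range of total rounds."""
    return (
        total_rounds_low * minutes_per_round,
        total_rounds_high * minutes_per_round,
    )
```

For instance, 12 to 18 rounds is 36 to 54 minutes at the default rate, or 60 to 90 minutes when manual testing pushes the rate to 5 min/round.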

Output Format

Always produce this exact structure:

### Estimate: [spec name]

#### Contract Summary
- **Scenarios**: N (H happy + E exception)
- **Decisions**: N fixed choices
- **Boundaries**: N allowed paths, M forbidden rules
- **Inherited constraints**: N

#### Module Breakdown

| # | Module | Scenarios | Base Rounds | Risk | Effective | Notes |
|---|--------|-----------|-------------|------|-----------|-------|
| 1 | ...    | S1, S2    | N           | 1.x  | M         | why   |

#### Summary

- **Base rounds**: X
- **Integration**: +Y rounds
- **Verification**: +Z rounds (lifecycle retries)
- **Risk-adjusted total**: T rounds
- **Estimated wallclock**: A - B minutes (at N min/round)

#### Risk Factors
1. [specific risk from Contract analysis]
2. [...]

#### Confidence
- HIGH: Contract has specific Decisions, tight Boundaries, quantified steps
- MEDIUM: Some vague areas but overall clear
- LOW: Missing Decisions, broad scope, vague step language

Calibration: Predicted vs Actual

After a task is complete, compare prediction to reality:

agent-spec explain specs/task.spec --history

The retry count from run logs tells you the actual verification rounds. Compare:

predicted_verification_rounds vs actual_retries

If actual > predicted × 1.5 → the spec had hidden complexity. Note this for future calibration.
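
The comparison can be written as a one-line check. The function name is hypothetical, and the retry count is assumed to have been read manually from the `--history` output, since this sketch does not parse it.

```python
def needs_recalibration(predicted_verification_rounds, actual_retries, threshold=1.5):
    """True when actual retries exceed the prediction by more than the threshold factor."""
    return actual_retries > predicted_verification_rounds * threshold
```

With 3 predicted verification rounds, 5 actual retries flags hidden complexity (5 > 4.5), while 4 does not.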

Batch Estimation for Sprint Planning

To estimate all active specs:

for spec in specs/task-*.spec; do
  echo "=== $(basename "$spec") ==="
  agent-spec contract "$spec" 2>/dev/null | head -20
  echo
done

Then apply the estimation procedure to each, and sort by total rounds:

### Sprint Capacity Plan

| Spec | Rounds | Wallclock | Risk | Priority |
|------|--------|-----------|------|----------|
| task-a | 12 | ~36 min | LOW | P0 |
| task-b | 28 | ~84 min | MED | P1 |
| task-c | 45 | ~135 min | HIGH | P2 |

**Total**: 85 rounds ≈ 4.25 hours of agent time
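
The ranking and the total can be reproduced with a few lines. The list of dictionaries is a hypothetical container for per-spec results, filled with the figures from the example plan and the default 3 min/round.

```python
# Per-spec totals taken from the example plan above (unsorted on purpose).
estimates = [
    {"spec": "task-c", "rounds": 45},
    {"spec": "task-a", "rounds": 12},
    {"spec": "task-b", "rounds": 28},
]
MINUTES_PER_ROUND = 3  # default rate from Step 6

# Smallest effort first, then total agent time in hours.
ranked = sorted(estimates, key=lambda e: e["rounds"])
total = sum(e["rounds"] for e in ranked)
total_hours = total * MINUTES_PER_ROUND / 60
```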

Common Mistakes

| Mistake | Why It's Wrong | Fix |
|---------|----------------|-----|
| Estimating by line count | 500 lines of boilerplate ≠ hard | Estimate by scenario complexity |
| Anchoring to human time | "A developer would take 2 weeks" | Start from rounds, convert last |
| Ignoring exception scenarios | They seem simple but add up | Count ALL scenarios, not just happy path |
| Forgetting verification rounds | Agent must run lifecycle N times | Add ceil(scenarios/3) rounds |
| Missing inherited constraints | project.spec adds hidden work | Check inherits: and count parent constraints |

When NOT to Estimate

| Situation | Why | Alternative |
|-----------|-----|-------------|
| No .spec file yet | Nothing to estimate from | Write the Contract first |
| Spec has lint score < 0.5 | Too vague for reliable estimate | Improve spec quality first |
| Exploratory / vibe coding | No defined "done" | Just start coding, write spec later |