agent-spec-estimate
Agent Spec Estimate
Version: 1.0.0 | Last Updated: 2026-03-09
You are an expert at estimating AI agent work effort from structured Task Contracts. Help users by:
- Estimating specs: Read a
.specfile and produce a round-based effort estimate - Comparing tasks: Rank multiple specs by effort for sprint planning
- Risk assessment: Identify which Contract elements drive uncertainty
- Calibrating: Adjust estimates based on actual lifecycle retry counts
IMPORTANT: CLI Prerequisite Check
Before running any agent-spec command, Claude MUST check:
command -v agent-spec || cargo install agent-spec
If agent-spec is not installed, inform the user:
agent-specCLI not found. Install with:cargo install agent-spec
Quick Reference
| Action | Command | Output |
|---|---|---|
| Estimate a spec | agent-spec contract <spec> then apply estimation |
Round-based breakdown table |
| Batch estimate | Run on all specs in specs/ |
Sorted effort ranking |
| Calibrate from history | agent-spec explain <spec> --history |
Compare predicted vs actual rounds |
Core Method
Contract → Rounds Mapping
A Task Contract has structured elements that map directly to estimation inputs:
| Contract Element | Estimation Input | How It Affects Estimate |
|---|---|---|
| Completion Criteria scenarios | Module decomposition | Each scenario ≈ 1 module (1-15 rounds) |
| Decisions (fixed tech choices) | Risk reduction | Known tech → risk 1.0; new tech → risk 1.3-1.5 |
| Boundaries: Allowed Changes | Scope breadth | More paths → more modules; fewer paths → focused |
| Boundaries: Forbidden | Constraint overhead | Each prohibition adds 0-1 verification rounds |
| Constraints: Must NOT | Structural checks | Pattern avoidance adds ~1 round per constraint |
| Out of Scope | Scope control | Reduces estimate (explicitly excluded work) |
| inherits: project/org | Inherited overhead | Inherited constraints add ~1-2 rounds for compliance |
| Exception scenario count | Quality indicator | More exceptions = better spec but more rounds |
Scenario Complexity Tiers
| Scenario Type | Base Rounds | Signal |
|---|---|---|
| Happy path with known pattern | 1-2 | Test selector points to simple CRUD/boilerplate |
| Happy path with business logic | 3-5 | Step table with multiple fields, custom validation |
| Error/exception path | 1-3 | Usually simpler than happy path (reject early) |
| Boundary/integration scenario | 3-8 | Involves file I/O, external calls, or multi-step state |
| Exploratory/under-documented | 5-10 | No Decisions for the tech, or sparse step descriptions |
Risk Coefficient from Contract Signals
| Contract Signal | Risk | Rationale |
|---|---|---|
| Decisions list specific tech + version | 1.0 | No technology shopping |
| Decisions exist but are vague | 1.3 | Agent may need to explore |
| No Decisions section | 1.5 | Agent must choose, retry likely |
| Boundaries are tight (2-3 paths) | 1.0 | Clear scope |
| Boundaries are broad (10+ paths) | 1.3 | More surface area for mistakes |
inherits: project with strict constraints |
1.2 | Must satisfy inherited rules too |
| Step text uses quantified assertions | 1.0 | Deterministic test expected |
| Step text uses vague language | 1.5 | Test may not match intent |
Estimation Procedure
Step 1: Read the Contract
agent-spec contract specs/task.spec
Extract: scenario count, decision count, boundary path count, constraint count.
Step 2: Decompose Scenarios into Modules
Each scenario is a potential module. Group related scenarios:
- If 3 scenarios all test the same endpoint → 1 module (implementation) + 1 module (tests)
- If scenarios span different subsystems → separate modules
Step 3: Estimate Rounds per Module
Apply the Scenario Complexity Tiers table. For each module:
base_rounds = sum of scenario base rounds in this module
Step 4: Apply Risk Coefficients
Read the Contract's Decisions and Boundaries. Apply the Risk Coefficient table:
effective_rounds = base_rounds × risk_coefficient
Step 5: Add Integration + Verification Overhead
integration_rounds = 10-15% of base total
verification_rounds = ceil(scenario_count / 3) # ~1 lifecycle run per 3 scenarios
total_rounds = effective_rounds + integration_rounds + verification_rounds
Step 6: Convert to Wallclock Time
wallclock_minutes = total_rounds × 3 # default 3 min/round
Adjust minutes_per_round:
- Fast iteration, agent barely paused: 2 min
- Human reviews each step: 4 min
- Manual testing needed (mobile, hardware): 5 min
Output Format
Always produce this exact structure:
### Estimate: [spec name]
#### Contract Summary
- **Scenarios**: N (H happy + E exception)
- **Decisions**: N fixed choices
- **Boundaries**: N allowed paths, M forbidden rules
- **Inherited constraints**: N
#### Module Breakdown
| # | Module | Scenarios | Base Rounds | Risk | Effective | Notes |
|---|--------|-----------|-------------|------|-----------|-------|
| 1 | ... | S1, S2 | N | 1.x | M | why |
#### Summary
- **Base rounds**: X
- **Integration**: +Y rounds
- **Verification**: +Z rounds (lifecycle retries)
- **Risk-adjusted total**: T rounds
- **Estimated wallclock**: A - B minutes (at N min/round)
#### Risk Factors
1. [specific risk from Contract analysis]
2. [...]
#### Confidence
- HIGH: Contract has specific Decisions, tight Boundaries, quantified steps
- MEDIUM: Some vague areas but overall clear
- LOW: Missing Decisions, broad scope, vague step language
Calibration: Predicted vs Actual
After a task is complete, compare prediction to reality:
agent-spec explain specs/task.spec --history
The retry count from run logs tells you the actual verification rounds. Compare:
predicted_verification_rounds vs actual_retries
If actual > predicted × 1.5 → the spec had hidden complexity. Note this for future calibration.
Batch Estimation for Sprint Planning
To estimate all active specs:
for spec in specs/task-*.spec; do
echo "=== $(basename $spec) ==="
agent-spec contract "$spec" 2>/dev/null | head -20
echo
done
Then apply the estimation procedure to each, and sort by total rounds:
### Sprint Capacity Plan
| Spec | Rounds | Wallclock | Risk | Priority |
|------|--------|-----------|------|----------|
| task-a | 12 | ~36 min | LOW | P0 |
| task-b | 28 | ~84 min | MED | P1 |
| task-c | 45 | ~135 min | HIGH | P2 |
**Total**: 85 rounds ≈ 4.25 hours of agent time
Common Mistakes
| Mistake | Why It's Wrong | Fix |
|---|---|---|
| Estimating by line count | 500 lines of boilerplate ≠ hard | Estimate by scenario complexity |
| Anchoring to human time | "A developer would take 2 weeks" | Start from rounds, convert last |
| Ignoring exception scenarios | They seem simple but add up | Count ALL scenarios, not just happy path |
| Forgetting verification rounds | Agent must run lifecycle N times | Add ceil(scenarios/3) rounds |
| Missing inherited constraints | project.spec adds hidden work | Check inherits: and count parent constraints |
When NOT to Estimate
| Situation | Why | Alternative |
|---|---|---|
No .spec file yet |
Nothing to estimate from | Write the Contract first |
| Spec has lint score < 0.5 | Too vague for reliable estimate | Improve spec quality first |
| Exploratory / vibe coding | No defined "done" | Just start coding, write spec later |