Forge

Delegate complex, multi-step development work to an autonomous agent that builds and verifies code.

When to Use Forge

The task targets a different repo (-C ~/other-project)
The work is complex enough to benefit from autonomous agent execution with verification
You have spec files describing outcomes to implement
You want to run multiple specs in parallel

Running Forge Commands

Forge uses the Agent SDK internally. SDK-invoking commands cannot run inside Claude Code (nested SDK restriction). The CLI will block with a clear error if you try.

SDK commands (run, audit, define, review, proof, verify, specs --check): Build the command and present it to the user in the form shown below. Do NOT execute via Bash.

The forge command to run:

  forge run --spec-dir specs/ "implement all"

Run this in your forge tmux pane. I can run `forge watch` here for live progress.

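Non-SDK commands (specs without --check, status, stats, watch): Run directly via Bash. They do not invoke the SDK, and apart from the specs flags that edit the manifest (--add, --resolve, --unresolve, --reconcile, --prune) they are read-only.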

After presenting an SDK command, offer to run forge watch for live tailing (it's read-only and works inside Claude Code).
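The routing rule above can be sketched as a small shell helper. This is a minimal sketch built only from the two command lists in this section; `forge_command_mode` is a hypothetical name, not part of forge, and treating unlisted commands as "present" is an assumption made to stay on the safe side.

```shell
# Hypothetical helper (not part of forge): decide whether a forge command
# should be presented to the user ("present") or run directly ("run").
# Usage: forge_command_mode <subcommand> [flags...]
forge_command_mode() {
  sub="$1"
  if [ "$#" -gt 0 ]; then shift; fi
  case "$sub" in
    run|audit|define|review|proof|verify)
      echo present ;;
    specs)
      # specs is non-SDK unless --check is passed.
      for arg in "$@"; do
        if [ "$arg" = "--check" ]; then
          echo present
          return 0
        fi
      done
      echo run ;;
    status|stats|watch)
      echo run ;;
    *)
      # Assumption: anything not listed above is handed to the user.
      echo present ;;
  esac
}

forge_command_mode specs --pending   # prints: run
forge_command_mode specs --check     # prints: present
```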

Commands

forge run

forge run "add auth middleware"                          # Simple task
forge run --spec specs/auth.md "implement this"          # With spec file
forge run --spec-dir ./specs/ "implement all"             # Parallel specs (default)
forge run -C ~/other-repo "fix the login bug"            # Target different repo
forge run --rerun-failed "fix failures"                  # Rerun failed specs
forge run --pending "implement pending"                  # Run only pending specs
forge run --resume <session-id> "continue"               # Resume interrupted session
forge run --plan-only "design API for auth"              # Plan without implementing
forge "quick task"                                       # Shorthand (no 'run')

Important flags:

-s, --spec <path> -- Spec file (shorthand resolves via manifest). Prompt becomes additional context.
-S, --spec-dir <path> -- Directory of specs (shorthand resolves via known dirs). Runs each .md in parallel by default. Already-passed specs are skipped.
--sequential -- Run specs sequentially instead of parallel (default: parallel).
-F, --force -- Re-run all specs including already passed.
-B, --branch <name> -- Run in an isolated git worktree on the named branch. Auto-commits on success, cleans up after.
--concurrency <n> -- Override auto-detected parallelism (default: freeMem/2GB, capped at CPUs).
--sequential-first <n> -- Run first N specs sequentially, then parallelize.
-C, --cwd <path> -- Target repo directory.
-t, --max-turns <n> -- Max turns per spec (default: 250).
-b, --max-budget <usd> -- Max budget in USD per spec.
--plan-only -- Create tasks without implementing.
--dry-run -- Preview tasks and estimate cost without executing.
-v, --verbose -- Full output detail.
-q, --quiet -- Suppress progress output (for CI).
-w, --watch -- Auto-split tmux pane with live logs.
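The --concurrency default (freeMem/2GB, capped at CPUs) can be illustrated with a little arithmetic. This is a sketch of the stated formula only, not forge's actual code; `estimate_concurrency` is a hypothetical name, and the floor of 1 is an assumption the flag description does not state.

```shell
# Hypothetical sketch of the documented default for --concurrency:
# free memory divided by 2 GB, capped at the CPU count.
# Usage: estimate_concurrency <free_mem_mb> <cpu_count>
estimate_concurrency() {
  free_mb="$1"
  cpus="$2"
  n=$(( free_mb / 2048 ))        # one worker per 2 GB of free memory
  if [ "$n" -gt "$cpus" ]; then  # never exceed the CPU count
    n="$cpus"
  fi
  if [ "$n" -lt 1 ]; then        # assumed floor so at least one spec runs
    n=1
  fi
  echo "$n"
}

estimate_concurrency 16384 4   # 16 GB free, 4 CPUs: prints 4
estimate_concurrency 6144 8    # 6 GB free, 8 CPUs: prints 3
```

If the auto-detected value is too aggressive for a machine, passing --concurrency <n> overrides it.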

forge audit

Reviews codebase against specs. Produces new spec files for remaining work — feed them back into forge run --spec-dir.

forge audit specs/                              # Audit all specs in directory
forge audit specs/auth.md                       # Audit a single spec file
forge audit auth.md                             # Shorthand (resolves via manifest)
forge audit specs/ "focus on auth module"       # With additional context
forge audit specs/ -o ./remediation/            # Custom output dir
forge audit specs/ -C ~/target-repo             # Different repo
forge audit specs/ --watch                      # Auto-split tmux pane with live logs
forge audit specs/ --fix                        # Audit-fix loop (audit -> fix -> re-audit)
forge audit specs/ --fix --fix-rounds 5         # Custom max rounds (default: 3)

forge define

Analyzes the codebase and generates outcome spec files from a high-level description. Closes the loop: forge define → forge specs → forge run --spec-dir <spec-dir> "implement".

forge define "build auth system"                # Generate specs in specs/
forge define "add rate limiting" -o specs/api/  # Custom output dir
forge define "refactor database" -C ~/project   # Different repo

forge review

Reviews recent git changes for bugs and quality issues.

forge review                                    # Review main...HEAD
forge review HEAD~5...HEAD                      # Specific range
forge review --dry-run -o findings.md           # Report only, write to file
forge review -C ~/other-repo                    # Different repo

forge proof

Generates real test files from implemented specs. Reads specs + codebase, writes .test.ts files colocated with source (or in the project's test directory), a manual.md human checklist, and a manifest.json. Auto-detects test convention and framework. Supports multiple spec paths. forge prove is a backward-compatible alias.

forge proof specs/feature.md                    # Single spec proof
forge proof specs/                              # All specs in directory
forge proof specs/a.md specs/b.md specs/c.md    # Multiple specific specs
forge proof specs/ -o ./custom-proofs/          # Custom manifest output dir
forge proof specs/ -C ~/other-repo              # Different repo

forge pipeline

Chains define → run → audit → proof → verify into a single automated flow with observable gates. The pipeline process stays alive and polls for gate changes — TUI/MCP approve gates by writing state, not by spawning processes.

forge pipeline "build auth system"                  # Full pipeline
forge pipeline --from run --spec-dir specs/ "go"    # Start at run with existing specs
forge pipeline --gate-all confirm "careful build"   # Pause at every gate
forge pipeline --resume <pipeline-id>               # Resume paused/failed pipeline
forge pipeline status                               # Show current pipeline state

Default gate modes: auto for define→run, run→audit, and proof→verify; confirm for audit→proof. TUI controls: a (advance), s (skip), p (pause), c (cancel).
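As a quick reference, the default gate modes can be written as a lookup. This is an illustrative sketch only; `default_gate_mode` and the `a->b` transition spelling are hypothetical and are not forge syntax.

```shell
# Hypothetical lookup (not part of forge): default gate mode per transition,
# taken from the defaults listed above.
default_gate_mode() {
  case "$1" in
    "define->run"|"run->audit"|"proof->verify") echo auto ;;
    "audit->proof") echo confirm ;;
    *) echo "unknown transition: $1" >&2; return 1 ;;
  esac
}

default_gate_mode "audit->proof"   # prints: confirm
```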

forge watch

Live-tail session logs with colored output. Auto-follows to next session during batch runs; exits after final session or 60s timeout.

forge watch                                     # Watch latest session (auto-follows batch)
forge watch <session-id>                        # Watch specific session (no auto-follow)
forge watch -C ~/other-repo                     # Watch in different repo

forge status

forge status                                    # Latest run
forge status --all                              # All runs
forge status -n 5                               # Last 5 runs
forge status -C ~/other-repo                    # Different repo

forge stats

Aggregate run statistics across all results.

forge stats                                     # Dashboard: runs, cost, success rate
forge stats --by-spec                           # Per-spec breakdown from manifest
forge stats --by-model                          # Per-model breakdown
forge stats --since 2026-03-01                  # Filter runs after date
forge stats -C ~/other-repo                     # Different repo

forge specs

List tracked specs with lifecycle status. Specs are registered in .forge/specs.json as they're run.

forge specs                                     # List all tracked specs
forge specs --pending                           # Show only pending
forge specs --failed                            # Show only failed
forge specs --passed                            # Show only passed
forge specs --orphaned                          # Manifest entries with missing files
forge specs --untracked                         # .md files not in manifest
forge specs --add                               # Register all untracked specs
forge specs --add specs/new.md                  # Register specific spec by path/glob
forge specs --resolve game.md                   # Mark spec as passed without running
forge specs --unresolve game.md                 # Reset a spec back to pending
forge specs --check                             # Triage pending specs: auto-resolve already-implemented ones via Sonnet agent
forge specs --reconcile                         # Backfill from .forge/results/ history
forge specs --prune                             # Remove orphaned entries from manifest
forge specs --summary                           # Directory-level roll-up (compact view)

Important

Never manually orchestrate parallel forge runs (e.g. forge run spec1.md & forge run spec2.md & wait). Forge handles parallelism, dependency ordering, and skip-passed internally via --spec-dir. Manual orchestration bypasses the dependency graph, manifest tracking, and batch grouping.

Always prefer --spec-dir over running individual specs. It automatically:

Skips already-passed specs (use --force to override)
Resolves depends: frontmatter into a topological execution order
Tracks all specs in a single batch with grouped cost reporting
Auto-tunes concurrency based on available memory

--spec takes exactly one file. The prompt is always the last positional argument (a quoted string). Bare file paths without a flag are interpreted as the prompt, not as spec files. To run multiple specs, use --spec-dir.

Shorthand resolution: spec paths resolve automatically. forge run --spec login.md finds the spec via manifest lookup. forge run --spec-dir gtmeng-580 finds .bonfire/specs/gtmeng-580/. Full paths always work too.

Pipeline is autonomous end-to-end. forge pipeline (or forge_pipeline_start via MCP) runs define → run → audit → proof → verify automatically. When you start a pipeline:

Wait for it to complete by polling with forge_task.
Do NOT commit, push, or create PRs while the pipeline is running: it owns the repo during execution, and the verify stage creates a PR automatically when the pipeline completes.
Do NOT run individual stages (e.g. forge audit, forge proof) after a pipeline; it already ran them.
Only use individual commands when you are NOT using pipeline.

Common Mistakes

# WRONG: bare paths without --spec are treated as the prompt string
forge run specs/auth.md specs/login.md

# WRONG: --spec only takes one file, not multiple
forge run --spec specs/auth.md specs/login.md "implement"

# WRONG: manually running specs one-by-one bypasses dependency ordering
forge run --spec specs/01-schema.md "implement" && forge run --spec specs/02-api.md "implement"

# RIGHT: put specs in a directory, use --spec-dir
forge run --spec-dir specs/ "implement all"

# RIGHT: single spec with --spec
forge run --spec specs/auth.md "implement this"

Recipes

Run all specs in a directory

forge run --spec-dir gtmeng-580 -C ~/dev/project "implement all"
# Shorthand paths resolve automatically (gtmeng-580 → .bonfire/specs/gtmeng-580/)
# Already-passed specs are skipped; deps on passed specs are treated as satisfied

Run a subset of specs from a directory

If specs use depends: frontmatter, --spec-dir automatically runs only the ready ones in topological order. Dependent specs wait for their deps to pass — no need to cherry-pick individual specs.

# Specs declare their dependencies:
#   03-api.md has "depends: [01-schema.md, 02-models.md]"
# Forge resolves the graph — 01 and 02 run in parallel, 03 waits for both
forge run --spec-dir specs/feature/ "implement all"

# Already-passed specs are skipped automatically; their dependents still run
# Use --force to re-run everything including passed specs
forge run --spec-dir specs/feature/ --force "re-verify all"

Spec-driven development

# 1. Write specs as .md files (see references/writing-specs.md)
# 2. Run them in parallel
forge run --spec-dir ./specs/ "implement all specs"
# 3. Rerun any failures
forge run --rerun-failed "fix failures"
# 4. Check results
forge status

Triage pending specs

# See what's pending
forge specs --pending
# Auto-resolve specs that are already implemented in the codebase
forge specs --check
# Run whatever is still pending
forge run --pending "implement remaining"

Dependency-aware execution

Specs can declare dependencies via YAML frontmatter. Independent specs run in parallel, dependent specs wait:

---
depends: [01-database-schema.md, 02-api-models.md]
---

forge run --spec-dir ./specs/ "implement all"
# Automatically runs in topological order based on depends: declarations
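To see what a topological order looks like for a small dependency graph, the standard `tsort` utility gives a rough picture. This is only an illustration of the ordering idea, not how forge resolves depends: internally; `spec_order` and the dependent spec name 03-api.md are hypothetical.

```shell
# Sketch only: compute a dependency-respecting order with the standard
# `tsort` utility. Each input line is "<dependency> <dependent>".
# 03-api.md is a hypothetical spec that depends on the two files named
# in the frontmatter example above.
spec_order() {
  printf '%s\n' \
    "01-database-schema.md 03-api.md" \
    "02-api-models.md 03-api.md" |
    tsort
}

spec_order
# 03-api.md is printed last; the relative order of the first two may vary,
# which matches the idea that independent specs can run in parallel.
```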

Foundation specs first, then parallelize

When not using depends:, prefix spec filenames with numbers to control ordering. --sequential-first <n> runs the first N foundation specs sequentially before the parallel phase:

forge run --spec-dir ./specs/ --sequential-first 2 "implement"
# Runs 01-*.md, 02-*.md sequentially, then 03+ in parallel
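The split between the sequential head and the parallel tail can be sketched as follows. This assumes specs are ordered by filename, which the number-prefix advice suggests but the text does not state outright; `split_specs` and the spec names are hypothetical.

```shell
# Hypothetical sketch: split a name-sorted spec list into a sequential
# head of N specs and a parallel tail, as described for --sequential-first.
# Usage: split_specs <n> <spec>...
split_specs() {
  n="$1"
  shift
  sorted="$(printf '%s\n' "$@" | sort)"
  # First N specs, joined onto one line.
  echo "sequential: $(printf '%s\n' "$sorted" | head -n "$n" | paste -sd ' ' -)"
  # Everything from spec N+1 onward.
  echo "parallel: $(printf '%s\n' "$sorted" | tail -n "+$(( n + 1 ))" | paste -sd ' ' -)"
}

split_specs 2 03-api.md 01-schema.md 04-ui.md 02-models.md
# sequential: 01-schema.md 02-models.md
# parallel: 03-api.md 04-ui.md
```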

Audit-then-fix loop

# Manual: audit then run remediation specs
forge audit specs/ -C ~/project                 # Find gaps
forge run --spec-dir specs/audit/ -C ~/project "fix remaining"

# Automated: convergence loop (audit -> fix -> re-audit, up to 3 rounds)
forge audit specs/ --fix -C ~/project
forge audit specs/ --fix --fix-rounds 5         # More rounds if needed
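The shape of a bounded convergence loop can be sketched with stubs. This is a guess at the control flow implied by "audit -> fix -> re-audit, up to N rounds", not forge's implementation; `audit_gaps`, `fix_gaps`, and `converge` are hypothetical names, and the stubs simply simulate one gap being closed per round.

```shell
# Hypothetical sketch of a bounded audit -> fix -> re-audit loop.
# audit_gaps and fix_gaps are stand-ins for the real audit and fix steps.
gaps=2
audit_gaps() { echo "$gaps"; }               # stub: number of open gaps
fix_gaps()   { gaps=$(( gaps - 1 )); }       # stub: closes one gap per round

converge() {
  max_rounds="${1:-3}"   # 3 mirrors the documented default for --fix-rounds
  round=0
  while [ "$(audit_gaps)" -gt 0 ]; do
    if [ "$round" -ge "$max_rounds" ]; then
      echo "gaps remain after $round fix rounds"
      return 1
    fi
    fix_gaps
    round=$(( round + 1 ))
  done
  echo "converged after $round fix rounds"
}

converge 3   # prints: converged after 2 fix rounds
```

Raising --fix-rounds corresponds to raising `max_rounds` here: more attempts before the loop gives up with gaps still open.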

Resume or fork after interruption

forge run --resume <session-id> "continue"               # Pick up where you left off
forge run --fork <session-id> "try different approach"    # Branch from that point

Deep-Dive References

Reference	Load when
writing-specs.md	Writing spec files for forge to execute
parallel-execution.md	Tuning concurrency, understanding cost, monitoring parallel runs