forge
Forge
Delegate complex, multi-step development work to an autonomous agent that builds and verifies code.
When to Use Forge
- The task targets a different repo (
-C ~/other-project) - The work is complex enough to benefit from autonomous agent execution with verification
- You have spec files describing outcomes to implement
- You want to run multiple specs in parallel
Running Forge Commands
Forge uses the Agent SDK internally. SDK-invoking commands cannot run inside Claude Code (nested SDK restriction). The CLI will block with a clear error if you try.
SDK commands (run, audit, define, review, proof, verify, specs --check): Build the command and present it to the user. Do NOT execute via Bash.
The forge command to run:
forge run --spec-dir specs/ "implement all"
Run this in your forge tmux pane. I can run `forge watch` here for live progress.
Non-SDK commands (specs, status, stats, watch): Run directly via Bash — these are safe, read-only operations.
After presenting an SDK command, offer to run forge watch for live tailing (it's read-only and works inside Claude Code).
Commands
forge run
forge run "add auth middleware" # Simple task
forge run --spec specs/auth.md "implement this" # With spec file
forge run --spec-dir ./specs/ "implement all" # Parallel specs (default)
forge run -C ~/other-repo "fix the login bug" # Target different repo
forge run --rerun-failed "fix failures" # Rerun failed specs
forge run --pending "implement pending" # Run only pending specs
forge run --resume <session-id> "continue" # Resume interrupted session
forge run --plan-only "design API for auth" # Plan without implementing
forge "quick task" # Shorthand (no 'run')
Important flags:
-s, --spec <path>-- Spec file (shorthand resolves via manifest). Prompt becomes additional context.-S, --spec-dir <path>-- Directory of specs (shorthand resolves via known dirs). Runs each.mdin parallel by default. Already-passed specs are skipped.--sequential-- Run specs sequentially instead of parallel (default: parallel).-F, --force-- Re-run all specs including already passed.-B, --branch <name>-- Run in an isolated git worktree on the named branch. Auto-commits on success, cleans up after.--concurrency <n>-- Override auto-detected parallelism (default: freeMem/2GB, capped at CPUs).--sequential-first <n>-- Run first N specs sequentially, then parallelize.-C, --cwd <path>-- Target repo directory.-t, --max-turns <n>-- Max turns per spec (default: 250).-b, --max-budget <usd>-- Max budget in USD per spec.--plan-only-- Create tasks without implementing.--dry-run-- Preview tasks and estimate cost without executing.-v, --verbose-- Full output detail.-q, --quiet-- Suppress progress output (for CI).-w, --watch-- Auto-split tmux pane with live logs.
forge audit
Reviews codebase against specs. Produces new spec files for remaining work — feed them back into forge run --spec-dir.
forge audit specs/ # Audit all specs in directory
forge audit specs/auth.md # Audit a single spec file
forge audit auth.md # Shorthand (resolves via manifest)
forge audit specs/ "focus on auth module" # With additional context
forge audit specs/ -o ./remediation/ # Custom output dir
forge audit specs/ -C ~/target-repo # Different repo
forge audit specs/ --watch # Auto-split tmux pane with live logs
forge audit specs/ --fix # Audit-fix loop (audit -> fix -> re-audit)
forge audit specs/ --fix --fix-rounds 5 # Custom max rounds (default: 3)
forge define
Analyzes codebase and generates outcome spec files from a high-level description. Closes the loop: forge define → forge specs → forge <spec-dir> "implement".
forge define "build auth system" # Generate specs in specs/
forge define "add rate limiting" -o specs/api/ # Custom output dir
forge define "refactor database" -C ~/project # Different repo
forge review
Reviews recent git changes for bugs and quality issues.
forge review # Review main...HEAD
forge review HEAD~5...HEAD # Specific range
forge review --dry-run -o findings.md # Report only, write to file
forge review -C ~/other-repo # Different repo
forge proof
Generates real test files from implemented specs. Reads specs + codebase, writes .test.ts files colocated with source (or in the project's test directory), a manual.md human checklist, and a manifest.json. Auto-detects test convention and framework. Supports multiple spec paths. forge prove is a backward-compatible alias.
forge proof specs/feature.md # Single spec proof
forge proof specs/ # All specs in directory
forge proof specs/a.md specs/b.md specs/c.md # Multiple specific specs
forge proof specs/ -o ./custom-proofs/ # Custom manifest output dir
forge proof specs/ -C ~/other-repo # Different repo
forge pipeline
Chains define → run → audit → proof → verify into a single automated flow with observable gates. The pipeline process stays alive and polls for gate changes — TUI/MCP approve gates by writing state, not by spawning processes.
forge pipeline "build auth system" # Full pipeline
forge pipeline --from run --spec-dir specs/ "go" # Start at run with existing specs
forge pipeline --gate-all confirm "careful build" # Pause at every gate
forge pipeline --resume <pipeline-id> # Resume paused/failed pipeline
forge pipeline status # Show current pipeline state
Gates default to: auto (define→run, run→audit, proof→verify), confirm (audit→proof). TUI controls: a advance, s skip, p pause, c cancel.
forge watch
Live-tail session logs with colored output. Auto-follows to next session during batch runs; exits after final session or 60s timeout.
forge watch # Watch latest session (auto-follows batch)
forge watch <session-id> # Watch specific session (no auto-follow)
forge watch -C ~/other-repo # Watch in different repo
forge status
forge status # Latest run
forge status --all # All runs
forge status -n 5 # Last 5 runs
forge status -C ~/other-repo # Different repo
forge stats
Aggregate run statistics across all results.
forge stats # Dashboard: runs, cost, success rate
forge stats --by-spec # Per-spec breakdown from manifest
forge stats --by-model # Per-model breakdown
forge stats --since 2026-03-01 # Filter runs after date
forge stats -C ~/other-repo # Different repo
forge specs
List tracked specs with lifecycle status. Specs are registered in .forge/specs.json as they're run.
forge specs # List all tracked specs
forge specs --pending # Show only pending
forge specs --failed # Show only failed
forge specs --passed # Show only passed
forge specs --orphaned # Manifest entries with missing files
forge specs --untracked # .md files not in manifest
forge specs --add # Register all untracked specs
forge specs --add specs/new.md # Register specific spec by path/glob
forge specs --resolve game.md # Mark spec as passed without running
forge specs --unresolve game.md # Reset a spec back to pending
forge specs --check # Triage pending specs: auto-resolve already-implemented ones via Sonnet agent
forge specs --reconcile # Backfill from .forge/results/ history
forge specs --prune # Remove orphaned entries from manifest
forge specs --summary # Directory-level roll-up (compact view)
Important
Never manually orchestrate parallel forge runs (e.g. forge run spec1.md & forge run spec2.md & wait). Forge handles parallelism, dependency ordering, and skip-passed internally via --spec-dir. Manual orchestration bypasses the dependency graph, manifest tracking, and batch grouping.
Always prefer --spec-dir over running individual specs. It automatically:
- Skips already-passed specs (use
--forceto override) - Resolves
depends:frontmatter into a topological execution order - Tracks all specs in a single batch with grouped cost reporting
- Auto-tunes concurrency based on available memory
--spec takes exactly one file. The prompt is always the last positional argument (a quoted string). Bare file paths without a flag are interpreted as the prompt, not as spec files. To run multiple specs, use --spec-dir.
Shorthand resolution: spec paths resolve automatically. forge run --spec login.md finds the spec via manifest lookup. forge run --spec-dir gtmeng-580 finds .bonfire/specs/gtmeng-580/. Full paths always work too.
Pipeline is autonomous end-to-end. forge pipeline (or forge_pipeline_start via MCP) runs define → run → audit → proof → verify automatically. When you start a pipeline:
- Wait for it to complete by polling with
forge_task. Do NOT commit, push, or create PRs while the pipeline is still running — it owns the repo during execution. - Do NOT run individual stages (e.g.
forge audit,forge proof) after a pipeline — it already ran them. - Do NOT commit, push, or create PRs while a pipeline is running — the verify stage creates a PR automatically when the pipeline completes.
- Only use individual commands when you are NOT using pipeline.
Common Mistakes
# WRONG: bare paths without --spec are treated as the prompt string
forge run specs/auth.md specs/login.md
# WRONG: --spec only takes one file, not multiple
forge run --spec specs/auth.md specs/login.md "implement"
# WRONG: manually running specs one-by-one bypasses dependency ordering
forge run --spec specs/01-schema.md "implement" && forge run --spec specs/02-api.md "implement"
# RIGHT: put specs in a directory, use --spec-dir
forge run --spec-dir specs/ "implement all"
# RIGHT: single spec with --spec
forge run --spec specs/auth.md "implement this"
Recipes
Run all specs in a directory
forge run --spec-dir gtmeng-580 -C ~/dev/project "implement all"
# Shorthand paths resolve automatically (gtmeng-580 → .bonfire/specs/gtmeng-580/)
# Already-passed specs are skipped; deps on passed specs are treated as satisfied
Run a subset of specs from a directory
If specs use depends: frontmatter, --spec-dir automatically runs only the ready ones in topological order. Dependent specs wait for their deps to pass — no need to cherry-pick individual specs.
# Specs declare their dependencies:
# 03-api.md has "depends: [01-schema.md, 02-models.md]"
# Forge resolves the graph — 01 and 02 run in parallel, 03 waits for both
forge run --spec-dir specs/feature/ "implement all"
# Already-passed specs are skipped automatically; their dependents still run
# Use --force to re-run everything including passed specs
forge run --spec-dir specs/feature/ --force "re-verify all"
Spec-driven development
# 1. Write specs as .md files (see references/writing-specs.md)
# 2. Run them in parallel
forge run --spec-dir ./specs/ "implement all specs"
# 3. Rerun any failures
forge run --rerun-failed "fix failures"
# 4. Check results
forge status
Triage pending specs
# See what's pending
forge specs --pending
# Auto-resolve specs that are already implemented in the codebase
forge specs --check
# Run whatever is still pending
forge run --pending "implement remaining"
Dependency-aware execution
Specs can declare dependencies via YAML frontmatter. Independent specs run in parallel, dependent specs wait:
---
depends: [01-database-schema.md, 02-api-models.md]
---
forge run --spec-dir ./specs/ "implement all"
# Automatically runs in topological order based on depends: declarations
Foundation specs first, then parallelize
When not using depends:, number-prefix specs for ordering. Foundations run sequentially before the parallel phase:
forge run --spec-dir ./specs/ --sequential-first 2 "implement"
# Runs 01-*.md, 02-*.md sequentially, then 03+ in parallel
Audit-then-fix loop
# Manual: audit then run remediation specs
forge audit specs/ -C ~/project # Find gaps
forge run --spec-dir specs/audit/ -C ~/project "fix remaining"
# Automated: convergence loop (audit -> fix -> re-audit, up to 3 rounds)
forge audit specs/ --fix -C ~/project
forge audit specs/ --fix --fix-rounds 5 # More rounds if needed
Resume or fork after interruption
forge run --resume <session-id> "continue" # Pick up where you left off
forge run --fork <session-id> "try different approach" # Branch from that point
Deep-Dive References
| Reference | Load when |
|---|---|
| writing-specs.md | Writing spec files for forge to execute |
| parallel-execution.md | Tuning concurrency, understanding cost, monitoring parallel runs |