Harness Engineering Playbook
Use this skill to operationalize the practices from OpenAI's Harness Engineering guide in a repo that agents can run against repeatedly and safely.
What To Load
- Use references/openai-harness-practices.md for the full practice-to-artifact mapping.
- Use references/rollout-checklist.md for phased adoption in active repos.
- Use references/wizard-cli.md for Typer wizard command flows.
- Use assets/templates/ when creating or updating harness files.
Inputs
- Target repository path.
- Existing command surface (make, npm, cargo, pytest, etc.).
- Existing CI workflows and branch protections.
Workflow
- Baseline the repo and detect existing workflows.
- Bootstrap harness artifacts and templates.
- Apply all nine Harness Engineering practices.
- Run harness audit checks and repair gaps.
- Iterate after real agent runs.
Step 1: Baseline The Repo
- Identify language/toolchain and canonical entrypoints.
- Inventory existing checks, scripts, and CI jobs.
- Record current pain points for agent runs: setup drift, unclear docs, flaky tests, missing trace IDs, slow loops.
Use a short baseline note inside PLANS.md so decisions remain durable.
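Part of the inventory above can be automated. The following is a minimal sketch, assuming toolchains can be recognized by well-known marker files at the repo root. The names MARKERS, detect_toolchains, and list_ci_workflows are hypothetical and are not part of the skill's scripts.

```python
from pathlib import Path

# Hypothetical marker-file map; extend it for the toolchains you actually use.
MARKERS = {
    "Makefile": "make",
    "package.json": "npm",
    "Cargo.toml": "cargo",
    "pyproject.toml": "python",
    "go.mod": "go",
}


def detect_toolchains(repo_path: str) -> list[str]:
    """Return the toolchains whose marker file exists at the repo root."""
    root = Path(repo_path)
    found = {tool for marker, tool in MARKERS.items() if (root / marker).is_file()}
    return sorted(found)


def list_ci_workflows(repo_path: str) -> list[str]:
    """Return workflow file names under .github/workflows, if any."""
    workflows = Path(repo_path) / ".github" / "workflows"
    if not workflows.is_dir():
        return []
    return sorted(p.name for p in workflows.iterdir() if p.suffix in {".yml", ".yaml"})
```

The output of both functions can be pasted into the baseline note in PLANS.md; pain points such as flaky tests still need to be recorded by hand.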
Step 2: Bootstrap Harness Artifacts
Preferred entrypoint:
python3 scripts/harness_wizard.py init <repo-path> --profile control
Profiles:
- baseline: only core harness artifacts.
- control: baseline + control-system primitives.
- full: control + entropy controls (nightly audit + entropy checks).
Direct shell fallback:
./scripts/bootstrap_harness.sh <repo-path>
This script installs safe defaults from assets/templates/:
- AGENTS.md
- PLANS.md
- docs/ARCHITECTURE.md
- docs/OBSERVABILITY.md
- Makefile.harness (+ -include Makefile.harness in Makefile)
- scripts/audit_harness.sh
- scripts/harness/{smoke,test,lint,typecheck}.sh
- .github/workflows/harness.yml
By default, existing files are not overwritten. Pass --force to replace template-managed files.
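The no-overwrite default can be illustrated in a few lines. This is a sketch of the policy only, not the actual logic of bootstrap_harness.sh; install_template and its return values are hypothetical.

```python
import shutil
from pathlib import Path


def install_template(src: Path, dst: Path, force: bool = False) -> str:
    """Copy src to dst, leaving existing files alone unless force is set."""
    if dst.exists() and not force:
        # Default behavior: never clobber a file that is already there.
        return "skipped"
    status = "replaced" if dst.exists() else "created"
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return status
```

Returning a status string rather than printing keeps the function easy to test and lets the caller summarize what changed.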
Step 3: Apply The Nine Practices
Implement each practice directly in repo artifacts.
1. Make Hard Things Easy To Do
- Ensure hard, high-value tasks are one command away (make smoke, make check, make ci).
- Keep setup and cleanup scripted.
- Make smoke checks cheap enough for frequent use.
2. Communicate Actionable Constraints With Compact Docs
- Keep AGENTS.md short, concrete, and command-first.
- Document non-obvious constraints and guardrails.
- Keep docs close to code and update with behavior changes.
3. Structure Codebase With Strict Boundaries And Flow
- Define module boundaries in docs/ARCHITECTURE.md.
- Parse and validate data at boundaries; use typed contracts for internal flow.
- Prefer one abstraction per module and one clear ownership path.
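Parsing at the boundary and passing typed values inward might look like the sketch below. RunRequest and parse_run_request are hypothetical names chosen for illustration; the profile names are the ones listed in Step 2.

```python
from dataclasses import dataclass

ALLOWED_PROFILES = {"baseline", "control", "full"}


@dataclass(frozen=True)
class RunRequest:
    """Typed contract used inside the module once input has been validated."""
    repo_path: str
    profile: str


def parse_run_request(raw: dict) -> RunRequest:
    """Validate untrusted input once, at the boundary, and return a typed value."""
    repo_path = raw.get("repo_path")
    profile = raw.get("profile")
    if not isinstance(repo_path, str) or not repo_path:
        raise ValueError("repo_path must be a non-empty string")
    if not isinstance(profile, str) or profile not in ALLOWED_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return RunRequest(repo_path=repo_path, profile=profile)
```

Code behind the boundary accepts only RunRequest, so it never needs to re-check for missing keys or unknown profiles.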
4. Build Observability In From Day 1
- Emit structured logs/events with correlation IDs.
- Capture key transitions in long-running workflows.
- Define minimum observable fields in docs/OBSERVABILITY.md.
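As an illustration of structured events that share a correlation ID, here is a minimal sketch using only the standard library. The field names (ts, event, correlation_id) are assumptions, not the fields prescribed by docs/OBSERVABILITY.md.

```python
import json
import time
import uuid


def new_correlation_id() -> str:
    """Create one ID per agent run so related events can be grouped later."""
    return uuid.uuid4().hex


def format_event(event: str, correlation_id: str, **fields) -> str:
    """Serialize one event as a single JSON line."""
    record = {"ts": time.time(), "event": event, "correlation_id": correlation_id}
    record.update(fields)
    return json.dumps(record, sort_keys=True)


def log_event(event: str, correlation_id: str, **fields) -> None:
    """Emit the event on stdout, one JSON object per line."""
    print(format_event(event, correlation_id, **fields))
```

Calling log_event("step.start", cid, step="lint") and log_event("step.end", cid, step="lint", ok=True) with the same cid records a key transition in a way that can be filtered by run.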
5. Optimize For Agent Flow, Not Human Flow
- Treat context as a first-class system dependency.
- Use PLANS.md for multi-step/multi-hour tasks.
- Front-load durable context (scope, constraints, checkpoints) so restarts stay cheap.
6. Bring Your Own Harness
- Standardize repo-local wrappers (Makefile.harness, scripts/harness/).
- Wrap local infra actions in deterministic scripts.
- Make agent behavior reproducible across machines and runs.
7. Prototype In Natural Language First
- Draft logic and tests in prose before coding.
- Review edge cases in prose and lock acceptance criteria.
- Translate approved prose into code and tests.
8. Invest In Static Analysis And Linting
- Pin formatter/linter/typechecker versions where practical.
- Enforce checks in both local workflow and CI.
- Run static checks before long tests to shorten failure loops.
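Ordering checks from cheapest to slowest and stopping at the first failure can be sketched as follows. run_checks is a hypothetical helper; HARNESS_CHECKS reuses the script paths installed in Step 2 and assumes they are executable.

```python
import subprocess

# Cheapest checks first, so a lint error surfaces before the long test run.
HARNESS_CHECKS = [
    ("lint", ["scripts/harness/lint.sh"]),
    ("typecheck", ["scripts/harness/typecheck.sh"]),
    ("test", ["scripts/harness/test.sh"]),
]


def run_checks(checks: list[tuple[str, list[str]]]) -> tuple[bool, list[str]]:
    """Run checks in order and stop at the first non-zero exit code.

    Returns (all_passed, names_of_checks_that_ran).
    """
    ran = []
    for name, cmd in checks:
        ran.append(name)
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return False, ran
    return True, ran
```

Calling run_checks(HARNESS_CHECKS) from the repo root gives the same fail-fast ordering locally and in CI.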
9. Manage Entropy
- Add periodic audits for docs drift, flaky checks, and dead scripts.
- Keep templates synchronized with real workflows.
- Remove stale abstractions quickly to keep agent context clean.
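One narrow form of docs drift is a document that mentions a make target that no longer exists. The sketch below is a rough heuristic, not the skill's audit: it assumes commands in docs are written as inline code (opening backtick followed by make), and its regex handles only simple target definitions.

```python
import re

# A line that starts with "name:" but not "name :=" (a variable assignment).
TARGET_DEF = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)", re.MULTILINE)
# Only count references written as inline code, to skip prose like "make sure".
TARGET_REF = re.compile(r"`make\s+([A-Za-z0-9_.-]+)")


def defined_targets(makefile_text: str) -> set[str]:
    """Return target names defined in the Makefile text."""
    return set(TARGET_DEF.findall(makefile_text))


def stale_make_references(doc_text: str, makefile_text: str) -> list[str]:
    """Return make targets mentioned in the doc but not defined in the Makefile."""
    referenced = set(TARGET_REF.findall(doc_text))
    return sorted(referenced - defined_targets(makefile_text))
```

Running this over AGENTS.md against the combined text of Makefile and Makefile.harness in a periodic job would flag renamed or deleted targets before an agent trips over them.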
For a detailed artifact matrix, load references/openai-harness-practices.md.
Step 4: Validate
Run:
python3 scripts/harness_wizard.py audit <repo-path>
Treat any MISSING or FAIL result as blocking before calling harness setup complete.
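The blocking rule can be enforced mechanically, for example in CI. The sketch below assumes, purely for illustration, that the audit prints one result per line with a status word such as OK, MISSING, or FAIL; the real output format of harness_wizard.py is not specified here, and blocking_results is a hypothetical helper.

```python
import re

# Match the status words as whole words only (so "FAILED" would not match).
BLOCKING = re.compile(r"\b(?:MISSING|FAIL)\b")


def blocking_results(audit_output: str) -> list[str]:
    """Return the audit lines that should block completion of harness setup."""
    lines = [line.strip() for line in audit_output.splitlines()]
    return [line for line in lines if BLOCKING.search(line)]
```

A wrapper script could exit non-zero whenever blocking_results returns a non-empty list.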
Step 5: Iterate On Real Runs
- Observe one full agent run from clean checkout to merged change.
- Patch harness gaps immediately.
- Re-run audit.
- Keep AGENTS.md, PLANS.md, and architecture docs aligned with current behavior.
Adaptation Rules
- Preserve existing project conventions and replace templates incrementally.
- Do not overwrite user-authored files without explicit approval.
- Keep command names stable; change internals behind wrappers.
- Favor deterministic, scriptable workflows over ad-hoc interactive steps.
Repository: broomva/harness…ng-skill
First Seen: Feb 17, 2026