# Adversarial Review
Spawn reviewers on the opposite model to challenge work. Reviewers attack from distinct lenses grounded in brain principles. The deliverable is a synthesized verdict — do NOT make changes.
**Hard constraint:** Reviewers MUST run via the opposite model's CLI (`codex exec` or `claude -p`). Do NOT use subagents, the Agent tool, or any internal delegation mechanism as reviewers; those run on your own model, which defeats the purpose.
## Step 1 — Load Principles
Read brain/principles.md. Follow every [[wikilink]] and read each linked principle file.
These govern reviewer judgments.
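One way to surface those linked files, as a minimal sketch assuming standard [[wikilink]] syntax and that each link resolves to a markdown file under `brain/` (the path layout is an assumption, not something this skill defines):

```bash
# Sketch: list the principle files referenced from brain/principles.md.
# Assumes each [[wikilink]] target maps to brain/<target>.md; adjust to the real layout.
grep -oE '\[\[[^]]+\]\]' brain/principles.md \
  | sed 's/^\[\[//; s/\]\]$//' \
  | while read -r link; do
      echo "brain/${link}.md"
    done
```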
## Step 2 — Determine Scope and Intent
Identify what to review from context (recent diffs, referenced plans, user message).
Determine the intent — what the author is trying to achieve. This is critical: reviewers challenge whether the work achieves the intent well, not whether the intent is correct. State the intent explicitly before proceeding.
Assess change size:
| Size | Threshold | Reviewers |
|---|---|---|
| Small | < 50 lines, 1–2 files | 1 (Skeptic) |
| Medium | 50–200 lines, 3–5 files | 2 (Skeptic + Architect) |
| Large | 200+ lines or 5+ files | 3 (Skeptic + Architect + Minimalist) |
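As a rough way to get these numbers, assuming the work under review is the uncommitted diff in the working tree (swap in a commit range or branch if the scope is different):

```bash
# Sketch: rough size assessment from the working-tree diff.
# Replace `git diff` with e.g. `git diff main...HEAD` for branch reviews.
FILES=$(git diff --name-only | wc -l)
LINES=$(git diff --numstat | awk '{sum += $1 + $2} END {print sum + 0}')
echo "files=$FILES changed_lines=$LINES"
```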
Read references/reviewer-lenses.md for lens definitions.
## Step 3 — Detect Model and Spawn Reviewers
Create a temp directory for reviewer output:

```bash
REVIEW_DIR=$(mktemp -d /tmp/adversarial-review.XXXXXX)
```
Determine which model you are, then spawn reviewers on the opposite:
If you are Claude → spawn Codex reviewers via `codex exec`:

```bash
codex exec --skip-git-repo-check -o "$REVIEW_DIR/skeptic.md" "prompt" 2>/dev/null
```

Use `--profile edit` only if the reviewer needs to run tests; default to read-only. Run with `run_in_background: true` and monitor via TaskOutput with `block: true`, `timeout: 600000`.
If you are Codex → spawn Claude reviewers via the `claude` CLI:

```bash
claude -p "prompt" > "$REVIEW_DIR/skeptic.md" 2>/dev/null
```

Run with `run_in_background: true`.
Name each output file after the lens: skeptic.md, architect.md, minimalist.md.
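Putting this together, a minimal sketch of the spawn loop on the Codex side, assuming one pre-assembled prompt file per lens under `$REVIEW_DIR` (the `.prompt` files are illustrative; the actual prompt contents follow the template below). Shell backgrounding is shown only for illustration; in practice each call runs through the Bash tool with `run_in_background: true` as described above.

```bash
# Sketch: one reviewer per lens, reusing the codex invocation shown above.
# Trim the lens list to match the change size from Step 2.
# For Claude reviewers, swap in: claude -p "$(cat "$REVIEW_DIR/${lens}.prompt")" > "$REVIEW_DIR/${lens}.md"
for lens in skeptic architect minimalist; do
  codex exec --skip-git-repo-check \
    -o "$REVIEW_DIR/${lens}.md" \
    "$(cat "$REVIEW_DIR/${lens}.prompt")" 2>/dev/null &
done
wait
```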
### Reviewer prompt template
Each reviewer gets a single prompt containing:
- The stated intent (from Step 2)
- Their assigned lens (full text from references/reviewer-lenses.md)
- The principles relevant to their lens (file contents, not summaries)
- The code or diff to review
- Instructions: "You are an adversarial reviewer. Your job is to find real problems, not validate the work. Be specific — cite files, lines, and concrete failure scenarios. Rate each finding: high (blocks ship), medium (should fix), low (worth noting). Write findings as a numbered markdown list to your output file."
Spawn all reviewers in parallel.
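What one assembled prompt might look like, sketched for the Skeptic; the file names (`intent.md`, `diff.patch`) and the principle path are placeholders rather than anything this skill fixes:

```bash
# Sketch: assemble the Skeptic's prompt from the pieces listed above.
{
  echo "You are an adversarial reviewer. Your job is to find real problems, not validate the work."
  echo
  echo "## Intent"
  cat "$REVIEW_DIR/intent.md"          # the intent stated in Step 2 (placeholder file)
  echo
  echo "## Your lens: Skeptic"
  cat references/reviewer-lenses.md    # or only the Skeptic section, if you extract it
  echo
  echo "## Relevant principles"
  cat brain/principles/*.md            # full file contents, not summaries (path assumed)
  echo
  echo "## Code under review"
  cat "$REVIEW_DIR/diff.patch"         # the diff captured in Step 2 (placeholder file)
  echo
  echo "Be specific: cite files, lines, and concrete failure scenarios."
  echo "Rate each finding: high (blocks ship), medium (should fix), low (worth noting)."
  echo "Write findings as a numbered markdown list to your output file."
} > "$REVIEW_DIR/skeptic.prompt"
```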
## Step 4 — Verify and Synthesize Verdict
Before reading reviewer output, log which CLI was used and confirm the output files exist:

```bash
echo "reviewer_cli=codex|claude"
ls "$REVIEW_DIR"/*.md
```
If any output file is missing or empty, note the failure in the verdict — do not silently skip a reviewer.
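A quick way to catch that failure mode, sketched for the three-lens case (trim the list to the lenses actually spawned):

```bash
# Sketch: confirm every spawned reviewer wrote a non-empty report.
for lens in skeptic architect minimalist; do
  f="$REVIEW_DIR/${lens}.md"
  if [ ! -s "$f" ]; then
    echo "WARNING: missing or empty reviewer output: $f"
  fi
done
```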
Read each reviewer's output file from $REVIEW_DIR/. Deduplicate overlapping findings.
Produce a single verdict:
```markdown
## Intent
<what the author is trying to achieve>

## Verdict: PASS | CONTESTED | REJECT
<one-line summary>

## Findings
<numbered list, ordered by severity (high → medium → low)>

For each finding:
- **[severity]** Description with file:line references
- Lens: which reviewer raised it
- Principle: which brain principle it maps to
- Recommendation: concrete action, not vague advice

## What Went Well
<1–3 things the reviewers found no issue with — acknowledge good work>
```
Verdict logic:
- PASS — no high-severity findings
- CONTESTED — high-severity findings but reviewers disagree on them
- REJECT — high-severity findings with reviewer consensus
## Step 5 — Render Judgment
After synthesizing the reviewers, apply your own judgment. Using the stated intent and brain principles as your frame, state which findings you would accept and which you would reject — and why. Reviewers are adversarial by design; not every finding warrants action. Call out false positives, overreach, and findings that mistake style for substance.
Append to the verdict:
```markdown
## Lead Judgment
<for each finding: accept or reject with a one-line rationale>
```