codex-code-review
Codex Code Review
Trigger
- Keywords: review, PR, code review, second opinion, audit, check
When NOT to Use
- Document review (use
doc-review) - Security-specific review (use
security-review) - Test coverage review (use
test-review) - Just want to understand code (use
code-explore)
Variants
| Variant | Command | Scope | Pre-checks |
|---|---|---|---|
| Fast | /codex-review-fast |
Diff only | None |
| Full | /codex-review |
Diff + local checks | lint:fix + build |
| Branch | /codex-review-branch |
Full branch | None |
Shared Workflow
Step 0 (PENDING) → Collect changes → [Pre-checks if Full] → Dual Review (Codex + Task) → Await Results → Aggregate → Emit Gate → Loop if Blocked
Step 0: Dual Review Init (Fail-closed)
Execute: bash scripts/emit-review-gate.sh PENDING
This sets review_mode=dual and aggregate_gate.executed=false in state file, ensuring fail-closed semantics — if the process crashes before Step 4.5, stop-guard blocks.
Step 1: Collect Change Metadata
Collect metadata only — Codex reads the actual diffs and file contents itself via sandbox access.
| Variant | Collection Method |
|---|---|
| Fast | CHANGED_FILES: git diff --name-only HEAD + DIFF_STAT: git diff --stat HEAD |
| Full | Same as Fast |
| Branch | Same + CURRENT_BRANCH + BASE_BRANCH + COMMIT_COUNT |
Codex independently reads full diffs and file contents via git diff HEAD -- <file> + cat (per research instructions).
Step 1.5: Feature Context & AC Detection (Spec-Driven Review)
Execute: bash scripts/resolve-feature.sh → parse JSON output.
| Field | Use |
|---|---|
has_requests |
Gate: only proceed if true |
docs_path |
Glob for request docs |
confidence |
Require >= medium |
If has_requests=true AND confidence in (high, medium):
- Glob
${docs_path}/requests/*.md, sort descending, take latest - Read latest request doc
- Extract
## Acceptance Criteriasection (parse- [ ]/- [x]items) - Filter out quality-gate ACs matching:
/codex-review-fast,/codex-review-doc,/codex-review,/precommit,/precommit-fast,/pr-review - Cap: max 20 ACs (truncate with "... and N more" note)
- Build
SPEC_CHECKLISTvariable, setREQUEST_DOC_PATH
Graceful degradation: resolve-feature fails / no requests / no AC section / parse error → SPEC_CHECKLIST = null (skip silently).
Step 1.6: Deferred Finding Context
Prerequisite: Requires R4 (Nit History Persistence) hook-side write path. Until R4 is deployed,
.claude_nit_history.jsonwill not exist and this step is a no-op (graceful degradation).
Read .claude_nit_history.json (if exists):
- Filter
.deferred[]entries wherelast_seen + ttl_days> now - Sanitize each entry before injection (mandatory):
canonical_issue<= 120 chars (truncate)- Strip markdown control chars (
**, backticks,#,>,|) - No raw code snippets; only file:line references
- No secrets/tokens/passwords/API keys (per
rules/security.md) - Reject entries with shell metacharacters (
;,&,|, backtick,$()
- Format as
<deferred_context>XML block (max 10 entries) - Store in
DEFERRED_CONTEXTvariable
Inject into all prompt variants after research instructions, before Review Dimensions:
${DEFERRED_CONTEXT ? DEFERRED_CONTEXT : ''}
Format:
<deferred_context>
Previously deferred (do not re-report without new evidence):
- [Nit] src/service.ts | naming convention (deferred 2x)
- [P2] src/utils.ts | error handling pattern (deferred 1x)
</deferred_context>
Graceful degradation: file missing / invalid JSON / no entries / sanitization failure → DEFERRED_CONTEXT = null (skip silently).
Step 2: Pre-checks (Full variant only)
{LINT_FIX_COMMAND}
{BUILD_COMMAND}
These placeholders are resolved from the host project's CLAUDE.md or package.json scripts. Record results as LOCAL_CHECKS.
Step 3: Dual Review (Parallel Dispatch)
Case A: First review (no --continue)
Launch two reviewers in parallel (single message, multiple tool calls):
-
Codex MCP (primary): Use
mcp__codex__codexwith variant-specific prompt:Variant Prompt Template Fast references/codex-prompt-fast.mdFull references/codex-prompt-full.mdBranch references/codex-prompt-branch.mdConfig:
sandbox: 'read-only',approval-policy: 'never'Save the returned
threadId. -
Secondary reviewer: Use
Tasktool with reviewer selection cascade:Priority Reviewer subagent_type Condition 1 pr-review-toolkit:code-reviewerpr-review-toolkit:code-reviewerDefault choice 2 strict-reviewerstrict-reviewerPriority 1 fails/times out 3 Codex-only (degraded) — Both unavailable Selection: Try priority 1 first. If Task fails or times out (30s), try priority 2. If both unavailable, fall back to Codex-only (degraded mode — proceed with Codex results only, apply degradation matrix from
references/review-common.md).Task prompt (provide changed file list + diff stats, request P0/P1/P2/Nit findings in standard output format):
Review the code changes for correctness, security, performance, and maintainability issues. ## Changed Files <git diff --name-only output> ## Diff Stats <git diff --stat output> Read the actual diffs and file contents yourself to perform the review. Before reporting findings, independently verify each one: 1. Evidence check: what specific code proves it's real? (file:line) 2. Context check: did you read enough surrounding code? 3. False positive check: could it be intentional design? 4. Severity check: could it be more severe than initially assessed? 5. Gap check: what related issues might you have overlooked? Only report findings that survive all 5 checks. Output findings in this format: - [P0/P1/P2/Nit] file:line issue description → fix recommendation Group by severity. Include a final gate: ✅ Ready (no P0/P1) or ⛔ Blocked (has P0/P1).
Case B: Loop review (has --continue)
- Codex: Use
mcp__codex__codex-replywith re-review template fromreferences/review-common.md - Secondary: Re-dispatch in parallel (same mechanism as first pass, fresh context). Always dispatched in v1 — no skip exception. Cycle resets on any code edit.
Step 3.5: Await Codex + Reconcile Secondary
Codex is the blocking reviewer — await its result for the initial gate. Secondary runs in background (run_in_background: true) and is non-blocking:
| Secondary Status | Action |
|---|---|
| Completed before Codex | Include in aggregation (Step 4) |
| Completed after Codex, before precommit | Reconcile at pre-precommit checkpoint |
| Still running at precommit | Proceed with Codex gate (authoritative); if late result has P0/P1, re-open fix→re-review loop |
| Failed/timed out | Apply degradation matrix per references/review-common.md § Dual Reviewer Aggregation |
Step 4: Consolidate Output (Dual Mode)
-
Normalize both sets of findings to unified format:
[severity] file:line description → fix- Codex findings: already in standard format
- toolkit findings: apply Severity Mapping (see
references/review-common.md § Severity Mapping) - strict-reviewer findings: already use P0/P1/P2/Nit
-
Deduplicate using key =
file + canonical_issue_text(ignore line ±5 difference)- Same key → keep highest severity (P0 > P1 > P2 > Nit)
-
Tag source:
source = codex | toolkit | both -
Sort: P0 → P1 → P2 → Nit
-
Gate decision: any P0/P1 → BLOCKED; else → READY
Output format includes source tag:
- [P0] file:line issue → fix [source: both]
- [P1] file:line issue → fix [source: codex]
Step 4.5: Emit Review Gate
Execute: bash scripts/emit-review-gate.sh READY or bash scripts/emit-review-gate.sh BLOCKED
This updates aggregate_gate.executed=true and aggregate_gate.gate in the state file.
Then output the standard gate sentinel:
✅ Ready— if READY (no P0/P1)⛔ Blocked— if BLOCKED (has P0/P1)
Shared Definitions
See references/review-common.md for:
- Severity levels (P0/P1/P2/Nit)
- Review dimensions
- Merge gate definitions
- Re-review prompt template
- Gate sentinels (hook + behavior-layer)
- Dual Reviewer Aggregation (severity mapping, deduplication, degradation matrix, source attribution)
Review Loop
⚠️ @CLAUDE.md auto-loop: fix → re-review → ... → ✅ PASS ⚠️
Blocked → fix P0/P1 → /codex-review-fast --continue <threadId> → repeat until Ready.
Ready + P2/Nit → batch fix → 1 Codex --continue verify → evaluate (see rules/auto-loop.md P2/Nit Quality Sweep).
3 rounds on same issue → report blocker, request intervention.
Dual Mode Loop Behavior
| Reviewer | Loop Behavior |
|---|---|
| Codex MCP | Stateful → mcp__codex__codex-reply(threadId) continues context |
| Secondary | Re-dispatched every iteration (fresh context). Always dispatched in v1 (no skip exception). |
Codex gate is authoritative for timing. Secondary runs non-blocking in background. Aggregation reconciled at pre-precommit checkpoint. Any code edit resets the review cycle — both reviewers must re-run.
Pre-precommit Checkpoint
Before triggering /precommit, reconcile any pending secondary result:
| Condition | Action |
|---|---|
| Task completed + has P0/P1 | Re-emit BLOCKED → fix → re-review (Codex --continue + Secondary fresh) |
| Task completed + no P0/P1 | Union aggregate → proceed to precommit |
| Task still running | Proceed with Codex gate (authoritative); if late result has P0/P1, re-open fix→re-review loop |
Verification
- Each issue tagged with severity (P0/P1/P2/Nit)
- Gate is clear (✅ Ready / ⛔ Blocked)
- Issues include: file:line, description, fix suggestion
- Codex performed independent project research
- Branch variant: dimension rating table included
References
- Shared definitions:
references/review-common.md - Fast prompt:
references/codex-prompt-fast.md - Full prompt:
references/codex-prompt-full.md - Branch prompt:
references/codex-prompt-branch.md - Research instructions:
references/codex-research-instructions.md
Examples
Input: /codex-review-fast
Action: emit PENDING → git diff → Codex + Task(code-reviewer) parallel → aggregate → emit gate → P0/P1/P2/Nit + Gate
Input: /codex-review --focus "auth"
Action: emit PENDING → lint:fix → build → git diff → Codex + Task parallel (focus: auth) → aggregate → emit gate
Input: /codex-review-branch origin/develop
Action: emit PENDING → branch diff + history → Codex + Task parallel → aggregate → emit gate → Rating table + Findings + Gate
Input: /codex-review-fast (Codex unavailable)
Action: emit PENDING → git diff → Task(code-reviewer) only → degraded aggregate → emit gate + ⚠️ warning
More from sd0xdev/sd0x-dev-flow
request-tracking
Request tracking knowledge base. Use when: querying request status, managing document references, tracking progress. Not for: creating requests (use create-request), tech specs (use tech-spec). Output: status report + progress tracking.
6bug-fix
Bug fix workflow. Use when: fixing bugs, resolving issues, regression fixes. Not for: new features (use feature-dev), understanding code (use code-explore). Output: fix + regression test + review gate.
6claude-health
Claude Code config health check + plugin sync. Use when: auditing .claude/ structure, checking naming, verifying hook setup, detecting plugin version drift, syncing installed assets. Not for: skill quality (use skill-health-check), code review (use codex-code-review). Output: health report + fix recommendations.
6pr-summary
List open PRs, filter automation PRs, group by ticket ID, format as Markdown. Use when: user asks for PR summary, PR status, or /pr-summary
6git-profile
Git identity and GPG signing profile manager. Discovers profiles from GPG + git config, switches local identity, checks signing health. Use when: user says 'switch identity', 'git profile', 'check signing', 'gpg key status', 'which identity', or /git-profile
6risk-assess
Uncommitted code risk assessment with breaking change detection, blast radius analysis, and scope metrics. Use when: evaluating PR risk, pre-commit risk check, large refactoring review. Not for: security vulnerabilities (use /codex-security), code correctness (use /codex-review-fast). Output: 3-dimension weighted score + risk level + gate.
6