codex-review

Installation
SKILL.md

Codex Code Review Skill

Core Concept: Intention vs Implementation

Running codex review --uncommitted alone only shows AI "what was done (Implementation)". Recording intention first tells AI "what you wanted to do (Intention)".

"Code changes + intention description" as combined input is the most effective way to improve AI code review quality.

Skill Architecture

This skill operates in two phases:

  1. Preparation Phase (current context): Check working directory, update CHANGELOG
  2. Review Phase (isolated context): Invoke Task tool to execute Lint + codex review (using context: fork to reduce context waste)

Execution Steps

0. [First] Check Working Directory Status

git diff --name-only && git status --short

Decide review mode based on output:

  • Has uncommitted changes → Continue with steps 1-4 (normal flow)
  • Clean working directory → Directly invoke codex-runner: codex review --commit HEAD

1. [Mandatory] Check if CHANGELOG is Updated

Before any review, must check if CHANGELOG.md contains description of current changes.

# Check if CHANGELOG.md is in uncommitted changes
git diff --name-only | grep -E "(CHANGELOG|changelog)"

If CHANGELOG is not updated, you must automatically perform the following (don't ask user to do it manually):

  1. Analyze changes: Run git diff --stat and git diff to get complete changes
  2. Auto-generate CHANGELOG entry: Generate compliant entry based on code changes
  3. Write to CHANGELOG.md: Use Edit tool to insert entry at top of [Unreleased] section
  4. Continue review flow: Immediately proceed to next steps after CHANGELOG update

Auto-generated CHANGELOG entry format:

## [Unreleased]

### Added / Changed / Fixed

- Feature description: what problem was solved or what functionality was implemented
- Affected files: main modified files/modules

Example - Auto-generation Flow:

1. Detected CHANGELOG not updated
2. Run git diff --stat, found handlers/responses.go modified (+88 lines)
3. Run git diff to analyze details: added CompactHandler function
4. Auto-generate entry:
   ### Added
   - Added `/v1/responses/compact` endpoint for conversation context compression
   - Supports multi-channel failover and request body size limits
5. Use Edit tool to write to CHANGELOG.md
6. Continue with lint and codex review

2. [Critical] Stage All New Files

Before invoking codex review, must add all new files (untracked files) to git staging area, otherwise codex will report P1 error.

# Check for new files
git status --short | grep "^??"

If there are new files, automatically execute:

# Safely stage all new files (handles empty list and special filenames)
git ls-files --others --exclude-standard -z | while IFS= read -r -d '' f; do git add -- "$f"; done

Explanation:

  • -z uses null character to separate filenames, correctly handles filenames with spaces/newlines
  • while IFS= read -r -d '' reads filenames one by one
  • git add -- "$f" uses -- separator, correctly handles filenames starting with -
  • When no new files exist, loop body doesn't execute, safely skipped
  • This won't stage modified files, only handles new files
  • codex needs files to be tracked by git for proper review

3. Evaluate Task Difficulty and Invoke codex-runner

Count change scale:

# Count number of changed files and lines of code
git diff --stat | tail -1

Difficulty Assessment Criteria:

Model + Reasoning Effort Combinations:

Combination Quality Time Timeout Recommended For
model=gpt-5.2 model_reasoning_effort=xhigh Best ~15-20 min 40 min Critical code, architecture changes
model=gpt-5.3-codex model_reasoning_effort=xhigh High ~8-9 min 15 min Difficult tasks (default)
model=gpt-5.2 model_reasoning_effort=high High ~8-9 min 15 min Alternative for difficult tasks
model=gpt-5.3-codex model_reasoning_effort=high Good ~5-6 min 10 min Normal tasks (default)

Critical Tasks (meets any condition, use best quality model):

  • Modified files ≥ 30
  • Total code changes (insertions + deletions) ≥ 2000 lines
  • Involves core architecture/algorithm changes (user explicitly mentioned)
  • Config: --config model=gpt-5.2 --config model_reasoning_effort=xhigh, timeout 40 minutes

Difficult Tasks (meets any condition):

  • Modified files ≥ 10
  • Total code changes (insertions + deletions) ≥ 500 lines
  • Single metric: insertions ≥ 300 lines OR deletions ≥ 300 lines
  • Cross-module refactoring
  • Default config: --config model=gpt-5.3-codex --config model_reasoning_effort=xhigh, timeout 15 minutes

Normal Tasks (other cases):

  • Default config: --config model=gpt-5.3-codex --config model_reasoning_effort=high, timeout 10 minutes

Evaluation Method:

You MUST parse the git diff --stat output correctly to determine difficulty:

# Get the summary line (last line of git diff --stat)
git diff --stat | tail -1
# Example outputs:
# "20 files changed, 342 insertions(+), 985 deletions(-)"
# "1 file changed, 50 insertions(+)"  # No deletions
# "3 files changed, 120 deletions(-)"  # No insertions

Parsing Rules:

  1. Extract file count from "X file(s) changed" (handle both "1 file" and "N files")
  2. Extract insertions from "Y insertion(s)(+)" if present (handle both "1 insertion" and "N insertions"), otherwise 0
  3. Extract deletions from "Z deletion(s)(-)" if present (handle both "1 deletion" and "N deletions"), otherwise 0
  4. Calculate total changes = insertions + deletions

Important Edge Cases:

  • Single file: "1 file changed" (singular form)
  • No insertions: Git omits "insertions(+)" entirely → treat as 0
  • No deletions: Git omits "deletions(-)" entirely → treat as 0
  • Pure rename: May show "0 insertions(+), 0 deletions(-)" or omit both

Decision Logic (check in order, first match wins):

  • IF file_count >= 30 OR total_changes >= 2000 → Critical (gpt-5.2 + xhigh)
  • IF file_count >= 10 → Difficult (gpt-5.3-codex + xhigh)
  • IF total_changes >= 500 → Difficult (gpt-5.3-codex + xhigh)
  • IF insertions >= 300 OR deletions >= 300 → Difficult (gpt-5.3-codex + xhigh)
  • ELSE → Normal (gpt-5.3-codex + high)

Example Cases:

  • ⭐ "50 files changed, 2000 insertions(+), 1500 deletions(-)" → 关键任务,使用 model=gpt-5.2 model_reasoning_effort=xhigh,超时 40 分钟(核心架构变更)
  • ✅ "20 files changed, 342 insertions(+), 985 deletions(-)" → 困难任务,使用 model=gpt-5.3-codex model_reasoning_effort=xhigh,超时 15 分钟
  • ✅ "5 files changed, 600 insertions(+), 50 deletions(-)" → 困难任务,使用 model=gpt-5.3-codex model_reasoning_effort=xhigh,超时 15 分钟
  • ❌ "3 files changed, 150 insertions(+), 80 deletions(-)" → 普通任务,使用 model=gpt-5.3-codex model_reasoning_effort=high,超时 10 分钟
  • ❌ "1 file changed, 50 insertions(+)" → 普通任务,使用 model=gpt-5.3-codex model_reasoning_effort=high,超时 10 分钟

Invoke codex-runner Subtask:

Use Task tool to invoke codex-runner, passing complete command (including Lint + codex review):

Task parameters:
- subagent_type: Bash
- description: "Execute Lint and codex review"
- timeout: 900000 (15 minutes for difficult tasks) or 600000 (10 minutes for normal tasks)
- prompt: Choose corresponding command based on project type and difficulty

Go project - Difficult task:
  go fmt ./... && go vet ./... && codex review --uncommitted --config model=gpt-5.3-codex --config model_reasoning_effort=xhigh
  (timeout: 900000)

Go project - Normal task:
  go fmt ./... && go vet ./... && codex review --uncommitted --config model=gpt-5.3-codex --config model_reasoning_effort=high
  (timeout: 600000)

Node project - Difficult task:
  npm run lint:fix && codex review --uncommitted --config model=gpt-5.3-codex --config model_reasoning_effort=xhigh
  (timeout: 900000)

Node project - Normal task:
  npm run lint:fix && codex review --uncommitted --config model=gpt-5.3-codex --config model_reasoning_effort=high
  (timeout: 600000)

Python project - Difficult task:
  black . && ruff check --fix . && codex review --uncommitted --config model=gpt-5.3-codex --config model_reasoning_effort=xhigh
  (timeout: 900000)

Python project - Normal task:
  black . && ruff check --fix . && codex review --uncommitted --config model=gpt-5.3-codex --config model_reasoning_effort=high
  (timeout: 600000)

Clean working directory:
  codex review --commit HEAD --config model=gpt-5.3-codex --config model_reasoning_effort=high
  (timeout: 600000)

4. Self-Correction

If Codex finds Changelog description inconsistent with code logic:

  • Code error → Fix code
  • Description inaccurate → Update Changelog

Complete Review Protocol

  1. [GATE] Check CHANGELOG - Auto-generate and write if not updated (leverage current context to understand change intention)
  2. [PREPARE] Stage Untracked Files - Add all new files to git staging area (avoid codex P1 error)
  3. [EXEC] Task → Lint + codex review - Invoke Task tool to execute Lint and codex (isolated context, reduce waste)
  4. [FIX] Self-Correction - Fix code or update description when intention ≠ implementation

Codex Review Command Reference

Basic Syntax

codex review [OPTIONS] [PROMPT]

Note: [PROMPT] parameter cannot be used with --uncommitted, --base, or --commit.

Common Options

Option Description Example
--uncommitted Review all uncommitted changes in working directory (staged + unstaged + untracked) codex review --uncommitted
--base <BRANCH> Review changes relative to specified base branch codex review --base main
--commit <SHA> Review changes introduced by specified commit codex review --commit HEAD
--title <TITLE> Optional commit title, displayed in review summary codex review --uncommitted --title "feat: add JSON parser"
-c, --config <key=value> Override configuration values codex review --uncommitted -c model="o3"

Usage Examples

# 1. Review all uncommitted changes (most common)
codex review --uncommitted

# 2. Review latest commit
codex review --commit HEAD

# 3. Review specific commit
codex review --commit abc1234

# 4. Review all changes in current branch relative to main
codex review --base main

# 5. Review changes in current branch relative to develop
codex review --base develop

# 6. Review with title (title shown in review summary)
codex review --uncommitted --title "fix: resolve JSON parsing errors"

# 7. Review using specific model
codex review --uncommitted -c model="o3"

Important Limitations

  • --uncommitted, --base, --commit are mutually exclusive, cannot be used together
  • [PROMPT] parameter is mutually exclusive with the above three options
  • Must be executed in a git repository directory

Important Notes

  • Ensure execution in git repository directory
  • Timeout automatically adjusted based on task difficulty:
    • Difficult tasks: 15 minutes (timeout: 900000)
    • Normal tasks: 10 minutes (timeout: 600000)
  • codex command must be properly configured and logged in
  • codex automatically processes in batches for large changes
  • CHANGELOG.md must be in uncommitted changes, otherwise Codex cannot see intention description

Design Rationale

Why separate contexts?

  1. CHANGELOG update needs current context: Understanding user's previous conversation and task intention to generate accurate change description
  2. Codex review doesn't need conversation history: Only needs code changes and CHANGELOG, more efficient to run independently
  3. Reduce token consumption: codex review as independent subtask, doesn't carry irrelevant conversation context
Related skills

More from escapewu/skills

Installs
1
Repository
escapewu/skills
GitHub Stars
1
First Seen
Mar 18, 2026