research-review

SKILL.md

Research Review via Codex MCP (xhigh reasoning)

Get a multi-round critical review of research work from an external LLM with maximum reasoning depth.

Constants

  • REVIEWER_MODEL = gpt-5.4 — Model used via Codex MCP. Must be an OpenAI model (e.g., gpt-5.4, o3, gpt-4o)

Context: $ARGUMENTS

Prerequisites

  • Codex MCP Server configured in Claude Code:
    claude mcp add codex -s user -- codex mcp-server
    
  • This gives Claude Code access to the mcp__codex__codex and mcp__codex__codex-reply tools.

Workflow

Step 1: Gather Research Context

Before calling the external reviewer, compile a comprehensive briefing:

  1. Read project narrative documents (e.g., STORY.md, README.md, paper drafts)
  2. Read any memory/notes files for key findings and experiment history
  3. Identify: core claims, methodology, key results, known weaknesses
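
The briefing step above can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: the function name `build_briefing`, the `=== ... ===` section markers, and the per-file character cap are assumptions, and the file names are only the examples mentioned above.

```python
from pathlib import Path

# Hypothetical defaults; the text above only names these files as examples.
NARRATIVE_FILES = ("STORY.md", "README.md")


def build_briefing(project_root, claims, methodology, results, weaknesses,
                   narrative_files=NARRATIVE_FILES, max_chars_per_file=20000):
    """Assemble one plain-text briefing for the external reviewer."""
    root = Path(project_root)
    sections = []
    for name in narrative_files:
        path = root / name
        if path.is_file():  # skip documents this project does not have
            text = path.read_text(encoding="utf-8")[:max_chars_per_file]
            sections.append(f"=== {name} ===\n{text.strip()}")

    def bullets(title, items):
        # Render a titled list of short items, one per line.
        return f"=== {title} ===\n" + "\n".join(f"- {item}" for item in items)

    sections.append(bullets("Core claims", claims))
    sections.append(f"=== Methodology ===\n{methodology.strip()}")
    sections.append(bullets("Key results", results))
    sections.append(bullets("Known weaknesses", weaknesses))
    return "\n\n".join(sections)
```

Listing the known weaknesses as their own section keeps them from being buried in the narrative text, which matches the rule below about being honest with the reviewer.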

Step 2: Initial Review (Round 1)

Send a detailed prompt with model_reasoning_effort set to xhigh:

mcp__codex__codex:
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    [Full research context + specific questions]
    Please act as a senior ML reviewer (NeurIPS/ICML level). Identify:
    1. Logical gaps or unjustified claims
    2. Missing experiments that would strengthen the story
    3. Narrative weaknesses
    4. Whether the contribution is sufficient for a top venue
    Please be brutally honest.
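
For readers who prefer to assemble the call programmatically, here is a hedged sketch that builds the same arguments as a Python dictionary. The helper name `build_round1_args` and the "Specific questions" layout are assumptions; only the `config` value and the reviewer instructions come from the call shown above.

```python
import json

# Reviewer instructions copied from the Round 1 call above.
REVIEWER_ROLE = (
    "Please act as a senior ML reviewer (NeurIPS/ICML level). Identify:\n"
    "1. Logical gaps or unjustified claims\n"
    "2. Missing experiments that would strengthen the story\n"
    "3. Narrative weaknesses\n"
    "4. Whether the contribution is sufficient for a top venue\n"
    "Please be brutally honest."
)


def build_round1_args(briefing, questions):
    """Return the arguments for the Round 1 mcp__codex__codex call."""
    if not briefing.strip():
        raise ValueError("Round 1 needs the full research context")
    question_block = "\n".join(f"- {q}" for q in questions)
    prompt = f"{briefing}\n\nSpecific questions:\n{question_block}\n\n{REVIEWER_ROLE}"
    return {
        "config": {"model_reasoning_effort": "xhigh"},
        "prompt": prompt,
    }


if __name__ == "__main__":
    args = build_round1_args("Context goes here.", ["Is claim 1 supported?"])
    print(json.dumps(args, indent=2))
```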

Step 3: Iterative Dialogue (Rounds 2-N)

Use mcp__codex__codex-reply with the threadId returned by Round 1 to continue the same conversation.

For each round:

  1. Respond to criticisms with evidence/counterarguments
  2. Ask targeted follow-ups on the most actionable points
  3. Request specific deliverables: experiment designs, paper outlines, claims matrices

Key follow-up patterns:

  • "If we reframe X as Y, does that change your assessment?"
  • "What's the minimum experiment to satisfy concern Z?"
  • "Please design the minimal additional experiment package (highest acceptance lift per GPU week)"
  • "Please write a mock NeurIPS/ICML review with scores"
  • "Give me a results-to-claims matrix for possible experimental outcomes"
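
A follow-up round can be sketched as a small payload builder. The argument names `threadId` and `prompt` mirror the wording used above, but the exact schema of mcp__codex__codex-reply is an assumption here, as is the helper name `build_reply_args` and the layout of the message.

```python
def build_reply_args(thread_id, rebuttals, follow_ups, deliverable=None):
    """Return the arguments for one mcp__codex__codex-reply round (assumed schema)."""
    if not thread_id:
        raise ValueError("threadId from the Round 1 response is required")
    lines = []
    if rebuttals:
        # Step 1 of each round: answer criticisms with evidence.
        lines.append("Responses to your criticisms:")
        lines += [f"{i}. {r}" for i, r in enumerate(rebuttals, 1)]
    if follow_ups:
        # Step 2: targeted questions on the most actionable points.
        lines.append("Follow-up questions:")
        lines += [f"- {q}" for q in follow_ups]
    if deliverable:
        # Step 3: one specific deliverable, e.g. a claims matrix.
        lines.append(f"Requested deliverable: {deliverable}")
    return {"threadId": thread_id, "prompt": "\n".join(lines)}
```

Keeping rebuttals, questions, and the requested deliverable in separate blocks makes each round easy to summarize later in the review document.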

Step 4: Convergence

Stop iterating when:

  • Both sides agree on the core claims and their evidence requirements
  • A concrete experiment plan is established
  • The narrative structure is settled
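
One way to track the stopping conditions is a small checklist object. This sketch assumes that all three conditions must hold before stopping, which the list above does not state explicitly; `ReviewState`, `open_items`, and `has_converged` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    claims_agreed: bool = False        # core claims and evidence requirements agreed
    experiment_plan_set: bool = False  # concrete experiment plan established
    narrative_settled: bool = False    # narrative structure settled


def open_items(state):
    """List the convergence conditions that are still unmet."""
    checks = [
        (state.claims_agreed, "core claims and their evidence requirements"),
        (state.experiment_plan_set, "concrete experiment plan"),
        (state.narrative_settled, "narrative structure"),
    ]
    return [label for done, label in checks if not done]


def has_converged(state):
    """True once every condition is met (assumes all three are required)."""
    return not open_items(state)
```

The list returned by `open_items` can double as the agenda for the next round.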

Step 5: Document Everything

Save the full interaction and conclusions to a review document in the project root. The document should contain:

  • Round-by-round summary of criticisms and responses
  • Final consensus on claims, narrative, and experiments
  • Claims matrix (what claims are allowed under each possible outcome)
  • Prioritized TODO list with estimated compute costs
  • Paper outline if discussed

Update project memory/notes with key review conclusions.
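
The documentation step might look like the sketch below. The file name `REVIEW.md`, the plain-text layout, and both function names are assumptions; the sections follow the list above, and the threadId is recorded so the conversation can be resumed later.

```python
from pathlib import Path


def render_review_document(thread_id, rounds, consensus, claims_matrix,
                           todos, outline=None):
    """Render a self-contained review document as plain text.

    rounds: list of (criticism, response) pairs, one per round
    claims_matrix: dict mapping an experimental outcome to the claim it allows
    todos: list of (task, estimated_compute) pairs, highest priority first
    """
    lines = ["Research Review", "", f"threadId: {thread_id}", ""]
    lines.append("Round-by-round summary")
    for i, (criticism, response) in enumerate(rounds, 1):
        lines.append(f"  Round {i}")
        lines.append(f"    Criticism: {criticism}")
        lines.append(f"    Response: {response}")
    lines += ["", "Final consensus", f"  {consensus}", ""]
    lines.append("Claims matrix")
    for outcome, claim in claims_matrix.items():
        lines.append(f"  If {outcome}: {claim}")
    lines += ["", "Prioritized TODO list"]
    for rank, (task, cost) in enumerate(todos, 1):
        lines.append(f"  {rank}. {task} (est. {cost})")
    if outline:  # only present if a paper outline was discussed
        lines += ["", "Paper outline"]
        lines += [f"  - {section}" for section in outline]
    return "\n".join(lines) + "\n"


def save_review(project_root, text, filename="REVIEW.md"):
    """Write the document to the project root and return its path."""
    path = Path(project_root) / filename
    path.write_text(text, encoding="utf-8")
    return path
```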

Key Rules

  • ALWAYS use config: {"model_reasoning_effort": "xhigh"} for reviews
  • Send comprehensive context in Round 1 — the external model cannot read your files
  • Be honest about weaknesses — hiding them leads to worse feedback
  • Push back on criticisms you disagree with, but accept valid ones
  • Focus on ACTIONABLE feedback — "what experiment would fix this?"
  • Document the threadId for potential future resumption
  • The review document should be self-contained (readable without the conversation)

Prompt Templates

For initial review:

"I'm going to present a complete ML research project for your critical review. Please act as a senior ML reviewer (NeurIPS/ICML level)..."

For experiment design:

"Please design the minimal additional experiment package that gives the highest acceptance lift per GPU week. Our compute: [describe]. Be very specific about configurations."

For paper structure:

"Please turn this into a concrete paper outline with section-by-section claims and figure plan."

For claims matrix:

"Please give me a results-to-claims matrix: what claim is allowed under each possible outcome of experiments X and Y?"

For mock review:

"Please write a mock NeurIPS review with: Summary, Strengths, Weaknesses, Questions for Authors, Score, Confidence, and What Would Move Toward Accept."
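
The templates above can be kept in a small lookup table so that placeholders are filled consistently. This is an illustrative sketch: the keys, the `{compute}` and `{experiments}` field names, and `render_prompt` are assumptions. The initial-review template is left out because its text is truncated above.

```python
# Template text is copied from the section above; only the {field} names are added.
PROMPT_TEMPLATES = {
    "experiment_design": (
        "Please design the minimal additional experiment package that gives "
        "the highest acceptance lift per GPU week. Our compute: {compute}. "
        "Be very specific about configurations."
    ),
    "paper_structure": (
        "Please turn this into a concrete paper outline with "
        "section-by-section claims and figure plan."
    ),
    "claims_matrix": (
        "Please give me a results-to-claims matrix: what claim is allowed "
        "under each possible outcome of experiments {experiments}?"
    ),
    "mock_review": (
        "Please write a mock NeurIPS review with: Summary, Strengths, "
        "Weaknesses, Questions for Authors, Score, Confidence, and "
        "What Would Move Toward Accept."
    ),
}


def render_prompt(kind, **fields):
    """Fill a named template; a missing field raises KeyError."""
    if kind not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown template: {kind}")
    return PROMPT_TEMPLATES[kind].format(**fields)
```

For example, `render_prompt("claims_matrix", experiments="X and Y")` reproduces the claims-matrix prompt exactly as written above.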
