ai-paper-reproduction
Use when
- The user wants Codex to reproduce an AI paper repository.
- The target is a code repository with a README, scripts, configs, or documented commands.
- The goal is a minimal trustworthy run, not unlimited experimentation.
- The user needs standardized outputs that another human or model can audit quickly.
- The task spans more than one stage, such as intake plus setup, or setup plus execution plus reporting.
Do not use when
- The task is a general literature review or paper summary.
- The task is to design a new model, benchmark suite, or training pipeline from scratch.
- The repository is not centered on AI or does not expose a documented reproduction path.
- The user primarily wants a deep code refactor rather than README-first reproduction.
- The user is explicitly asking for only one narrow phase that a sub-skill already covers cleanly.
Success criteria
- README is treated as the primary source of reproduction intent.
- A minimum trustworthy target is selected and justified.
- Documented inference is preferred over evaluation, and evaluation is preferred over training.
- Any repo edits remain conservative, explicit, and auditable.
- `repro_outputs/` is generated with consistent structure and stable machine-readable fields.
- Final user-facing explanation is short and follows the user's language when practical.
Interaction and usability policy
- Keep the workflow simple enough for a new user to understand quickly.
- Prefer short, concrete plans over exhaustive research.
- Expose commands, assumptions, blockers, and evidence.
- Avoid turning the skill into an opaque automation layer.
- Preserve a low learning cost for both humans and downstream agents.
Language policy
- Human-readable Markdown outputs should follow the user's language when it is clear.
- If the user's language is unclear, default to concise English.
- Machine-readable fields, filenames, keys, and enum values stay in stable English.
- Paths, package names, CLI commands, config keys, and code identifiers remain unchanged.
See references/language-policy.md.
Reproduction policy
Core priority order:
- documented inference
- documented evaluation
- documented training startup or partial verification
- full training only when the user explicitly asks later
Rules:
- README-first: use repository files to clarify, not casually override, the README.
- Aim for minimal trustworthy reproduction rather than maximum task coverage.
- Treat smoke tests, startup verification, and early-step checks as valid training evidence when full training is not appropriate.
- Record unresolved gaps rather than fabricating confidence.
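The priority order above can be sketched as a small selection function. This is only an illustration under stated assumptions: the category keys (`inference`, `evaluation`, `training_startup`, `full_training`), the function name `select_target`, and the example commands are hypothetical and do not come from the skill itself.

```python
from typing import Dict, Optional, Tuple

# Priority order taken from the policy above; earlier entries win.
# The key names are hypothetical labels chosen for this sketch.
PRIORITY = ("inference", "evaluation", "training_startup")


def select_target(
    documented: Dict[str, str], allow_full_training: bool = False
) -> Optional[Tuple[str, str]]:
    """Return (kind, command) for the highest-priority documented command."""
    order = PRIORITY + (("full_training",) if allow_full_training else ())
    for kind in order:
        command = documented.get(kind)
        if command:
            return kind, command
    # Nothing usable is documented: record the gap instead of guessing.
    return None


# Hypothetical commands, as they might be extracted from a README.
documented = {
    "evaluation": "python eval.py --config configs/base.yaml",
    "full_training": "python train.py --config configs/base.yaml",
}
print(select_target(documented))
# → ('evaluation', 'python eval.py --config configs/base.yaml')
```

Full training is considered only when `allow_full_training` is set, mirroring the rule that it happens only when the user explicitly asks for it later.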
Patch policy
- Prefer no code changes.
- Prefer safer adjustments first:
  - command-line arguments
  - environment variables
  - path fixes
  - dependency version fixes
  - dependency file fixes such as `requirements.txt` or `environment.yml`
- Avoid changing:
  - model architecture
  - core inference semantics
  - core training logic
  - loss functions
  - experiment meaning
- If repository files must change:
  - create a patch branch first using `repro/YYYY-MM-DD-short-task`
  - apply low-risk changes before medium-risk changes
  - avoid high-risk changes by default
  - commit only verified groups of changes
  - keep verified patch commits sparse, usually `0-2`
  - use commit messages in the form `repro: <scope> for documented <command>`
See references/patch-policy.md.
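The branch and commit-message forms named in the patch policy can be generated mechanically. The sketch below is one possible way to do it; the helper names and the slug rule (lowercase, runs of non-alphanumeric characters collapsed to `-`) are assumptions, not something `references/patch-policy.md` is known to prescribe.

```python
import re
from datetime import date


def patch_branch_name(task: str, day: date) -> str:
    """Build a branch name of the form repro/YYYY-MM-DD-short-task."""
    # Assumed slug rule: lowercase, non-alphanumerics collapsed to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    return f"repro/{day.isoformat()}-{slug}"


def patch_commit_message(scope: str, command: str) -> str:
    """Build a message of the form 'repro: <scope> for documented <command>'."""
    return f"repro: {scope} for documented {command}"


# Hypothetical task, scope, and command used only for illustration.
print(patch_branch_name("Fix eval paths", date(2024, 5, 1)))
# → repro/2024-05-01-fix-eval-paths
print(patch_commit_message("path fixes", "python eval.py"))
# → repro: path fixes for documented python eval.py
```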
Workflow
- Read README and repo signals.
- Call `repo-intake-and-plan` to scan the repository and extract documented commands.
- Select the smallest trustworthy reproduction target.
- Call `env-and-assets-bootstrap` to prepare environment assumptions and asset paths.
- Run a conservative smoke check or documented command with `minimal-run-and-audit`.
- Use `paper-context-resolver` only if README and repo files leave a narrow reproduction-critical gap that blocks the current target.
- Write the standardized outputs.
- Give the user a short final note in the user's language.
Required outputs
Always target:
repro_outputs/
  SUMMARY.md
  COMMANDS.md
  LOG.md
  status.json
  PATCHES.md  # only if patches were applied
Use the templates under assets/ and the field rules in references/output-spec.md.
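A minimal sketch of producing this layout follows. It writes placeholder headings rather than the real templates from `assets/`, and the `status` fields shown are hypothetical, since the actual schema is defined in `references/output-spec.md`. The function name `write_outputs` is likewise an assumption.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_FILES = ("SUMMARY.md", "COMMANDS.md", "LOG.md", "status.json")


def write_outputs(root: Path, status: dict, patches_applied: bool) -> list:
    """Create repro_outputs/ under root and return the sorted file names."""
    out = root / "repro_outputs"
    out.mkdir(parents=True, exist_ok=True)
    for name in REQUIRED_FILES:
        if name == "status.json":
            # sort_keys keeps the machine-readable file stable across runs.
            text = json.dumps(status, indent=2, sort_keys=True) + "\n"
        else:
            # Placeholder heading; a real run would fill in a template.
            text = f"# {name[:-3]}\n"
        (out / name).write_text(text, encoding="utf-8")
    if patches_applied:
        # PATCHES.md exists only when patches were actually applied.
        (out / "PATCHES.md").write_text("# PATCHES\n", encoding="utf-8")
    return sorted(p.name for p in out.iterdir())


# Hypothetical status fields, for illustration only.
status = {"target": "evaluation", "result": "partial"}
with tempfile.TemporaryDirectory() as tmp:
    written = write_outputs(Path(tmp), status, patches_applied=False)
print(written)
# → ['COMMANDS.md', 'LOG.md', 'SUMMARY.md', 'status.json']
```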
Reporting policy
- Put the shortest high-value summary in `SUMMARY.md`.
- Put copyable commands in `COMMANDS.md`.
- Put process evidence, assumptions, failures, and decisions in `LOG.md`.
- Put durable machine-readable state in `status.json`.
- Put branch, commit, validation, and README-fidelity impact in `PATCHES.md` when needed.
- Distinguish verified facts from inferred guesses.
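One way to keep verified facts and inferred guesses apart in `LOG.md` is to tag every entry explicitly. The sketch below assumes a hypothetical `LogEntry` record and a `[verified]` / `[inferred]` tag format; neither is defined by the skill, and the example entries are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    statement: str
    verified: bool  # True only when backed by an observed command or file
    evidence: str = ""


def render_log(entries) -> str:
    """Render entries as bullets, tagging each as verified or inferred."""
    lines = []
    for entry in entries:
        tag = "verified" if entry.verified else "inferred"
        suffix = f" (evidence: {entry.evidence})" if entry.evidence else ""
        lines.append(f"- [{tag}] {entry.statement}{suffix}")
    return "\n".join(lines)


# Hypothetical entries, for illustration only.
entries = [
    LogEntry("Evaluation command exited with code 0", True, "COMMANDS.md step 2"),
    LogEntry("Checkpoint corresponds to the paper's main table", False),
]
print(render_log(entries))
```

Keeping the tag values in stable English matches the language policy for machine-readable fields, while the statement text can follow the user's language.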
Maintainability notes
- Keep this skill narrow: README-first AI repo reproduction only.
- Push specialized logic into sub-skills or helper scripts.
- Prefer stable templates and simple schemas over ad hoc prose.
- Keep machine-readable outputs backward compatible when possible.
- Add new evidence sources only when they improve auditability without raising learning cost.
Repository: lllllllama/ai-p…on-skill