release-code
Release Code
Prepare a research code repository for public release: audit for security issues, generate missing assets (README, citation, license), tag the version, and create a GitHub release.
Use the project's toolchain gates as release evidence, not decoration. Run non-mutating checks first. Run mutating format or fix commands only when the user requests them or project policy requires them, then review the resulting diff.
Skill Directory Layout
<installed-skill-dir>/
├── SKILL.md
├── checklist.md # Detailed item-by-item checklist reference
└── templates/
├── README_ml_paper.md # README template for ML paper repos
└── CITATION.cff # Citation file template
Steps to Follow
1. Understand the release context
Ask the user in a single message:
- Paper info: Title, authors, venue/year (e.g. "CVPR 2025"), arXiv ID or URL (if available)
- Release type: Full code release / minimal reproducibility package / anonymous conference submission?
- License: MIT / Apache-2.0 / CC-BY-4.0 / other? (default: MIT for code, CC-BY-4.0 for models/data)
- What's included: Training code? Evaluation? Pre-trained weights? Datasets? Demo?
- Version tag: e.g. v1.0.0 (default: v1.0.0 for first release)
- GitHub release?: Create a GitHub release with release notes? (yes/no)
- Model weights: Are pre-trained weights available? If so, where will they be hosted? (HuggingFace Hub / Google Drive / project page)
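The answers can be kept in one small record so that later steps read the same values. The following is a minimal sketch, not part of the skill: `ReleaseContext` and `validate_version` are hypothetical names, the `release_type` labels are assumed, and the defaults mirror the ones listed above.

```python
import re
from dataclasses import dataclass

# Hypothetical helper: accepts tags of the form vMAJOR.MINOR.PATCH only.
_VERSION_RE = re.compile(r"v\d+\.\d+\.\d+")


def validate_version(tag: str) -> bool:
    """Return True if the tag looks like v1.0.0."""
    return _VERSION_RE.fullmatch(tag) is not None


@dataclass
class ReleaseContext:
    title: str
    authors: list[str]
    venue: str = ""
    arxiv_id: str = ""
    release_type: str = "full"        # assumed labels: "full", "minimal", "anonymous"
    code_license: str = "MIT"         # default for code
    data_license: str = "CC-BY-4.0"   # default for models/data
    version: str = "v1.0.0"           # default for a first release
    github_release: bool = False

    def __post_init__(self) -> None:
        # Fail early so a malformed tag never reaches `git tag`.
        if not validate_version(self.version):
            raise ValueError(f"unexpected version tag: {self.version!r}")
```

Validating the tag when the record is created keeps a typo such as `1.0` from propagating into the commit message, the tag, and the release URL.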
2. Audit the repository
Run the following checks (read output silently, report summary to user):
2a. Security scan — secrets and credentials
# Check for common secret patterns
git -C "$(git rev-parse --show-toplevel)" log --all --oneline | head -5
grep -rI --include="*.py" --include="*.yaml" --include="*.json" --include="*.env" \
-E "(password|secret|api_key|token|credential|private_key)\s*=\s*['\"][^'\"]{6}" \
"$(git rev-parse --show-toplevel)" 2>/dev/null | head -20
# Prefer dedicated scanners when available
command -v gitleaks >/dev/null && gitleaks dir --no-banner --redact "$(git rev-parse --show-toplevel)"
command -v detect-secrets >/dev/null && detect-secrets scan "$(git rev-parse --show-toplevel)"
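The grep pattern above flags a suspicious key name followed by `=` and a quoted value of at least six characters. As an illustration only, the same pattern can be exercised in Python, which makes it easy to try against known examples before trusting it on a repository. `find_secret_lines` is a hypothetical helper, and this sketch does not replace a dedicated scanner such as gitleaks.

```python
import re

# Same idea as the grep pattern above: key name, "=", then a quoted value
# with at least six characters that are not quotes.
SECRET_RE = re.compile(
    r"(password|secret|api_key|token|credential|private_key)\s*=\s*['\"][^'\"]{6}"
)


def find_secret_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that match the pattern."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if SECRET_RE.search(line)
    ]


# Made-up sample: only the first line carries a long quoted value.
sample = 'api_key = "sk-abc123456"\npassword = ""\nlearning_rate = 0.1\n'
for number, line in find_secret_lines(sample):
    print(f"line {number}: {line}")
```

Like the grep version, the match is case-sensitive and misses secrets loaded from unquoted values or environment files with no quotes, which is one reason to prefer the dedicated scanners when they are installed.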
2b. Large files check
# Find large files that shouldn't be in the repo
find "$(git rev-parse --show-toplevel)" -type f -size +50M \
! -path "*/.git/*" ! -path "*/outputs/*" ! -path "*/wandb/*" 2>/dev/null
2c. Sensitive file patterns
# Check for files that should not be public
find "$(git rev-parse --show-toplevel)" -type f \( \
-name ".env" -o -name "*.pem" -o -name "*.key" -o \
-name "secrets.yaml" -o -name "credentials.json" \
\) ! -path "*/.git/*" 2>/dev/null
2d. Check what already exists
ROOT="$(git rev-parse --show-toplevel)"
for f in README.md LICENSE CITATION.cff .gitignore requirements.txt pyproject.toml setup.py; do
[ -f "$ROOT/$f" ] && echo "EXISTS: $f" || echo "MISSING: $f"
done
2e. Git status
git -C "$(git rev-parse --show-toplevel)" status --short
git -C "$(git rev-parse --show-toplevel)" log --oneline -5
2f. Toolchain and source hygiene gates
Run available non-mutating gates and report skipped tools explicitly:
ROOT="$(git rev-parse --show-toplevel)"
command -v pre-commit >/dev/null && pre-commit run --all-files || true
command -v shellcheck >/dev/null && find "$ROOT" -name "*.sh" -not -path "*/.git/*" -print0 | xargs -0 shellcheck || true
command -v shfmt >/dev/null && shfmt -d "$ROOT" || true
command -v actionlint >/dev/null && actionlint "$ROOT/.github/workflows"/*.yml "$ROOT/.github/workflows"/*.yaml 2>/dev/null || true
command -v nbstripout >/dev/null && find "$ROOT" -name "*.ipynb" -not -path "*/.git/*" -print0 | xargs -0 nbstripout --dry-run || true
command -v lychee >/dev/null && lychee --no-progress "$ROOT/README.md" "$ROOT/docs/**/*.md" || true
command -v taplo >/dev/null && find "$ROOT" -name "*.toml" -not -path "*/.git/*" -print0 | xargs -0 taplo fmt --check || true
command -v yamllint >/dev/null && yamllint "$ROOT" || true
Treat failed required gates as release blockers. Treat missing optional tools as skipped unless the project policy requires them.
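Each command above ends in `|| true`, so a failing gate never stops the script and its exit code has to be recorded some other way. A minimal sketch of one way to do that, assuming hypothetical helpers `run_gate`, `run_gates`, and `has_blocker`; the gate list passed in is up to the project.

```python
import shutil
import subprocess


def run_gate(command: list[str]) -> str:
    """Run one non-mutating gate and return 'passed', 'failed', or 'skipped'."""
    if shutil.which(command[0]) is None:
        return "skipped"  # tool not installed: report it explicitly
    result = subprocess.run(command, capture_output=True, text=True)
    return "passed" if result.returncode == 0 else "failed"


def run_gates(gates: dict[str, list[str]]) -> dict[str, str]:
    """Run every gate and map its name to its status."""
    return {name: run_gate(command) for name, command in gates.items()}


def has_blocker(results: dict[str, str]) -> bool:
    """True if any gate that actually ran has failed."""
    return "failed" in results.values()
```

For example, `run_gates({"pre-commit": ["pre-commit", "run", "--all-files"]})` returns a status per tool that can be copied straight into the audit checklist, and `has_blocker` decides whether to stop before tagging.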
Report audit findings to the user as a checklist:
- 🔴 BLOCKER — must fix before release (secrets found, credentials, large binaries)
- 🟡 WARNING — recommended to address (missing README, no license, no .gitignore, skipped optional hygiene gates)
- 🟢 OK — already in good shape
Ask the user to fix any blockers before continuing.
3. Generate missing files
Based on the 2d audit, generate any missing files. Ask the user's preference before overwriting existing files.
LICENSE
If missing, generate the appropriate license file:
- MIT (most common for ML code):
MIT License
Copyright (c) {YEAR} {AUTHORS}
[standard MIT text]
- Apache-2.0: Use standard Apache 2.0 text
- CC-BY-4.0: For data/models if user specified
Write to {PROJECT_ROOT}/LICENSE.
CITATION.cff
Read the template from <installed-skill-dir>/templates/CITATION.cff.
Fill in placeholders:
| Placeholder | Value |
|---|---|
| {TITLE} | paper title |
| {AUTHORS_LIST} | authors as YAML list (see template format) |
| {YEAR} | publication year |
| {VENUE} | conference/journal name |
| {ARXIV_ID} | arXiv ID (if available) |
| {GITHUB_REPO} | github.com/username/repo |
| {DATE_RELEASED} | today's date (YYYY-MM-DD) |
Write to {PROJECT_ROOT}/CITATION.cff.
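Placeholder filling can be done with plain string replacement, followed by a check that no `{NAME}` marker is left behind. The sketch below is illustrative: `fill_placeholders` is a hypothetical helper, and the two-line template is a stand-in, not the contents of the real `templates/CITATION.cff`.

```python
import re

# Matches markers such as {TITLE} or {DATE_RELEASED}.
PLACEHOLDER_RE = re.compile(r"\{[A-Z_]+\}")


def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each {NAME} marker with its value; raise if any marker remains."""
    filled = template
    for name, value in values.items():
        filled = filled.replace("{" + name + "}", value)
    leftover = sorted(set(PLACEHOLDER_RE.findall(filled)))
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return filled


# Stand-in template and made-up values, for illustration only.
template = 'title: "{TITLE}"\ndate-released: {DATE_RELEASED}\n'
print(fill_placeholders(template, {"TITLE": "My Paper", "DATE_RELEASED": "2025-01-31"}))
```

Using `str.replace` instead of `str.format` avoids errors when the template contains other braces, and the leftover check turns a forgotten field into an error instead of a committed `{VENUE}`.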
README.md
If README is missing or skeletal (< 50 lines):
Read the template from <installed-skill-dir>/templates/README_ml_paper.md.
Fill in all placeholders. Leave [TODO: ...] markers where the user must provide content (e.g., exact performance numbers, dataset download links).
Do NOT overwrite a substantial existing README — instead, identify what sections are missing and offer to append them.
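The skeletal test and the missing-section check can be sketched as a single function. This is an assumption-laden illustration: `readme_status` is a hypothetical helper, the section names passed in are examples, and headings are detected only as lines starting with `#` outside fenced code.

```python
FENCE = "`" * 3  # markdown code-fence marker


def readme_status(text: str, required_sections: list[str], min_lines: int = 50) -> dict:
    """Report whether a README is skeletal and which required sections it lacks."""
    lines = text.splitlines()
    headings = []
    in_fence = False
    for line in lines:
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a code block
        elif not in_fence and line.startswith("#"):
            headings.append(line.lstrip("#").strip().lower())
    missing = [s for s in required_sections if s.lower() not in headings]
    return {"skeletal": len(lines) < min_lines, "missing_sections": missing}
```

A call such as `readme_status(text, ["Installation", "Citation", "Pre-trained Models"])` yields the list of sections to offer as an append, without touching a README that is already substantial.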
.gitignore
If missing, generate a Python/ML .gitignore:
# Python
__pycache__/
*.py[cod]
*.egg-info/
.eggs/
dist/
build/
.venv/
venv/
# ML outputs (keep jobs/ but ignore outputs/)
outputs/
wandb/
*.ckpt
*.pt
*.pth
*.pkl
runs/
lightning_logs/
# Data (large files)
data/
datasets/
# Secrets
.env
*.pem
*.key
secrets.yaml
credentials.json
# IDE
.vscode/settings.json
.idea/
*.DS_Store
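A quick literal check can confirm that a generated or existing .gitignore lists the entries the release depends on. A minimal sketch, assuming a hypothetical `missing_ignore_patterns` helper; it compares lines textually and does not evaluate gitignore matching rules, so an equivalent but differently written pattern is reported as missing.

```python
def missing_ignore_patterns(gitignore_text: str, required: list[str]) -> list[str]:
    """Return the required patterns that do not appear as a line in .gitignore."""
    present = {
        line.strip()
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")  # skip blanks and comments
    }
    return [pattern for pattern in required if pattern not in present]
```

For instance, `missing_ignore_patterns(text, ["outputs/", "wandb/", "*.ckpt"])` returns the entries still to be appended.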
4. Pre-release checklist review
Read <installed-skill-dir>/checklist.md for the full item list.
Present the user with a condensed checklist grouped by category. For each item, report status (✅ done / ⚠️ needs attention / ❌ missing):
📁 Repository hygiene
✅ .gitignore covers outputs/, wandb/, *.ckpt
⚠️ outputs/ directory exists — confirm not committed
📄 Documentation
✅ README.md present (142 lines)
❌ No CITATION.cff — will generate
⚠️ README missing "Pre-trained Models" section
🔐 Security
✅ No secrets detected
✅ No large files in git history
🧰 Toolchain gates
✅ pre-commit / ruff / pytest release gates passed
⚠️ lychee skipped — not installed
❌ gitleaks found possible secret in config.yaml
📦 Reproducibility
✅ requirements.txt / pyproject.toml present
⚠️ No environment.yml for conda users
⚠️ No Dockerfile
⚖️ Legal
✅ LICENSE (MIT) present
Ask: "Are you ready to proceed with tagging and publishing, or do you want to fix any of the above first?"
5. Commit and tag
Once the user confirms:
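ROOT="$(git rev-parse --show-toplevel)"
# Stage only the release files that exist (never untracked outputs).
# A single `git add` with a missing path aborts and stages nothing, so add files one at a time.
for f in LICENSE CITATION.cff README.md .gitignore; do
[ -f "$ROOT/$f" ] && git -C "$ROOT" add -- "$f"
done
git -C "$ROOT" diff --staged --stat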
Ask the user to confirm the staged changes, then commit:
git -C "$ROOT" commit -m "chore: prepare code for public release ({VERSION})"
Create annotated tag:
git -C "$ROOT" tag -a "{VERSION}" -m "$(cat <<'EOF'
Release {VERSION} — {PAPER_TITLE}
Published at {VENUE}
Paper: {PAPER_URL}
Includes: {WHAT_INCLUDED}
EOF
)"
Ask: "Push commit and tag to origin?"
CURRENT_BRANCH="$(git -C "$ROOT" branch --show-current)"
git -C "$ROOT" push origin "$CURRENT_BRANCH"
git -C "$ROOT" push origin "{VERSION}"
6. GitHub Release (optional)
If the user requested a GitHub release:
Check if gh is available:
gh --version 2>/dev/null && echo "gh available" || echo "gh not found"
If available, draft the release notes and create the release:
gh release create "{VERSION}" \
--title "{PAPER_TITLE} ({VERSION})" \
--notes "$(cat <<'EOF'
## {PAPER_TITLE}
**{VENUE}** | [Paper]({PAPER_URL}) | [Project Page]({PROJECT_PAGE_URL})
### What's included
{INCLUDED_ITEMS}
### Installation
~~~bash
pip install -r requirements.txt
~~~
### Citation
~~~bibtex
{BIBTEX}
~~~
EOF
)"
If gh is not available, print the release notes as text for the user to paste into GitHub.
7. Summary and next steps
Print a final summary:
✅ Release {VERSION} complete!
Files generated/updated:
• LICENSE
• CITATION.cff
• README.md
• .gitignore
Git:
• Commit: {COMMIT_HASH}
• Tag: {VERSION} → pushed to origin
GitHub:
• Release: https://github.com/{REPO}/releases/tag/{VERSION}
Recommended next steps:
□ Upload pre-trained weights to HuggingFace Hub / project page
□ Add repo link to the paper's arXiv abstract page
□ Tweet / post about the release
□ Email the mailing list / post to r/MachineLearning
□ Add a "Code" badge to the paper PDF (camera-ready only)
Handling Anonymous Submissions
When the user says this is for an anonymous conference submission:
- Do NOT create a public GitHub release.
- Do NOT include author names in any committed files.
- Generate an anonymous zip package instead:
ROOT="$(git rev-parse --show-toplevel)"
PROJ="$(basename "$ROOT")"
git -C "$ROOT" archive --format=zip HEAD -o "/tmp/${PROJ}-anonymous.zip"
echo "Anonymous zip: /tmp/${PROJ}-anonymous.zip"
- Check the zip contents for any author-identifying information before submission.
- Remind the user to anonymize: commit messages visible via git log, personal paths in configs, email addresses in code comments.
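Checking the zip by hand is error-prone, so a small scan for email addresses and known author names can serve as a first pass. The sketch below is illustrative: `scan_archive` is a hypothetical helper, the email pattern is deliberately loose, and a clean result does not prove that the archive is anonymous.

```python
import re
import zipfile

# Loose email pattern; expect false positives such as git@github.com.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_archive(zip_path, author_names: list[str]) -> list[tuple[str, str]]:
    """Return (member, match) pairs for emails and author names found in the zip."""
    hits = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.endswith("/"):
                continue  # directory entry, nothing to read
            text = archive.read(member).decode("utf-8", errors="ignore")
            for email in sorted(set(EMAIL_RE.findall(text))):
                hits.append((member, email))
            lowered = text.lower()
            for name in author_names:
                if name.lower() in lowered:  # case-insensitive substring match
                    hits.append((member, name))
    return hits
```

Running `scan_archive("/tmp/<project>-anonymous.zip", ["First Last"])` with the real author names lists each file that still needs editing. Personal paths in configs and usernames in URLs still need a manual look.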
Common Patterns
HuggingFace Hub model upload
When the user wants to push weights to HuggingFace:
# Requires: pip install huggingface_hub
python -c "
from huggingface_hub import HfApi
api = HfApi()
api.upload_folder(
folder_path='checkpoints/',
repo_id='username/model-name',
repo_type='model',
)
"
Adding a "Code" badge to README
[Code](https://github.com/username/repo)
[arXiv](https://arxiv.org/abs/2401.00000)
[License](LICENSE)
Papers with Code link
After release, submit to paperswithcode.com to link the repository to the paper automatically.