AI Research Skills
86 skills powering autonomous AI research in 2026
Keywords: ai-research-skills · autoresearch · ml-experiments
Source: Orchestra-Research/AI-Research-SKILLs | Fork: akillness/AI-Research-SKILLs
When to use this skill
- Conducting autonomous AI/ML research from idea to paper
- Fine-tuning LLMs with Axolotl, LLaMA-Factory, PEFT, or Unsloth
- Running post-training (RLHF, GRPO, DPO, SimPO, verl)
- Distributed training with Megatron-Core, DeepSpeed, FSDP, or Accelerate
- Optimizing inference with vLLM, TensorRT-LLM, llama.cpp, or SGLang
- Building RAG pipelines (Chroma, FAISS, Pinecone, Qdrant)
- Mechanistic interpretability with TransformerLens, SAELens, pyvene
- Writing ML papers (LaTeX templates for NeurIPS, ICML, ICLR, ACL)
- Running ML benchmarks and evaluations (lm-eval-harness, BigCode, NeMo Evaluator)
- Multimodal tasks: CLIP, Whisper, LLaVA, Stable Diffusion, SAM
Do not use this skill when
- You need a simple code fix unrelated to ML/AI research
- You want general software engineering workflows (use `omg`, `bmad`, or `ralph` instead)
Overview: 86 Skills × 22 Categories
| Category | Count | Key Skills |
|---|---|---|
| Autoresearch | 1 | Autonomous research orchestration (central layer) |
| Model Architecture | 5 | LitGPT, Mamba, RWKV, NanoGPT, TorchTitan |
| Fine-Tuning | 4 | Axolotl, LLaMA-Factory, PEFT, Unsloth |
| Post-Training | 8 | TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, torchforge |
| Distributed Training | 6 | DeepSpeed, FSDP, Accelerate, Megatron-Core, Lightning, Ray Train |
| Optimization | 6 | Flash Attention, bitsandbytes, GPTQ, AWQ, HQQ, GGUF |
| Inference & Serving | 4 | vLLM, TensorRT-LLM, llama.cpp, SGLang |
| RAG | 5 | Chroma, FAISS, Pinecone, Qdrant, Sentence Transformers |
| Multimodal | 7 | CLIP, Whisper, LLaVA, BLIP-2, SAM, Stable Diffusion, AudioCraft |
| Mech Interp | 4 | TransformerLens, SAELens, pyvene, nnsight |
| Safety & Alignment | 4 | Constitutional AI, LlamaGuard, NeMo Guardrails, Prompt Guard |
| Evaluation | 3 | lm-eval-harness, BigCode, NeMo Evaluator |
| MLOps | 3 | W&B, MLflow, TensorBoard |
| Agents | 4 | LangChain, LlamaIndex, CrewAI, AutoGPT |
| Prompt Engineering | 4 | DSPy, Instructor, Guidance, Outlines |
| Observability | 2 | LangSmith, Phoenix |
| Infrastructure | 3 | Modal, Lambda Labs, SkyPilot |
| Data Processing | 2 | NeMo Curator, Ray Data |
| Tokenization | 2 | HuggingFace Tokenizers, SentencePiece |
| Emerging Techniques | 6 | MoE, Model Merging, Long Context, Speculative Decoding, Distillation, Pruning |
| ML Paper Writing | 1 | LaTeX templates (NeurIPS, ICML, ICLR, ACL, AAAI, COLM) |
| Ideation | 2 | Research Brainstorming, Creative Thinking |
Instructions
Step 1: Install the library
```bash
# Interactive installer (auto-detects Claude Code, Codex, Gemini, Cursor)
npx @orchestra-research/ai-research-skills

# Install all 86 skills non-interactively
npx @orchestra-research/ai-research-skills install --all

# Or use the install script from this skill
bash scripts/install.sh
```
After installation, restart your agent session so skills are loaded.
Step 2: Start autonomous research (autoresearch)
For full autonomous research (idea → experiments → paper):
Read the autoresearch SKILL.md and follow its instructions to begin.
The autoresearch skill orchestrates:
- Literature survey and ideation
- Experiment design and execution (routes to domain skills)
- Results synthesis and benchmarking
- Paper writing with LaTeX templates
Step 3: Use domain skills directly
For targeted work on a specific framework, call the skill by keyword:
```
# Fine-tuning
fine-tune with axolotl    # → activates axolotl skill

# Post-training / RLHF
run grpo training         # → activates GRPO skill

# Inference optimization
optimize with vllm        # → activates vLLM skill

# Distributed training
setup deepspeed           # → activates DeepSpeed skill
```
Step 4: Claude Code marketplace (alternative install)
```
# Add marketplace
/plugin marketplace add orchestra-research/AI-research-SKILLs

# Install by category
/plugin install fine-tuning@ai-research-skills
/plugin install post-training@ai-research-skills
/plugin install inference-serving@ai-research-skills
/plugin install distributed-training@ai-research-skills
/plugin install optimization@ai-research-skills
```
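To confirm what was installed, Claude Code's interactive plugin manager can be opened directly (assuming a recent Claude Code build):

```
# Open the plugin manager to review marketplaces and installed plugins
/plugin
```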
Step 5: Update or manage skills
```bash
# Update all installed skills
npx @orchestra-research/ai-research-skills update

# List installed skills
npx @orchestra-research/ai-research-skills list
```
Autonomous Research Loop
The autoresearch skill uses a two-loop architecture:
```
Outer Loop (Synthesis):
  ↓ Research question → Literature survey → Hypothesis
  ↓ Route to domain skills
  Inner Loop (Optimization):
    ↓ Run experiment → Collect results → Analyze → Adjust
    ↑ Ratchet improvements via git
  ↓ Synthesize findings → Write paper
```
This enables fully autonomous overnight GPU experiments (Karpathy-style ratchet via git).
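The ratchet itself is just disciplined use of git: commit only when a tracked metric improves, so the repository monotonically accumulates wins across unattended runs. A minimal sketch, assuming a hypothetical `eval.py` that prints a single scalar score:

```bash
# Hypothetical ratchet step: eval.py and best_metric.txt are assumptions,
# not files shipped by this library.
best=$(cat best_metric.txt 2>/dev/null || echo 0)
new=$(python eval.py)   # prints one float, e.g. 0.873
if (( $(echo "$new > $best" | bc -l) )); then
  echo "$new" > best_metric.txt
  git add -A
  git commit -m "ratchet: metric improved $best -> $new"
fi
```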
Examples
Example 1: Start autonomous research
```
Activate ai-research-skills.
Read the autoresearch SKILL.md and begin research on:
"Does LoRA training stability correlate with layer-wise norm heterogeneity?"
```
The agent will: survey literature → design experiments → fine-tune with LoRA → run benchmarks → analyze results → write paper.
Example 2: Fine-tune Llama 3 with LoRA
```
Use the fine-tuning skill (axolotl) to fine-tune Llama-3.1-8B with LoRA
on my dataset at ./data/train.jsonl with 4-bit quantization.
```
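Under the hood, the skill typically drives Axolotl's CLI. A minimal sketch, assuming a QLoRA config at `configs/llama31-qlora.yml` (a hypothetical path; 4-bit loading and the dataset path live inside that YAML):

```bash
# Launch Axolotl training from a YAML config (config path is an assumption;
# the YAML would set adapter: qlora, load_in_4bit: true, and the dataset path).
accelerate launch -m axolotl.cli.train configs/llama31-qlora.yml
```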
Example 3: Optimize inference with vLLM
```
Set up vLLM for serving Mistral-7B with tensor parallelism on 2 GPUs,
with continuous batching and PagedAttention. Target: <50ms TTFT.
```
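For reference, the serving command usually reduces to something like the sketch below (continuous batching and PagedAttention are vLLM defaults; the model revision and context length are illustrative choices):

```bash
# OpenAI-compatible server with 2-way tensor parallelism.
vllm serve mistralai/Mistral-7B-Instruct-v0.2 \
  --tensor-parallel-size 2 \
  --max-model-len 8192
```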
Example 4: Run GRPO post-training
```
Implement GRPO training for my reward model using TRL.
Dataset: ./data/preferences.json. Base: Llama-3.1-8B-Instruct.
```
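A sketch of the resulting launch, assuming the skill produces a `train_grpo.py` built on TRL's `GRPOTrainer` (the script name and all flags are hypothetical):

```bash
# Multi-GPU launch of an assumed TRL GRPO training script; flags are illustrative.
accelerate launch train_grpo.py \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --dataset_path ./data/preferences.json \
  --output_dir ./outputs/grpo
```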
Architecture: Skill Structure
Each of the 86 skills follows this structure:
```
skill-name/
├── SKILL.md           # Expert guidance (200–600 lines)
├── references/        # Official docs, API refs, GitHub issues, release notes
│   ├── README.md
│   ├── api.md
│   ├── tutorials.md
│   ├── issues.md      # Real GitHub issues with solutions
│   └── releases.md
├── scripts/           # Helper scripts (optional)
└── templates/         # Code templates (optional)
```
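To browse an installed skill directly, for example vLLM's (the install path depends on your agent; `~/.claude/skills` is an assumption for Claude Code):

```bash
# Read a skill's expert guidance and its curated issue notes.
cat ~/.claude/skills/vllm/SKILL.md
cat ~/.claude/skills/vllm/references/issues.md
```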
Best practices
- Start with autoresearch — it routes to the right domain skills automatically
- Restart after install — skills load at session start; restart if newly installed skills aren't recognized
- Use the two-loop architecture — let the inner loop optimize, outer loop synthesize
- Reference real GitHub issues — each skill's `references/issues.md` contains battle-tested solutions
- Combine with oh-my-gods orchestration — use `ralph` for persistence, `bmad` for structured phases, and `survey` for landscape scanning before research
Integration with oh-my-gods
| oh-my-gods skill | Integration |
|---|---|
| `survey` | Pre-research landscape scan before launching autoresearch |
| `ralph` | Persistent loop — keep autoresearch running until paper complete |
| `bmad` | Structured phases for the research lifecycle |
| `autoresearch` | Native skill within this library (enhanced) |