Paper Navigator
Find and read academic papers in four stages:
┌──────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Disambiguate │ → │ Discover │ → │ Evaluate │ → │ Read │
└──────────────┘ └──────────┘ └──────────┘ └──────────┘
↓
┌──────────────────┐
│ research-survey │ (for survey reports)
│ research-ideation│ (for idea generation)
└──────────────────┘
Setup: Scripts are in skills/paper-navigator/scripts/. Run via python skills/paper-navigator/scripts/<name>.py. Optional env vars for higher rate limits: S2_API_KEY (Semantic Scholar), JINA_API_KEY (Jina Reader), GITHUB_TOKEN, HF_TOKEN.
Semantic Scholar Key Gate (MANDATORY)
Before using any Semantic Scholar-dependent script, check whether S2_API_KEY is set.
- If `S2_API_KEY` is set: you may use `scholar_search`, `citation_traverse`, `recommend`, `author_search`, `trending`, and other S2-backed tools normally.
- If `S2_API_KEY` is missing: do not use Semantic Scholar. Tell the user that Semantic Scholar is unavailable without a key and ask whether they want to provide one. If they do not provide a key, continue with non-S2 sources only: `arxiv_monitor`, web search, GitHub search, Hugging Face search, or direct paper URLs/DOIs/arXiv IDs.
- Without `S2_API_KEY`, skip citation-graph expansion (`citation_traverse`, `recommend`) entirely instead of retrying or waiting.
Step 0: Search Strategy Principles (MANDATORY)
Every discovery task MUST follow these principles before executing any workflow.
Query Reformulation
Before searching, decompose the user's topic and generate 4-6 variant queries covering distinct research angles. This is critical because different papers use different terminology for the same concept, and a single research topic often spans multiple sub-communities.
Step 1: Sub-topic decomposition. Identify 3-5 distinct research angles within the user's query. Most research topics span multiple perspectives:
- Empirical vs. theoretical — papers that observe/measure the phenomenon vs. papers that prove/explain it formally
- Mechanism vs. condition — papers about how something works vs. when/why it emerges
- Method keywords — different communities use different terms for the same concept (e.g., "gradient descent" vs. "meta-optimization" vs. "implicit learning")
- Adjacent formulations — the same idea framed differently (e.g., "in-context learning" vs. "few-shot learning" vs. "learning from demonstrations")
Step 2: Generate queries. Create at least one query per identified angle, using synonym substitution, specificity adjustment, and structural variants:
- Synonym substitution: "data pruning" → "data selection", "data filtering", "data curation"
- Specificity adjustment: broaden ("pretraining data quality") or narrow ("perplexity-based data pruning LLM")
- Structural variants: swap word order, add/remove qualifiers, use abbreviations
Example: User asks "how LLMs gain in-context learning during pretraining"
- Angles: (a) mechanistic/circuit, (b) training dynamics, (c) ICL-as-optimization theory, (d) data/task conditions, (e) formal theory
- Query 1: `"in-context learning emergence pretraining language model"` (general)
- Query 2: `"induction heads formation training transformer"` (mechanistic)
- Query 3: `"transformers learn in-context gradient descent meta-learning"` (optimization view)
- Query 4: `"pretraining task diversity data structure in-context learning"` (data conditions)
- Query 5: `"in-context learning theory linear attention generalization"` (formal theory)
Example: User asks "papers about data pruning for LLM pretraining"
- Angles: (a) selection methods, (b) quality metrics, (c) scaling effects
- Query 1: `"data pruning pretraining language model"`
- Query 2: `"data selection pretraining LLM"`
- Query 3: `"training data curation large language model quality"`
- Query 4: `"data quality scoring pretraining scaling"`
Multi-Source Parallel Search
Never rely on a single search source. For every discovery task, run at least 2 sources:
- Primary (with `S2_API_KEY`): `scholar_search` (S2 with automatic arXiv fallback on rate limit)
- Primary (without `S2_API_KEY`): `arxiv_monitor --keywords "<variants>" --match-mode flexible`
- Secondary: web search or GitHub search for recent blog posts, surveys, repos, and paper links
- Tertiary: additional non-S2 sources such as direct arXiv/DOI URLs or Hugging Face dataset/model search when relevant

CRITICAL — S2 usage rule:
- With `S2_API_KEY` set: you MAY use S2-backed scripts. Prefer moderate fan-out and keep citation expansion scoped to the user's actual need.
- Without `S2_API_KEY`: do not invoke S2-backed scripts at all. Do not "try once anyway", do not queue retries, and do not run `citation_traverse`/`recommend`.
- How to check: before starting discovery, run `echo $S2_API_KEY` or check whether the env var is set (see the sketch after this list). If empty, tell the user Semantic Scholar is unavailable without a key and continue with non-S2 sources.
- arXiv-only scripts (`arxiv_monitor`) are NOT affected by this rule and can always run in parallel with other non-S2 calls.
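A minimal sketch of the key gate in Python (illustrative only, not part of the shipped scripts; paths and flags follow the Setup section above):

```python
import os
import subprocess

SCRIPTS = "skills/paper-navigator/scripts"  # location from the Setup section
query = "in-context learning emergence pretraining language model"

if os.environ.get("S2_API_KEY"):
    # Key present: S2-backed primary search is allowed
    subprocess.run(["python", f"{SCRIPTS}/scholar_search.py",
                    "--query", query, "--limit", "20", "--sort-by", "relevance"])
else:
    # Key missing: never call S2-backed scripts; fall back to the arXiv-only path
    subprocess.run(["python", f"{SCRIPTS}/arxiv_monitor.py",
                    "--keywords", query, "--match-mode", "flexible", "--days", "365"])
```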
Rate-Limit-Aware Fallback Chain
When Semantic Scholar returns 429 or empty results:
- `scholar_search` automatically falls back to arXiv (built-in since v1.2)
- Use `arxiv_monitor --keywords` with `--match-mode flexible` for broader coverage
- Switch to web search for blog posts, surveys, and GitHub repos that reference papers
- Space S2-dependent calls (`citation_traverse`, `recommend`) at least 5s apart and reduce `--limit` (see the pacing sketch below)
Prevention is better than fallback: The arXiv fallback produces lower-quality results (no citation counts, less precise relevance ranking). If S2_API_KEY is missing, skip S2 entirely and use the non-S2 chain from the start.
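To keep S2-dependent calls spaced out, a simple pacing sketch (illustrative; the seed IDs are placeholders reused from Discovery Paths below):

```python
import subprocess
import time

SCRIPTS = "skills/paper-navigator/scripts"
seeds = ["ArXiv:1706.03762", "ArXiv:2005.14165"]  # placeholder seed IDs

for seed in seeds:
    subprocess.run(["python", f"{SCRIPTS}/citation_traverse.py",
                    "--paper-id", seed, "--direction", "forward", "--limit", "10"])
    time.sleep(5)  # >=5s between S2-dependent calls, with a reduced --limit
```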
Mandatory Citation Expansion (for multi-paper discovery tasks)
After finding ≥3 relevant seed papers, and only when S2_API_KEY is available, you MUST expand coverage using the citation graph. The goal is to discover papers that keyword search cannot reach.
Seed selection: Rank all found relevant papers by citation count. Pick the top 3 as primary seeds.
Expansion steps (all mandatory):
- Co-citation on the single highest-cited seed: `citation_traverse --direction co-citation --limit 15` — this is the strongest signal for finding closely related work that uses different terminology
- Forward citations on the top 2 seeds: `citation_traverse --direction forward --limit 20` — finds follow-up work
- Backward citations on 1-2 seeds whose topic coverage differs: `citation_traverse --direction backward --limit 20` — finds foundational and adjacent work that the seeds build on. Pick seeds from different sub-topics to maximize coverage breadth
- Recommendations with diverse seeds: `recommend --positive <seed1>,<seed2>,<seed3>` — serendipitous discovery of semantically related work not connected by citations (a worked command sequence follows the seed-diversity note below)
Seed diversity principle: When selecting seeds for backward traversal or recommendations, prefer seeds from different sub-topics identified in query reformulation. This prevents the citation graph from staying within a single research community.
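Put together, one expansion round looks like the sequence below. The seed IDs reuse examples from Discovery Paths and are illustrative only; substitute your actual top-cited, topically diverse seeds:
python skills/paper-navigator/scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction co-citation --limit 15
python skills/paper-navigator/scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction forward --limit 20
python skills/paper-navigator/scripts/citation_traverse.py --paper-id ArXiv:2005.14165 --direction forward --limit 20
python skills/paper-navigator/scripts/citation_traverse.py --paper-id ArXiv:2005.14165 --direction backward --limit 20
python skills/paper-navigator/scripts/recommend.py --positive ArXiv:1706.03762,ArXiv:2005.14165,ArXiv:2301.00001 --limit 15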
Applies to: WF1 (Survey), WF3 (Quick Search with >10 results), WF5 (Track Developments), WF9 (Ideation), WF10 (User-specified count), and only when S2_API_KEY is configured.
Does NOT apply to: WF2 (Find specific paper), WF7 (Read paper by URL).
Coverage Gap Check (for multi-paper discovery tasks)
After initial search + citation expansion, review the collected papers against the sub-topics identified during query reformulation.
For each sub-topic angle:
- Count how many collected papers address it
- If a sub-topic has 0-1 papers, run a targeted `scholar_search` with a query specific to that angle, but only when `S2_API_KEY` is available. Otherwise use `arxiv_monitor`, web search, or GitHub search for that angle.
- If targeted search finds new relevant papers and `S2_API_KEY` is available, optionally run one more `citation_traverse` or `recommend` round on the new finds.
This step catches systematic blind spots where an entire research perspective was missed by all prior queries. It is lightweight — typically 1-2 additional searches for gaps, not a full re-search.
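For example, if the formal-theory angle from the in-context learning example above ended up with zero papers, a single targeted call (with `S2_API_KEY` set; otherwise send the same query to `arxiv_monitor` or web search) covers the gap:
python skills/paper-navigator/scripts/scholar_search.py --query "in-context learning theory linear attention generalization" --limit 20 --sort-by relevance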
Applies to: Same workflows as Mandatory Citation Expansion.
Step 1: Classify Intent and Select Workflow
Start here. Determine what the user wants and route to the right workflow. Match complexity to intent — simple queries get simple answers.
| Intent | Signal | Workflow | Complexity |
|---|---|---|---|
| Find a specific paper | Title, author name, or URL | WF 2 | Single search call |
| Quick paper search | "give me papers about X", "find papers on X" | WF 3 | Single search call |
| Metadata search | Author + year, venue filter | WF 4 | Single search + filter |
| Track recent advances | "latest", "recent", "what's new" | WF 5 | 1-2 calls |
| Find a baseline | Code, SOTA, implementation | WF 6 | Search + code check |
| Read a paper | URL or "read this paper" | WF 7 | Fetch + read |
| Ambiguous term | Project name, module name, nickname | WF 8 | Web search + resolve |
| Literature survey | "survey X", comprehensive coverage | WF 1 → then hand off to research-survey | Iterative collection |
| Related work map | Connections between papers | WF 1 | Citation traversal |
| Ideation support | Called from research-ideation | WF 9 | Iterative + strict filter |
| User-specified count | "find me exactly N papers about X" | WF 10 | Adaptive |
Key principle: Simple "find me papers about X" queries should return results from a single search call, not trigger the full iterative collection workflow. Only use iterative expansion for comprehensive surveys or ideation support.
Step 2: Resolve Ambiguous Terms (if needed)
When the user's query might be a colloquial name, project name, or module name (rather than a paper title):
- Quick academic search — Try `scholar_search` with the exact query
- If zero results — Broaden the search:
  - Web search: Find GitHub repos, blog posts, or social media that reveal the actual paper title or arXiv ID
  - GitHub search: `github_search.py --query "USER_QUERY"` — repos often link to papers
- Extract identifiers — Actual paper title, arXiv ID, GitHub repo URL, author names
- Re-enter the appropriate workflow with resolved identifiers
Example disambiguation report:
🔍 Disambiguation Report for "deepseek engram"
├── Intent: Track recent advances (ambiguous term)
├── Resolution: "Engram" is a module name from DeepSeek AI
│ ├── Actual paper: "Conditional Memory via Scalable Lookup" (ArXiv:2601.07372)
│ └── GitHub: https://github.com/deepseek-ai/Engram
└── Search Plan:
├── scholar_search --query "Conditional Memory Scalable Lookup" --sort-by year
├── citation_traverse --paper-id ArXiv:2601.07372 --direction forward
└── github_search --query "deepseek engram"
Standard Output Formats
Use these formats when presenting results to the user. Match the format to the intent.
Format A: Single Paper Card (for navigational search, WF 2)
📄 **Highly accurate protein structure prediction with AlphaFold**
Authors: Jumper et al.
Year: 2021 | Venue: Nature
Citations: 25,000+
DOI: 10.1038/s41586-021-03819-2 | S2 ID: 235959867
Link: https://doi.org/10.1038/s41586-021-03819-2
TLDR: End-to-end neural network for protein structure prediction achieving atomic accuracy...
Format B: Paper List Table (for quick search, metadata search, trending — WF 3/4/5)
| # | Title | Authors | Year | Venue | Citations | ID |
|---|-------|---------|------|-------|-----------|-----|
| 1 | Paper Title | First Author et al. | 2024 | NeurIPS | 150 | arXiv:2401.xxxxx |
| 2 | ... | ... | ... | ... | ... | ... |
After the table, briefly note how many results were found and whether the list was filtered.
Format C: Baseline Recommendation (for baseline hunt, WF 6)
📦 **Recommended Baseline: [Model Name]**
Paper: [Title] ([Year], [Venue]) — [arXiv ID]
Code: [GitHub URL] ⭐ [stars] | Framework: [PyTorch/TF]
Performance: [key metric = value] on [dataset]
HuggingFace: [model page URL] | Downloads: [N]
Format D: Reading Notes (for read a paper, WF 7)
Use the template at assets/paper-summary-template.md. Save to /artifacts/paper-notes/{paper-id}.md.
Format E: Disambiguation Report (for ambiguous queries, WF 8)
🔍 Disambiguation Report for "[query]"
├── Intent: [classified intent]
├── Resolution: [what the term actually refers to]
│ ├── Paper: [resolved title] ([arXiv ID])
│ └── Code: [GitHub URL]
└── Search Plan:
├── [script call 1]
└── [script call 2]
Common Workflows
Workflow 1: Collect Papers for Survey
"Help me survey CRISPR-based gene therapy for sickle cell disease"
Use iterative collection (target 30-80 papers). See Appendix A for the full iterative methodology.
- Discover: If `S2_API_KEY` is available, start with `scholar_search --query "CRISPR gene therapy sickle cell" --limit 20 --sort-by citations` → iterative expansion with EXPLORE/EXPLOIT strategy → `citation_traverse --direction forward` on seminal papers. If the key is missing, tell the user Semantic Scholar is unavailable without a key, then use `arxiv_monitor` + web/GitHub search instead and skip citation-graph expansion.
- Evaluate: Review each paper's title + abstract for relevance → filter by abstract quality → prefer top-tier venues → shortlist
- Read: `fetch_paper` for key papers → L2 reading → notes using `assets/paper-summary-template.md`
- Hand off to `research-survey` to synthesize the collected papers into a structured survey report
Workflow 2: Navigational Search
"Find me the attention is all you need paper" "Find me the original GPT 3 paper"
- Discover: If `S2_API_KEY` is available, use `scholar_search --query "Attention Is All You Need"` — single call, return top result. If the key is missing, ask whether the user wants to provide one; otherwise resolve via arXiv/DOI/web search and continue without S2.
- Output: Use Format A (Single Paper Card)
Do NOT proceed to Read unless the user explicitly asks.
Workflow 3: Quick Paper Search
"Give me papers about perovskite solar cell stability under humidity" "Find papers on gut microbiome modulation for autoimmune diseases"
- Sub-topic decomposition + query reformulation: Identify 3-5 research angles within the topic, generate 4-6 variant queries covering distinct angles (see Step 0)
- Discover: If `S2_API_KEY` is set, run `scholar_search --query "<variant>" --limit 20 --sort-by relevance` on each variant. If the key is missing, tell the user Semantic Scholar is unavailable without a key, skip `scholar_search`, and use `arxiv_monitor --keywords "<variants>" --match-mode flexible` plus web/GitHub search instead.
- Citation expansion (if initial results ≥ 3 relevant papers and `S2_API_KEY` is available): Follow Mandatory Citation Expansion (Step 0) — co-citation on highest-cited seed, forward on top 2, backward on 1-2 diverse seeds, recommend with 3 seeds
- Coverage gap check: Review collected papers against identified sub-topics. Run targeted searches for uncovered angles using S2 only when the key is available; otherwise use non-S2 sources
- Filter: Review all results, deduplicate, keep relevant papers based on title + abstract
- Output: Use Format B (Paper List Table)
Only escalate to full iterative workflow (WF1) if results are clearly insufficient or the user explicitly asks for more.
Workflow 4: Metadata Search
"2012 papers by David Harel" "Papers by David Harel from 2020 to 2022" "Journal articles by David Harel from 2020 to 2022"
- Parse query: Extract author name, year range, venue type (journal/conference)
- Discover: `author_search --name "David Harel" --papers --limit 50 --sort-by year`
- Filter: Year range, venue type (check `venue` field), other attributes
- Output: Use Format B (Paper List Table)

For keyword + year filter (no author): `scholar_search --query "<keywords>" --year-min YYYY --year-max YYYY`
Workflow 5: Track Field Developments
"What's new in condensed matter physics this week?"
- Discover: `arxiv_monitor --categories cond-mat --days 7` (see `references/arxiv-categories.md` for codes) + `trending --query "topological insulator" --period 30`
- Output: Use Format B (Paper List Table), highlight high-potential papers with TLDRs
Workflow 6: Find a Baseline with Code
"I need a baseline for protein structure prediction with code"
- Discover: `scholar_search --query "protein structure prediction" --sort-by citations`
- Evaluate: `find_code` on top results + `sota --task "protein-structure-prediction"` → pick one with official code + high downloads
- Output: Use Format C (Baseline Recommendation)
Workflow 7: Read a Paper by URL
"Read this paper: arxiv.org/abs/2301.12345"
Output: Use Format D (Reading Notes)
- Fetch: `fetch_paper --url "https://arxiv.org/abs/2301.12345"`
- Choose reading depth (see `references/reading-strategy.md`):
| Level | Goal | When to use | Effort |
|---|---|---|---|
| L1 Technical | Can reimplement | Building directly on this paper | High |
| L2 Analytical | Understand motivation + design choices | Most papers in a survey | Medium |
| L3 Contextual | Know what it is and where it fits | Quick scanning | Low |
- Take notes using `assets/paper-summary-template.md`. Save to `/artifacts/paper-notes/{paper-id}.md`.
Workflow 8: Ambiguous Query Resolution
"Find the latest about deepseek engram"
- Disambiguate: Follow Step 2 above
- Discover: `scholar_search` with resolved title + `github_search` with original term + `citation_traverse` on arXiv ID
- Evaluate: Review results, check code via `find_code` or GitHub
- Read: `fetch_paper` for top papers
- If user wants a survey: hand off to `research-survey`
Workflow 9: Ideation Support (called from research-ideation)
research-ideation Step 2 needs papers to build a literature tree
Iterative collection with strict filter (target 30-50 papers, recent 2020+). See Appendix A and Appendix B.
- Disambiguate: Parse the research goal → extract domain + method type
- Discover: Initial broad search (60 candidates) → iterative expansion up to 15 rounds:
- EXPLORE: new keyword queries for diverse sub-areas
- EXPLOIT: `citation_traverse` or `recommend` on strongly relevant papers
- Evaluate: Only keep strongly relevant papers. Prefer top-tier venues + 2020+ papers.
- Deduplicate: Track seen titles and abstracts.
- Output: 30-50 high-quality papers → feed into novelty tree + challenge-insight tree.
Workflow 10: User-Specified Paper Count
"Find me exactly 15 papers about reinforcement learning from human feedback"
- Use the user's number as the target
- Apply the closest profile's quality settings
- Run iterative collection until target met or max iterations exhausted
- If not enough papers are found, progressively relax the relevance standard and inform the user
Discovery Paths (Stage 1 Detail)
Seven paths, used by workflows above.
Path A: Keyword Search (most common)
python scripts/scholar_search.py --query "transformer attention mechanism" --limit 20 --sort-by citations
Options: --year-min/--year-max, --open-access-only, --sort-by relevance|citations|year.
Path B: Citation Traversal
# Forward — who cited this paper
python scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction forward --limit 20
# Backward — what this paper cites
python scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction backward --limit 20
# Co-citation — papers frequently cited alongside this one (most powerful for finding related work)
python scripts/citation_traverse.py --paper-id ArXiv:1706.03762 --direction co-citation --limit 15
Path C: Recommendations
python scripts/recommend.py --positive ArXiv:1706.03762,ArXiv:2005.14165 --limit 15
python scripts/recommend.py --positive ArXiv:1706.03762 --negative ArXiv:2301.00001 --limit 10
Path D: Author Tracking
python scripts/author_search.py --name "Geoffrey Hinton" --papers --limit 20 --sort-by citations
Path E: arXiv Monitoring
python scripts/arxiv_monitor.py --categories cs.CL,cs.AI --days 3 --limit 30
python scripts/arxiv_monitor.py --keywords "chain of thought,reasoning" --days 7
python scripts/arxiv_monitor.py --keywords "data pruning pretraining" --match-mode flexible --days 365
Options: --match-mode flexible (default, AND-of-words for better recall) or --match-mode exact (phrase matching for precision). See references/arxiv-categories.md for category codes.
Path F: Trending Detection
python scripts/trending.py --query "large language models" --period 90 --limit 15
Ranks by citation velocity (citations/month).
Path G: GitHub Search
python scripts/github_search.py --query "deepseek engram" --limit 10
python scripts/github_search.py --query "mamba state space model" --sort stars
Useful when papers haven't been published on arXiv yet or industry labs release code before papers.
Citation Graph Visualization
After traversal, visualize with Mermaid (keep ≤30 nodes):
graph TD
SEED["Attention Is All You Need<br/>2017 · 100k+"]
A["BERT · 2018"] --> SEED
B["GPT-2 · 2019"] --> SEED
C["Vision Transformer · 2020"] --> SEED
Evaluation Tools (Stage 2 Detail)
Quick Assessment (from scholar_search output)
| Signal | What it tells you |
|---|---|
| TLDR | One-sentence understanding |
| Citation count | Overall impact |
| Influential citations | Quality of impact |
| Year + venue | Recency and authority |
| Open Access PDF | Whether you can read full text |
Code Availability
python scripts/find_code.py --arxiv-id 1706.03762
Top Models by Task
python scripts/sota.py --task "text-generation" --limit 10
python scripts/sota.py --task "translation" --list-tasks
Dataset Discovery
python scripts/dataset_search.py --query "sentiment analysis" --limit 10
Reproducibility Assessment
| Dimension | Check |
|---|---|
| Code | Open-source? Official? Stars? Last update? |
| Results | Reproduced on SOTA leaderboard? |
| Data | Dataset publicly available? |
| Overall | High / Medium / Low / None |
After Collecting Papers: Next Steps
| Goal | Hand off to |
|---|---|
| Generate a literature survey report | research-survey — synthesizes papers into a structured 8-section report |
| Generate research ideas | research-ideation — builds novelty tree + challenge-insight tree from papers |
| Write a Related Work section | paper-writing — uses paper notes as input |
Quick Report (optional, stays in paper-navigator)
For a brief summary table without a full survey report, use literature_report.py:
python scripts/literature_report.py --paper-ids ArXiv:2601.07372,ArXiv:2501.12948 --intent quick_scan
| Intent | Output |
|---|---|
| `quick_scan` | Brief table: title, authors, year, citations, TLDR |
| `baseline_hunt` | Code availability, SOTA position, dataset access, reproducibility |
For full survey reports (survey, deep_dive intents), use research-survey instead.
Appendix A: Iterative Collection Workflow
For workflows requiring many papers (survey, ideation support), use iterative expand-and-filter:
1. Parse query → extract goal, search terms, key term definitions
2. Define task attributes → identify domain + method type
3. Initial search → scholar_search with broad query
4. Review each paper's title + abstract → judge relevance (keep/reject)
5. LOOP until target met or max iterations reached:
a. From kept papers, pick the most relevant as "grounding set"
b. Generate next search query:
- EXPLORE: new keyword query to broaden coverage
- EXPLOIT: citation_traverse or recommend on a high-relevance paper
c. Fetch new papers → review → deduplicate → add to collection
6. Final filter: apply quality checks, take top N
Relevance judging: You (Claude) evaluate each paper directly from title + abstract against the user's goal. No separate API call needed.
Deduplication: Track seen titles (normalized) and abstract prefixes. Skip already-evaluated papers.
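A sketch of the normalization behind that dedup tracking (illustrative; the exact rules may differ in practice):

```python
import re

seen_titles: set[str] = set()
seen_abstract_prefixes: set[str] = set()

def normalize(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def is_duplicate(title: str, abstract: str) -> bool:
    """Flag a paper whose normalized title or abstract prefix was already seen."""
    t = normalize(title)
    a = normalize(abstract)[:200]
    if t in seen_titles or (a and a in seen_abstract_prefixes):
        return True
    seen_titles.add(t)
    if a:
        seen_abstract_prefixes.add(a)
    return False
```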
Quality filtering:
- Skip papers with very short abstracts (< 20 words)
- For ideation/survey: prefer top-tier venues and journals in the user's field (e.g., Nature, Science, Cell, Lancet, PNAS for broad science; field-specific top venues like NeurIPS/ICML for ML, Physical Review Letters for physics, JACS for chemistry, etc.)
- For ideation: prefer 2020+ papers; include older only if foundational
Appendix B: Ideation vs Survey Collection
| Aspect | Ideation Support | Literature Survey |
|---|---|---|
| Goal | Find gaps and transferable techniques | Comprehensive field coverage |
| Relevance standard | Strict — only strongly relevant | Moderate — include tangentially relevant |
| Recency | Strong bias toward 2020+ | Include foundational older work |
| Initial search size | 60 candidates | 20 candidates |
| Coverage strategy | Deep on core topic + cross-domain | Balanced across sub-topics |
| Output use | Novelty tree + challenge-insight tree | Comprehensive report |
Appendix C: Script & API Reference
All scripts output Markdown to stdout, errors to stderr. Common flags: --limit N, --json.
Paper ID Formats
Scripts accept and normalize automatically: S2 ID, arXiv (ArXiv:1706.03762 or 1706.03762 or URL), DOI (DOI:10.18653/v1/N18-3011).
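A sketch of the kind of normalization this implies (illustrative; the scripts' actual parsing may differ):

```python
import re

def to_paper_id(raw: str) -> str:
    """Map an arXiv URL/ID, DOI, or S2 ID to a prefixed paper ID."""
    raw = raw.strip()
    m = re.search(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})", raw, re.IGNORECASE)
    if m:                                      # arXiv URL
        return f"ArXiv:{m.group(1)}"
    if re.fullmatch(r"\d{4}\.\d{4,5}", raw):   # bare arXiv ID
        return f"ArXiv:{raw}"
    if raw.lower().startswith("doi:") or raw.startswith("10."):
        return f"DOI:{raw[4:] if raw.lower().startswith('doi:') else raw}"
    return raw                                 # assume S2 ID or already prefixed
```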
Rate Limits
| API | Without key | With key | When rate limited |
|---|---|---|---|
| Semantic Scholar | Disabled — set `S2_API_KEY` to enable | 100 req/min; parallel OK | `scholar_search` auto-falls back to arXiv; `citation_traverse`/`recommend` require the key |
| arXiv | 1 req/3s (courtesy) | N/A | Primary fallback when S2 is limited; no auth needed |
| Jina Reader | Free tier | Higher with key | — |
| HuggingFace | 500 req / 300s | Higher with `HF_TOKEN` | — |
| GitHub | 10 req/min | 5,000 req/hr (set `GITHUB_TOKEN`) | — |
All scripts retry on 429 and 5xx errors with exponential backoff (3s, 6s, 12s, 24s, 48s — 5 retries). A global S2 request pacer enforces minimum interval between Semantic Scholar API calls to prevent budget exhaustion.
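A sketch of that retry behavior (illustrative; the scripts implement their own version):

```python
import time

def with_backoff(call, retries: int = 5, base: float = 3.0):
    """Initial attempt plus up to 5 retries, sleeping 3s, 6s, 12s, 24s, 48s."""
    for attempt in range(retries + 1):
        try:
            return call()
        except RuntimeError:              # stand-in for an HTTP 429/5xx response
            if attempt == retries:
                raise
            time.sleep(base * (2 ** attempt))
```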
For detailed API endpoints, query parameters, and field specifications, see references/api-reference.md.
Integration
- research-survey: After collecting papers, hand off to research-survey for structured survey report generation (8-section goal-centric synthesis).
- research-ideation: After collecting papers, hand off to research-ideation for idea generation (novelty tree + challenge-insight tree + problem selection + solution design).
- experiment-pipeline: After finding a baseline via Workflow 6, hand off to experiment-pipeline.
- paper-writing: Paper notes serve as input for paper-writing's Related Work section.