semantic-scholar
Semantic Scholar Paper Search
Search topic or paper ID: $ARGUMENTS
Role & Positioning
This skill is the published-venue counterpart to /arxiv:
| Skill | Source | Best for |
|---|---|---|
| /arxiv | arXiv API | Latest preprints, cutting-edge unrefereed work |
| /semantic-scholar | Semantic Scholar API | Published journal/conference papers (IEEE, ACM, Springer, etc.) with citation counts, venue info, TLDR |
Do NOT duplicate arXiv's job. If results contain an externalIds.ArXiv field, the paper is also on arXiv — note this but do not re-fetch from arXiv.
Constants
- MAX_RESULTS = 10: Default number of search results.
- FETCH_SCRIPT: tools/semantic_scholar_fetch.py, relative to the project root. Fall back to inline Python if not found.
- DEFAULT_FILTERS: For general research queries, apply these by default to reduce noise:
  - --fields-of-study "Computer Science,Engineering"
  - --publication-types JournalArticle,Conference
Overrides (append to arguments):
- /semantic-scholar "topic" - max: 20 (return up to 20 results)
- /semantic-scholar "topic" - type: journal (only journal articles)
- /semantic-scholar "topic" - type: conference (only conference papers)
- /semantic-scholar "topic" - min-citations: 50 (only highly-cited papers)
- /semantic-scholar "topic" - year: 2022- (papers from 2022 onward)
- /semantic-scholar "topic" - fields: all (remove default field-of-study filter)
- /semantic-scholar "topic" - sort: citations (bulk search sorted by citation count)
- /semantic-scholar "DOI:10.1109/..." (fetch a single paper by DOI)
Workflow
Step 1: Parse Arguments
Parse $ARGUMENTS for directives:
- Query or ID: main search term, or a paper identifier:
  - DOI: 10.1109/TWC.2024.1234567
  - Semantic Scholar ID: f9314fd99be5f2b1b3efcfab87197d578160d553
  - ArXiv: ARXIV:2006.10685
  - Corpus: CorpusId:219792180
- max: N: override MAX_RESULTS
- type: journal|conference|review|all: map to --publication-types
- min-citations: N: map to --min-citations
- year: RANGE: map to --year (e.g. 2022-, 2020-2024)
- fields: FIELDS: override --fields-of-study (use all to remove filter)
- sort: citations|date: use search-bulk with --sort citationCount:desc or publicationDate:desc
If the argument matches a DOI pattern (10.XXXX/...), a Semantic Scholar ID (40-char hex), or a prefixed ID (ARXIV:..., CorpusId:...), skip search and go directly to Step 3.
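The parsing rules above can be sketched in Python as follows. This is only an illustration: the regular expressions and the helper names is_paper_id and parse_arguments are assumptions, not part of the fetch script.

```python
import re

# Patterns follow the identifier shapes listed above; the exact regexes are assumptions.
DOI_RE = re.compile(r"^(?:DOI:)?10\.\d{4,9}/\S+$", re.IGNORECASE)
S2_ID_RE = re.compile(r"^[0-9a-f]{40}$", re.IGNORECASE)
PREFIXED_RE = re.compile(r"^(?:ARXIV|CorpusId):\S+$", re.IGNORECASE)
# A directive looks like " - key: value"; the value may be double-quoted.
DIRECTIVE_RE = re.compile(
    r'\s-\s*(max|type|min-citations|year|fields|sort):\s*("[^"]*"|\S+)'
)


def is_paper_id(text):
    """Return True when the text looks like a single-paper identifier."""
    text = text.strip()
    return any(p.match(text) for p in (DOI_RE, S2_ID_RE, PREFIXED_RE))


def parse_arguments(raw):
    """Split raw arguments into the query (or ID) and a dict of directives."""
    directives = {k: v.strip('"') for k, v in DIRECTIVE_RE.findall(raw)}
    query = DIRECTIVE_RE.sub("", raw).strip().strip('"')
    return {"query": query, "directives": directives, "is_id": is_paper_id(query)}
```

When is_id is true, the caller would skip the search and go straight to Step 3.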
Step 2: Search Papers
Locate the fetch script:
SCRIPT=$(find tools/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
[ -z "$SCRIPT" ] && SCRIPT=$(find ~/.claude/skills/semantic-scholar/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
Standard search (default — relevance-ranked):
python3 "$SCRIPT" search "QUERY" --max MAX_RESULTS \
--fields-of-study "Computer Science,Engineering" \
--publication-types JournalArticle,Conference
Bulk search (when - sort: is specified, or MAX_RESULTS > 100):
python3 "$SCRIPT" search-bulk "QUERY" --max MAX_RESULTS \
--sort citationCount:desc \
--fields-of-study "Computer Science" \
--year "2020-"
If semantic_scholar_fetch.py is not found, fall back to inline Python using urllib against https://api.semanticscholar.org/graph/v1/paper/search.
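A minimal sketch of that inline fallback is shown here. The query parameter names (fieldsOfStudy, publicationTypes, year, minCitationCount), the x-api-key header, and the FIELDS list are written from memory of the public S2 Graph API and should be checked against its documentation; build_search_url and search are hypothetical helper names.

```python
import json
import os
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
# Assumed field list, chosen to cover what the later steps display.
FIELDS = (
    "title,venue,publicationVenue,year,citationCount,authors,"
    "externalIds,openAccessPdf,tldr,abstract,fieldsOfStudy"
)


def build_search_url(query, limit=10, fields_of_study=None,
                     publication_types=None, year=None, min_citations=None):
    """Build the relevance-search URL; filters left as None are omitted."""
    params = {"query": query, "limit": limit, "fields": FIELDS}
    if fields_of_study:
        params["fieldsOfStudy"] = fields_of_study
    if publication_types:
        params["publicationTypes"] = publication_types
    if year:
        params["year"] = year
    if min_citations is not None:
        params["minCitationCount"] = min_citations
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def search(query, **filters):
    """Run the search and return the list of paper dicts (network call)."""
    request = urllib.request.Request(build_search_url(query, **filters))
    api_key = os.environ.get("SEMANTIC_SCHOLAR_API_KEY")
    if api_key:
        request.add_header("x-api-key", api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("data", [])
```

A call such as search("semantic communication", fields_of_study="Computer Science,Engineering", publication_types="JournalArticle,Conference") would mirror the default filters.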
Recommended filter combos (from testing):
| Goal | Flags |
|---|---|
| High-quality journal papers | --publication-types JournalArticle --min-citations 10 |
| CS/EE papers, recent | --fields-of-study "Computer Science,Engineering" --year "2022-" |
| Foundational / high-impact | search-bulk --sort citationCount:desc --fields-of-study "Computer Science" |
| Conference papers only | --publication-types Conference |
Note: --venue requires exact venue names (e.g. "IEEE Transactions on Signal Processing"), not partial matches like "IEEE". Avoid using --venue in automated flows; prefer --publication-types + --fields-of-study.
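One way to turn parsed directives into script arguments is sketched below. The subcommands and flags are the ones shown in this step; the helper name build_script_args, the directives dict shape, and the "Review" publication type value are assumptions.

```python
DEFAULT_FIELDS = "Computer Science,Engineering"
DEFAULT_TYPES = "JournalArticle,Conference"
# "Review" is an assumed value; the other two appear in the commands above.
TYPE_MAP = {"journal": "JournalArticle", "conference": "Conference", "review": "Review"}
SORT_MAP = {"citations": "citationCount:desc", "date": "publicationDate:desc"}


def build_script_args(query, directives, max_results=10):
    """Return the argument list to pass after the fetch script path."""
    limit = int(directives.get("max", max_results))
    sort = directives.get("sort")
    # Bulk search is used when a sort is requested or more than 100 results are wanted.
    use_bulk = sort is not None or limit > 100
    args = ["search-bulk" if use_bulk else "search", query, "--max", str(limit)]
    if sort:
        args += ["--sort", SORT_MAP[sort]]  # raises KeyError on an unknown sort key
    fields = directives.get("fields", DEFAULT_FIELDS)
    if fields != "all":
        args += ["--fields-of-study", fields]
    paper_type = directives.get("type")
    if paper_type is None:
        args += ["--publication-types", DEFAULT_TYPES]
    elif paper_type != "all":
        args += ["--publication-types", TYPE_MAP[paper_type]]
    if "min-citations" in directives:
        args += ["--min-citations", str(directives["min-citations"])]
    if "year" in directives:
        args += ["--year", directives["year"]]
    return args
```

The result can be passed to subprocess.run as ["python3", script_path, *args], which avoids shell quoting problems with multi-word queries.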
Step 3: Fetch Details for a Specific Paper
When a single paper ID is requested:
python3 "$SCRIPT" paper "PAPER_ID"
Where PAPER_ID can be:
- DOI: 10.1109/TSP.2021.3071210
- ArXiv: ARXIV:2006.10685
- CorpusId: CorpusId:219792180
- S2 ID: f9314fd99be5f2b1b3efcfab87197d578160d553
Step 4: De-duplicate Against arXiv
For each result, check externalIds.ArXiv:
- If present → paper is also on arXiv. Note this in output but do NOT re-fetch via /arxiv.
- If absent → paper is venue-only (e.g. IEEE without preprint). This is the unique value of this skill.
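The check above amounts to a simple partition of the results. A minimal sketch, assuming each result is a dict shaped like the S2 response (split_by_arxiv is a hypothetical helper name):

```python
def split_by_arxiv(papers):
    """Partition results into (also_on_arxiv, venue_only) without any re-fetching."""
    also_on_arxiv, venue_only = [], []
    for paper in papers:
        # externalIds may be missing or null for some records.
        external_ids = paper.get("externalIds") or {}
        if external_ids.get("ArXiv"):
            also_on_arxiv.append(paper)
        else:
            venue_only.append(paper)
    return also_on_arxiv, venue_only
```

The length of venue_only gives the "N papers are venue-only" figure reported in the final output.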
Step 5: Present Results
Present results as a table:
| # | Title | Venue | Year | Citations | Authors | Type |
|---|-------|-------|------|-----------|---------|------|
| 1 | Deep Learning Enabled... | IEEE Trans. Signal Process. | 2021 | 1364 | Xie et al. | Journal |
For each paper, also show:
- DOI link: https://doi.org/DOI (for IEEE/ACM papers, this is the canonical link)
- Open Access PDF: if openAccessPdf.url is non-empty, show it
- TLDR: if available, show the one-line summary
- Also on arXiv: if externalIds.ArXiv exists, note the arXiv ID
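A table row and the one-line summary could be produced roughly as follows. The helper names and the "Surname et al." shortening are assumptions; the abstract fallback follows the TLDR rule under Key Rules.

```python
import re


def short_authors(authors):
    """Return 'Surname' or 'Surname et al.' from a list of S2 author dicts."""
    names = [n for n in ((a.get("name") or "").strip() for a in authors or []) if n]
    if not names:
        return "Unknown"
    surname = names[0].split()[-1]
    return surname if len(names) == 1 else f"{surname} et al."


def table_row(index, paper):
    """Format one result as a row of the table shown above."""
    venue_type = (paper.get("publicationVenue") or {}).get("type") or ""
    cells = [
        str(index),
        (paper.get("title") or "").replace("|", "\\|"),  # keep pipes from breaking the table
        paper.get("venue") or "",
        str(paper.get("year") or ""),
        str(paper.get("citationCount") or 0),
        short_authors(paper.get("authors")),
        venue_type.capitalize() or "Unknown",
    ]
    return "| " + " | ".join(cells) + " |"


def one_line_summary(paper):
    """Prefer tldr.text; otherwise use the first sentence of the abstract."""
    tldr = (paper.get("tldr") or {}).get("text")
    if tldr:
        return tldr
    abstract = (paper.get("abstract") or "").strip()
    return re.split(r"(?<=[.!?])\s+", abstract, maxsplit=1)[0]
```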
Step 6: Detailed Summary
For each paper (or top 5 if many results):
## [Title]
- **Venue**: [venue name] ([publicationVenue.type]: journal/conference)
- **Year**: [year] | **Citations**: [citationCount]
- **Authors**: [full author list]
- **DOI**: [doi link]
- **Fields**: [fieldsOfStudy]
- **TLDR**: [tldr.text if available]
- **Abstract**: [abstract]
- **Open Access**: [openAccessPdf.url or "Not available"]
- **Also on arXiv**: [ArXiv ID if exists, else "No"]
Step 7: Update Research Wiki (if active)
Required when research-wiki/ exists in the project; skip silently
otherwise. Ingest the papers presented to the user. For results with an
externalIds.ArXiv field, use --arxiv-id; for venue-only papers (no
arXiv mirror — common for IEEE/ACM), fall back to manual metadata:
if [ -d research-wiki/ ]:
    for each paper in results:
        if paper.externalIds.ArXiv:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --arxiv-id "<ArXiv>"
        else:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --title "<title>" --authors "<authors joined by , >" \
                --year <year> --venue "<venue>" \
                [--external-id-doi "<externalIds.DOI>"]
The helper handles slug / dedup / page / index / log — do not
handwrite papers/<slug>.md. See
shared-references/integration-contract.md.
Backfill with /research-wiki sync --arxiv-ids <id1>,<id2>,... for
arXiv-available papers.
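The ingest branching in this step can be expressed as a small command builder. This is a sketch only: build_ingest_command is a hypothetical helper, the paper dict is assumed to follow the S2 response shape, and the flags are the ones listed in the pseudocode for this step.

```python
def build_ingest_command(paper, wiki_dir="research-wiki/"):
    """Return the argv list for ingesting one paper into the research wiki."""
    base = ["python3", "tools/research_wiki.py", "ingest_paper", wiki_dir]
    external_ids = paper.get("externalIds") or {}
    arxiv_id = external_ids.get("ArXiv")
    if arxiv_id:
        # arXiv-mirrored paper: the helper only needs the arXiv ID.
        return base + ["--arxiv-id", arxiv_id]
    # Venue-only paper: fall back to manual metadata.
    authors = ", ".join(a["name"] for a in paper.get("authors") or [] if a.get("name"))
    command = base + [
        "--title", paper.get("title") or "",
        "--authors", authors,
        "--year", str(paper.get("year") or ""),
        "--venue", paper.get("venue") or "",
    ]
    doi = external_ids.get("DOI")
    if doi:
        command += ["--external-id-doi", doi]
    return command
```

A caller would run each command with subprocess.run(command, check=True), and only after confirming that research-wiki/ exists.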
Step 8: Final Output
Summarize what was done:
- Found N published papers for "query"
- Filters applied: [publication types, fields, year range, etc.]
- N papers are venue-only (not on arXiv)
- Wiki-ingested N papers (if research-wiki/ was present)
Suggest follow-up skills:
/arxiv "topic" - search arXiv preprints (complements this search)
/research-lit "topic" - multi-source review: Zotero + local PDFs + arXiv + S2
/novelty-check "idea" - verify novelty against literature
Key Rules
- Default to filtered search: Always apply --fields-of-study and --publication-types unless the user says - fields: all. Without filters, S2 returns cross-discipline noise (linguistics, psychology, etc.).
- Citation count is gold: S2's citation data is its main advantage over arXiv. Always show citationCount prominently and use it to rank/prioritize results.
- Venue metadata matters: Show venue and publicationVenue.type (journal vs conference); this helps users assess paper quality.
- DOI is the canonical ID for published papers: Always show DOI links for IEEE/ACM/Springer papers.
- Rate limiting: The S2 API without a key is heavily rate-limited (~1 req/s, strict cooldown). If HTTP 429 occurs, wait and retry. Recommend users set the SEMANTIC_SCHOLAR_API_KEY env var for higher limits (free at https://www.semanticscholar.org/product/api#api-key-form).
- TLDR may be null: Papers from some publishers (notably IEEE) often come back without a TLDR. Fall back to showing the first sentence of the abstract.
- openAccessPdf may be empty: Many IEEE papers are closed access. Always provide the DOI link as fallback.
- If the S2 API is unreachable, suggest using /arxiv or /research-lit "topic" - sources: web as fallback.
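The "wait and retry" advice for HTTP 429 could be implemented with a simple exponential backoff, sketched below. The attempt count and delays are assumptions, since the rule above only says to wait and retry; fetch_with_retry is a hypothetical helper, and the opener and sleep parameters exist so the logic can be exercised without network access.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, max_attempts=4, base_delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL, backing off exponentially when the server answers 429."""
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Re-raise anything that is not a rate limit, or the final failure.
            if error.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s with the defaults
```

If the final attempt still fails, the exception propagates, at which point the caller can suggest the fallback skills named in the last rule.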