bioRxiv Database
A Python toolkit for programmatic access to bioRxiv preprints. Supports comprehensive metadata retrieval with structured JSON output for integration into research workflows.
Use Cases
- Query recent preprints by topic or research domain
- Monitor publications from specific researchers
- Perform systematic literature reviews
- Analyze publication trends across time periods
- Retrieve citation metadata and DOIs
- Download preprint PDFs for text analysis
- Filter results by subject category
Quick Start
# Install dependencies
pip install requests
# Search by keywords
python scripts/biorxiv_client.py --terms "protein folding" --recent 30 --out results.json
# Search by author
python scripts/biorxiv_client.py --author "Chen" --recent 180
# Get specific paper by DOI
python scripts/biorxiv_client.py --doi "10.1101/2024.05.22.594321"
# Download PDF
python scripts/biorxiv_client.py --doi "10.1101/2024.05.22.594321" --fetch-pdf paper.pdf
Command-Line Options
| Option | Description |
|---|---|
| -t, --terms | Search keywords (multiple allowed) |
| -a, --author | Author name to search |
| --doi | Specific DOI to retrieve |
| --since | Start date (YYYY-MM-DD) |
| --until | End date (YYYY-MM-DD) |
| --recent | Search last N days |
| -s, --subject | Subject category filter |
| --fields | Fields to search: title, abstract, authors |
| -o, --out | Output file (default: stdout) |
| --max | Maximum results to return |
| --fetch-pdf | Download PDF (requires --doi) |
| -v, --verbose | Enable debug output |
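When the CLI is driven from another Python script, the options above can be assembled into an argument list. The sketch below uses only flags listed in the table and the script path from the Quick Start. `build_command` is a hypothetical helper, not part of the toolkit, and it assumes that several keywords may follow a single --terms flag, which "multiple allowed" suggests but does not confirm.

```python
import sys
from typing import List, Optional


def build_command(
    terms: Optional[List[str]] = None,
    doi: Optional[str] = None,
    recent: Optional[int] = None,
    subject: Optional[str] = None,
    out: Optional[str] = None,
    fetch_pdf: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for scripts/biorxiv_client.py (hypothetical helper)."""
    if fetch_pdf and not doi:
        # The options table states that --fetch-pdf requires --doi.
        raise ValueError("--fetch-pdf requires --doi")

    cmd = [sys.executable, "scripts/biorxiv_client.py"]
    if terms:
        # Assumption: several keywords follow a single --terms flag.
        cmd += ["--terms", *terms]
    if doi:
        cmd += ["--doi", doi]
    if recent is not None:
        cmd += ["--recent", str(recent)]
    if subject:
        cmd += ["--subject", subject]
    if out:
        cmd += ["--out", out]
    if fetch_pdf:
        cmd += ["--fetch-pdf", fetch_pdf]
    return cmd


# Mirrors the first Quick Start command.
cmd = build_command(terms=["protein folding"], recent=30, out="results.json")
print(cmd[1:])
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the repository root. Building a list rather than a shell string avoids quoting problems with multi-word keywords.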
Programmatic API
from scripts.biorxiv_client import PreprintClient
client = PreprintClient(debug=True)
# Search by keywords
results = client.find_by_terms(
    terms=["enzyme engineering"],
    since="2024-01-01",
    until="2024-12-31",
    subject="biochemistry"
)
# Search by author
papers = client.find_by_author(name="Garcia", since="2023-01-01")
# Get paper by DOI
metadata = client.get_by_doi("10.1101/2024.05.22.594321")
# Download PDF
client.fetch_pdf(doi="10.1101/2024.05.22.594321", destination="paper.pdf")
# Normalize output
formatted = client.normalize(metadata, include_abstract=True)
Subject Categories
| Category | Category |
|---|---|
| animal-behavior-and-cognition | molecular-biology |
| biochemistry | neuroscience |
| bioengineering | paleontology |
| bioinformatics | pathology |
| biophysics | pharmacology-and-toxicology |
| cancer-biology | physiology |
| cell-biology | plant-biology |
| clinical-trials | scientific-communication-and-education |
| developmental-biology | synthetic-biology |
| ecology | systems-biology |
| epidemiology | zoology |
| evolutionary-biology | |
| genetics | |
| genomics | |
| immunology | |
| microbiology | |
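A misspelled category is easy to miss, so it can help to check a subject against the table above before sending a query. The following is a minimal sketch: `SUBJECTS` is copied from the table, and `normalize_subject` is a hypothetical helper, not a function provided by the toolkit.

```python
import difflib

# Category slugs copied from the Subject Categories table.
SUBJECTS = frozenset({
    "animal-behavior-and-cognition", "biochemistry", "bioengineering",
    "bioinformatics", "biophysics", "cancer-biology", "cell-biology",
    "clinical-trials", "developmental-biology", "ecology", "epidemiology",
    "evolutionary-biology", "genetics", "genomics", "immunology",
    "microbiology", "molecular-biology", "neuroscience", "paleontology",
    "pathology", "pharmacology-and-toxicology", "physiology", "plant-biology",
    "scientific-communication-and-education", "synthetic-biology",
    "systems-biology", "zoology",
})


def normalize_subject(name: str) -> str:
    """Return the slug form of a subject, or raise ValueError if unknown."""
    slug = "-".join(name.strip().lower().split())
    if slug not in SUBJECTS:
        # Offer near matches to make typos easier to spot.
        hints = difflib.get_close_matches(slug, sorted(SUBJECTS), n=3)
        raise ValueError(f"unknown subject {name!r}; close matches: {hints}")
    return slug


print(normalize_subject("Cell Biology"))
```

The returned slug can then be passed to --subject or to the `subject` argument of `find_by_terms`.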
Response Structure
{
  "query": {
    "terms": ["protein folding"],
    "since": "2024-03-01",
    "until": "2024-09-30",
    "subject": "biophysics"
  },
  "count": 87,
  "papers": [
    {
      "doi": "10.1101/2024.05.22.594321",
      "title": "Example Preprint Title",
      "authors": "Chen L, Patel R, Kim S",
      "corresponding_author": "Chen L",
      "institution": "Research Institute",
      "posted": "2024-05-22",
      "revision": "1",
      "category": "biophysics",
      "license": "cc_by",
      "paper_type": "new results",
      "abstract": "Abstract content here...",
      "pdf_link": "https://www.biorxiv.org/content/10.1101/2024.05.22.594321v1.full.pdf",
      "web_link": "https://www.biorxiv.org/content/10.1101/2024.05.22.594321v1",
      "journal_ref": ""
    }
  ]
}
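A saved response with this shape can be reduced to a compact table for downstream analysis. The sketch below assumes the field names shown above and the comma-separated `authors` string from the sample. `version_from_link` and `summarize` are hypothetical helpers, and the regular expression relies on links ending in `v<N>` or `v<N>.full.pdf` as in the example.

```python
import re
from typing import List, Optional

# Matches a trailing version marker such as "v1" or "v1.full.pdf".
VERSION_RE = re.compile(r"v(\d+)(?:\.full\.pdf)?$")


def version_from_link(link: str) -> Optional[int]:
    """Extract the version number encoded at the end of a bioRxiv link."""
    match = VERSION_RE.search(link)
    return int(match.group(1)) if match else None


def summarize(response: dict) -> List[dict]:
    """Keep one small record per paper in a response dictionary."""
    rows = []
    for paper in response.get("papers", []):
        rows.append({
            "doi": paper["doi"],
            # Assumption: authors are comma-separated, as in the sample.
            "first_author": paper["authors"].split(",")[0].strip(),
            "posted": paper["posted"],
            "version": version_from_link(paper.get("pdf_link", "")),
        })
    return rows


# Trimmed copy of the sample response above.
sample = {
    "count": 1,
    "papers": [{
        "doi": "10.1101/2024.05.22.594321",
        "authors": "Chen L, Patel R, Kim S",
        "posted": "2024-05-22",
        "pdf_link": "https://www.biorxiv.org/content/10.1101/2024.05.22.594321v1.full.pdf",
    }],
}
print(summarize(sample))
```

For a file written with --out, `json.load` on the opened file yields the dictionary that `summarize` expects.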
Best Practices
| Recommendation | Details |
|---|---|
| Date ranges | Narrow ranges improve response time. Split large queries into chunks. |
| Category filters | Use --subject to reduce bandwidth and improve precision. |
| Rate limiting | The client waits 0.5 s between requests by default. Increase the delay for bulk operations. |
| Result caching | Save JSON outputs to avoid redundant API calls. |
| Version awareness | Preprints may have multiple versions. PDF URLs encode version numbers. |
| Error checking | Check the count field in each output. A count of zero may point to a bad date range or a connectivity problem rather than a genuine absence of papers. |
| Debug mode | Use --verbose for detailed request/response logging. |
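The advice on date ranges can be applied by splitting a long interval into consecutive windows and querying each one in turn. This is a minimal sketch using only the standard library. `date_chunks` is a hypothetical helper, and the 30-day default is an arbitrary choice rather than a limit documented by the toolkit.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def date_chunks(since: str, until: str, days: int = 30) -> Iterator[Tuple[str, str]]:
    """Yield inclusive (start, end) windows in YYYY-MM-DD form covering since..until."""
    if days < 1:
        raise ValueError("days must be at least 1")
    start = date.fromisoformat(since)
    end = date.fromisoformat(until)
    while start <= end:
        # Each window spans `days` calendar days, clipped to the overall end.
        stop = min(start + timedelta(days=days - 1), end)
        yield start.isoformat(), stop.isoformat()
        start = stop + timedelta(days=1)


for window in date_chunks("2024-01-01", "2024-03-01"):
    print(window)
```

Each window can be passed as the `since` and `until` arguments of `find_by_terms`, with an extra `time.sleep` between calls for bulk runs. Saving each window's JSON to its own file also supports the caching recommendation.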
Reference Files
| File | Contents |
|---|---|
| api-reference.md | Complete bioRxiv REST API documentation |
| examples.md | Extended code examples and workflow patterns |
Repository
aminoanalytica/…a-skills
First Seen
Feb 25, 2026