crossref-doi
Crossref DOI Skill
Resolves DOIs and searches scholarly publication metadata via the Crossref REST API (api.crossref.org). Covers 140M+ DOI-registered works including journal articles, conference proceedings, books, datasets, and reports.
Authentication
No API key required. Add mailto=user@example.com to all requests
for access to the polite pool (faster rate limits, priority routing).
Resolution order for the mailto address:
~/.config/crossref/credentials— parsemailto=user@example.comCROSSREF_MAILTOenv var — fallback- Omit — still works but hits the public pool (slower, rate-limited)
Reading the credentials file (bash):
MAILTO=$(grep '^mailto=' ~/.config/crossref/credentials 2>/dev/null | cut -d= -f2)
[ -z "$MAILTO" ] && MAILTO="${CROSSREF_MAILTO}"
This is not a secret credential — it is a contact email sent in the clear so Crossref can reach you if your usage is problematic. Any valid email works.
API Structure
Base URL: https://api.crossref.org/
Core Endpoints
| Endpoint | Description |
|---|---|
works/{doi} |
Get metadata for a specific DOI |
works?query=... |
Full-text search across all metadata fields |
works?query.bibliographic=... |
Bibliographic-style search (title + author + year) |
works?query.title=... |
Title-only search |
works?query.author=... |
Author-only search |
types |
List all work types (journal-article, proceedings-article, etc.) |
journals/{issn} |
Journal metadata by ISSN |
journals/{issn}/works |
Works published in a specific journal |
prefixes/{prefix} |
Publisher info by DOI prefix |
members/{id} |
Publisher/member metadata |
funders/{id} |
Funder metadata |
funders/{id}/works |
Works funded by a specific funder |
Query Parameters
| Parameter | Example | Notes |
|---|---|---|
query |
query=lithium+brine |
Full-text search across all fields |
query.bibliographic |
query.bibliographic=Smith+2023+lithium |
Simulates bibliographic lookup |
query.title |
query.title=direct+lithium+extraction |
Title search only |
query.author |
query.author=Stringfellow |
Author name search |
filter |
filter=from-pub-date:2023-01-01 |
Comma-separated filter list |
select |
select=DOI,title,author,type |
Return only specified fields (reduces payload) |
rows |
rows=20 |
Results per page (default 20, max 1000) |
offset |
offset=20 |
Pagination offset (max 10000) |
cursor |
cursor=* |
Deep paging cursor (use for offset beyond 10000) |
sort |
sort=relevance |
Sort field |
order |
order=desc |
Sort direction |
mailto |
mailto=you@example.com |
Polite pool access (faster rate limits) |
Filter Parameters
Filters are passed as a comma-separated list in the filter parameter:
| Filter | Example | Description |
|---|---|---|
from-pub-date |
from-pub-date:2023-01-01 |
Published on or after date |
until-pub-date |
until-pub-date:2024-12-31 |
Published on or before date |
type |
type:journal-article |
Work type |
has-abstract |
has-abstract:true |
Only works with abstracts |
has-orcid |
has-orcid:true |
Authors with ORCID IDs |
has-references |
has-references:true |
Works with reference lists |
has-full-text |
has-full-text:true |
Works with full-text links |
is-update |
is-update:false |
Exclude corrections/errata |
funder |
funder:10.13039/100000015 |
DOE funder ID |
container-title |
container-title:Water+Resources+Research |
Journal name |
doi |
doi:10.1021/acs.est.2c03513 |
Exact DOI match |
issn |
issn:0043-1397 |
Filter by journal ISSN |
Common Work Types
| Type | Description |
|---|---|
journal-article |
Peer-reviewed journal article |
proceedings-article |
Conference paper |
book-chapter |
Chapter in an edited volume |
dataset |
Data publication |
report |
Technical report |
posted-content |
Preprint |
monograph |
Book/monograph |
Sort Fields
| Sort | Description |
|---|---|
relevance |
Default — Crossref relevance score |
published |
Publication date |
indexed |
Date indexed in Crossref |
is-referenced-by-count |
Citation count (descending = most cited) |
references-count |
Number of references in bibliography |
score |
Internal relevance score |
Workflow
Step 1 -- Resolve Intent
Determine the operation type:
| User Intent | Endpoint | Key Parameters |
|---|---|---|
| "Look up DOI 10.xxxx/yyyy" | works/{doi} |
Direct DOI lookup |
| "Find papers about X" | works?query=X |
Keyword search |
| "Papers by Author on X" | works?query=X&query.author=Author |
Combined search |
| "Recent journal articles on X" | works?query=X&filter=type:journal-article,from-pub-date:YYYY |
Filtered search |
| "Most cited papers on X" | works?query=X&sort=is-referenced-by-count&order=desc |
Sorted search |
| "Resolve DOI from USGS/OSTI" | works/{doi} |
Cross-skill DOI resolution |
Step 2 -- Build and Execute Request
Single DOI lookup:
DOI="10.2118/1011-0046-jpt"
MAILTO="user@example.com"
curl -s "https://api.crossref.org/works/${DOI}?mailto=${MAILTO}"
Keyword search with filters:
curl -s "https://api.crossref.org/works?query=lithium+produced+water+brine\
&filter=from-pub-date:2020-01-01,has-abstract:true,type:journal-article\
&rows=20\
&select=DOI,title,author,type,published-print,container-title,is-referenced-by-count,abstract\
&sort=relevance\
&mailto=user@example.com"
Bibliographic search (best for matching a citation string):
curl -s "https://api.crossref.org/works?query.bibliographic=Stringfellow+2014+lithium+oilfield+brine\
&rows=5\
&select=DOI,title,author,published-print,container-title\
&mailto=user@example.com"
Step 3 -- Parse Response
Single work response (works/{doi}):
{
"status": "ok",
"message-type": "work",
"message": {
"DOI": "10.2118/1011-0046-jpt",
"title": ["Brine Management: Produced Water and Frac Flowback Brine"],
"author": [{"given": "David", "family": "Burnett", "affiliation": [...]}],
"container-title": ["Journal of Petroleum Technology"],
"type": "journal-article",
"published-print": {"date-parts": [[2011, 10, 1]]},
"is-referenced-by-count": 1,
"abstract": "...",
"link": [{"URL": "...", "content-type": "...", "intended-application": "..."}]
}
}
Search response (works?query=...):
{
"status": "ok",
"message-type": "work-list",
"message": {
"total-results": 2171593,
"items-per-page": 20,
"query": {"start-index": 0, "search-terms": "lithium produced water brine"},
"items": [
{ "DOI": "...", "title": [...], "author": [...], ... }
],
"next-cursor": "AoJ/..."
}
}
Key fields to extract per work:
| Field | Type | Notes |
|---|---|---|
DOI |
string | The DOI |
title |
array of strings | Usually one element |
author |
array of objects | Each has given, family, affiliation, optional ORCID |
container-title |
array of strings | Journal or proceedings name |
type |
string | journal-article, proceedings-article, etc. |
published-print |
object | date-parts: [[year, month, day]] |
published-online |
object | Same structure as published-print |
is-referenced-by-count |
integer | Citation count (Crossref-tracked) |
abstract |
string | May contain JATS XML markup |
link |
array of objects | Full-text links with content-type |
reference |
array of objects | Bibliography entries (if has-references) |
funder |
array of objects | Funding sources with name, DOI, award |
subject |
array of strings | Subject categories |
Step 4 -- Format Dates
The date-parts field uses variable-length arrays:
[[2024, 3, 15]]= March 15, 2024[[2024, 3]]= March 2024[[2024]]= 2024
Handle all three cases. Use the most specific date available.
Step 5 -- Clean Abstract Text
Abstracts may contain JATS XML tags (<jats:p>, <jats:italic>, etc.).
Strip XML tags before display:
echo "$ABSTRACT" | sed 's/<[^>]*>//g'
Step 6 -- Produce Output
Format: Metadata Table + Narrative
For a single DOI lookup:
## DOI: 10.2118/1011-0046-jpt
| Field | Value |
|-------|-------|
| Title | Brine Management: Produced Water and Frac Flowback Brine |
| Authors | David Burnett (Texas A&M University) |
| Journal | Journal of Petroleum Technology |
| Date | October 2011 |
| Type | journal-article |
| Citations | 1 |
| DOI Link | https://doi.org/10.2118/1011-0046-jpt |
**Abstract:** R&D Grand Challenges - This is the third in a series...
For a search:
## Crossref Search: "lithium produced water brine" (2020--present, journal articles)
Found 847 results. Showing top 10 by relevance:
| # | Year | Title | Journal | Citations | DOI |
|---|------|-------|---------|-----------|-----|
| 1 | 2023 | Direct Lithium Extraction from... | Minerals | 8 | 10.3390/... |
| 2 | 2024 | Effect of buffer on direct... | New J Chem | 3 | 10.1039/... |
| ... | ... | ... | ... | ... | ... |
**Summary:** Search returned 847 journal articles published since 2020
matching "lithium produced water brine". The most-cited result (8 citations)
covers DLE technology assessment for seawater brine. Results span journals
including Minerals, Water Research, and ACS ES&T...
Pagination
Offset-based (for small result sets, offset max 10000):
&offset=0&rows=20 (page 1)
&offset=20&rows=20 (page 2)
Cursor-based (for deep paging beyond 10000):
- First request:
&cursor=*&rows=100 - Response includes
next-cursorin message - Subsequent request:
&cursor=<next-cursor-value>&rows=100 - Continue until
itemsis empty
Warn the user if total results exceed 1000 and suggest narrowing filters.
Error Handling
| HTTP Code | Meaning | Action |
|---|---|---|
| 200 | Success | Parse normally |
| 404 | DOI not found | Report DOI not in Crossref; suggest checking doi.org directly |
| 400 | Bad request | Check filter syntax, invalid field names |
| 429 | Rate limited | Wait and retry; suggest adding mailto for polite pool |
| 503 | Service unavailable | Retry after 5 seconds |
Common issues:
- DOI contains special characters: URL-encode the DOI (
/stays as-is in path) - USGS data DOIs (e.g.,
10.5066/...) may not be in Crossref — they use DataCite. If a10.5066DOI returns 404, suggest queryinghttps://api.datacite.org/dois/{doi}instead. - Abstract contains XML: strip JATS tags before display
Cross-Skill Integration
This skill complements other PNGE data skills:
| Source Skill | How to Use Crossref |
|---|---|
usgs-produced-waters |
Resolve the dataset DOI (10.5066/P9DSRCZJ) — note: USGS DOIs may be DataCite |
usgs-minerals |
Look up publication DOIs from MCS data releases |
usgs-pubs |
Resolve DOIs returned by USGS Publications Warehouse |
doe-osti |
Resolve DOIs from OSTI technical report records |
netl-edx |
Resolve DOIs linked in EDX dataset metadata |
Workflow for DOI resolution from other skills:
- Extract DOI from the source API response
- Call
works/{doi}to get full citation metadata - If 404 and DOI prefix is
10.5066or10.2172, try DataCite instead - Present unified citation with DOI link
PNGE Research Queries
Pre-built searches for common lithium/magnesium research topics:
Lithium from produced water:
curl -s "https://api.crossref.org/works?\
query=lithium+extraction+produced+water+oilfield+brine\
&filter=from-pub-date:2020-01-01,has-abstract:true,type:journal-article\
&rows=20&sort=is-referenced-by-count&order=desc\
&select=DOI,title,author,published-print,container-title,is-referenced-by-count\
&mailto=user@example.com"
Direct lithium extraction (DLE) technology:
curl -s "https://api.crossref.org/works?\
query=direct+lithium+extraction+DLE+brine\
&filter=from-pub-date:2020-01-01,type:journal-article\
&rows=20&sort=is-referenced-by-count&order=desc\
&select=DOI,title,author,published-print,container-title,is-referenced-by-count\
&mailto=user@example.com"
Magnesium recovery from brine:
curl -s "https://api.crossref.org/works?\
query=magnesium+recovery+brine+produced+water\
&filter=from-pub-date:2018-01-01,type:journal-article\
&rows=20&sort=is-referenced-by-count&order=desc\
&select=DOI,title,author,published-print,container-title,is-referenced-by-count\
&mailto=user@example.com"
Appalachian Basin produced water geochemistry:
curl -s "https://api.crossref.org/works?\
query=Appalachian+Marcellus+produced+water+geochemistry\
&filter=from-pub-date:2015-01-01,type:journal-article\
&rows=20&sort=is-referenced-by-count&order=desc\
&select=DOI,title,author,published-print,container-title,is-referenced-by-count\
&mailto=user@example.com"
Caveats and Limitations
- Citation counts (
is-referenced-by-count) are Crossref-tracked only. They undercount compared to Google Scholar or Scopus because they only include references deposited by publishers who participate in Crossref Cited-by. - Coverage gaps: Not all publishers deposit abstracts or references. Some older works have minimal metadata. Conference papers vary widely in coverage.
- USGS/DOE data DOIs (prefixes
10.5066,10.2172) are registered with DataCite, not Crossref. A 404 for these prefixes is expected — use DataCite API. - Relevance ranking is opaque. For precise results, use
query.bibliographicwith specific author + year + title fragments rather than broadquery. - Rate limits: Public pool = ~50 req/sec shared. Polite pool (with mailto) =
significantly faster. Always include
mailtoin production use. - Abstract XML: Raw abstracts contain JATS markup. Always strip tags.
Implementation Notes
- Use
bash_toolwithcurl+jqfor API calls - Always include
mailtoparameter for polite pool access - Use
selectto minimize payload size when you only need specific fields - Default to
rows=20for search results; increase torows=100if user needs more - For DOI lookups, URL-encode the DOI but keep
/as-is in the path - See
references/api_reference.mdfor full endpoint documentation - See
references/python_example.pyfor a stdlib-only Python client