Fetch Meta to KB
Core Goal
- Pull journal-article records from Crossref after a given --from-date.
- Read ISSN seed rows from journals_issn(journal, issn1).
- Insert rows into journals with ON CONFLICT (doi) DO NOTHING.
- Keep the implementation aligned with fetch_meta_to_kb.py.
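The dedup-on-insert goal above can be illustrated with a minimal sketch. It uses an in-memory SQLite database purely as a stand-in so the example runs anywhere; the real target is the database configured through the KB_ variables. The schema, the `insert_rows` helper, and the column types are assumptions for illustration, not the contents of fetch_meta_to_kb.py.

```python
import sqlite3

# Stand-in schema (assumed): only the columns this document names.
# The real journals table may use different types or constraints.
SCHEMA = """
CREATE TABLE journals (
    doi      TEXT PRIMARY KEY,
    title    TEXT,
    journal  TEXT,
    authors  TEXT,
    date     TEXT,
    abstract TEXT
)
"""

INSERT_SQL = """
INSERT INTO journals (title, doi, journal, authors, date, abstract)
VALUES (:title, :doi, :journal, :authors, :date, :abstract)
ON CONFLICT (doi) DO NOTHING
"""


def insert_rows(conn, rows):
    """Insert rows, skip DOIs that already exist, return rows actually added."""
    before = conn.total_changes
    with conn:  # commit on success, roll back on error
        conn.executemany(INSERT_SQL, rows)
    # Conflicting rows change nothing, so the delta counts new rows only.
    return conn.total_changes - before
```

Counting from the change delta rather than from a table-wide query keeps the number tied to the current run, which is what total_inserted is meant to report.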
Run Workflow
- Set database connection env vars (user-managed keys prefixed with KB_):
  - KB_DB_HOST
  - KB_DB_PORT
  - KB_DB_NAME
  - KB_DB_USER
  - KB_DB_PASSWORD
  - KB_LOG_DIR (required, log output directory)
- Run incremental fetch with a required date:
python3 scripts/fetch_meta_to_kb.py --from-date 2024-05-01
- If executing through an exec tool call, set timeout to 1800 seconds (30 minutes).
- Check logs in:
  - ${KB_LOG_DIR}/fetch-meta-to-kb-YYYYMMDD-HHMMSS.log (UTC timestamp, one file per run)
- Build user-facing summary strictly from the current run output:
  - Prefer RUN_SUMMARY_JSON emitted by fetch_meta_to_kb.py.
  - If JSON is unavailable, parse only this run's ${KB_LOG_DIR}/fetch-meta-to-kb-YYYYMMDD-HHMMSS.log.
  - total_inserted must mean rows inserted in this run (after DOI dedup), not cumulative rows in table.
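One way to pull the summary out of a run's captured output is sketched below. The exact layout of the summary line is not specified here, so the sketch assumes the marker RUN_SUMMARY_JSON is followed on the same line by a single JSON object that runs to the end of the line. The helper name `extract_run_summary` is hypothetical.

```python
import json

MARKER = "RUN_SUMMARY_JSON"  # marker name from this document; line layout is assumed


def extract_run_summary(output):
    """Return the last summary object found in this run's output, or None."""
    summary = None
    for line in output.splitlines():
        pos = line.find(MARKER)
        if pos == -1:
            continue
        start = line.find("{", pos)  # assumed: JSON object follows the marker
        if start == -1:
            continue
        try:
            parsed = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # malformed payload: treat the summary as unavailable
        if isinstance(parsed, dict):
            summary = parsed
    return summary
```

A None result is the signal to fall back to parsing this run's log file only, never to querying the table.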
Behavior Contract
- Query Crossref endpoint: https://api.crossref.org/journals/{issn}/works.
- Filter with type:journal-article,from-pub-date:<from-date>.
- Keep only items whose container-title equals target journal title (case-insensitive).
- Continue pagination with cursor until no matching items remain.
- Store fields in journals: title, doi, journal, authors, date, abstract (nullable when Crossref has no abstract).
- Reporting/announcement metrics must use current-run log/summary only.
- Do not compute announcement counts via database-wide or time-window SQL such as WHERE date >= ....
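The query, title filter, and cursor loop described above can be sketched as follows. This is an illustration under stated assumptions, not the code in fetch_meta_to_kb.py: the function names, the page size, and the injected `get_page` callable (standing in for the HTTP call so the sketch runs offline) are all hypothetical. The stop condition follows the contract literally and ends on the first page with no matching items.

```python
BASE_URL = "https://api.crossref.org/journals/{issn}/works"


def build_request(issn, from_date, cursor="*", rows=200):
    """Return (url, params) for one page of a journal's works."""
    params = {
        "filter": f"type:journal-article,from-pub-date:{from_date}",
        "cursor": cursor,  # "*" opens a Crossref cursor session
        "rows": rows,      # page size is an assumption, not from the source
    }
    return BASE_URL.format(issn=issn), params


def matches_journal(item, journal):
    """True when any container-title equals the target title, ignoring case."""
    titles = item.get("container-title") or []  # Crossref returns a list
    target = journal.casefold()
    return any(title.casefold() == target for title in titles)


def iter_matching(issn, journal, from_date, get_page):
    """Yield matching items page by page until a page has no matches.

    get_page(url, params) is a hypothetical callable that performs the
    request and returns the decoded JSON response as a dict.
    """
    cursor = "*"
    while True:
        url, params = build_request(issn, from_date, cursor)
        message = get_page(url, params)["message"]
        matched = [it for it in message.get("items", [])
                   if matches_journal(it, journal)]
        if not matched:
            break
        yield from matched
        cursor = message.get("next-cursor")
        if not cursor:
            break
```

Passing the page fetcher in as an argument keeps the pagination logic testable without network access; in real use it would wrap whatever HTTP client the script relies on.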
Scope Boundary
- Implement only Crossref incremental fetch + insert into journals.
Script
scripts/fetch_meta_to_kb.py
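
Before invoking the script, the environment and log-file conventions from Run Workflow can be checked with a small pre-flight sketch. Treating all six variables as required is an assumption (only KB_LOG_DIR is marked required explicitly), and both helper names are hypothetical.

```python
import os
from datetime import datetime, timezone
from pathlib import Path

# Variable names come from this document; treating every one as
# mandatory is an assumption.
REQUIRED_VARS = (
    "KB_DB_HOST", "KB_DB_PORT", "KB_DB_NAME",
    "KB_DB_USER", "KB_DB_PASSWORD", "KB_LOG_DIR",
)


def missing_env(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def log_path(log_dir, now=None):
    """Build the per-run log path using a UTC YYYYMMDD-HHMMSS stamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return Path(log_dir) / f"fetch-meta-to-kb-{stamp}.log"
```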
Repository
tiangong-ai/skills