bib-search-citation
Bib Search Citation
Overview
Use this skill when the user provides a .bib file and wants research-oriented retrieval rather than just a single citation key lookup. This skill is designed for large bibliographies with mixed standard and custom fields, including fields such as shorttitle, annotation, keywords, abstract, and file.
Follow this workflow:
- Identify the .bib file to use.
- If rtk is available, prefer it for exploratory steps such as locating .bib files and inspecting representative fields.
- Translate the user's request into either a JSON search spec or a compact query expression.
- Run scripts/search_bib.py on the .bib file and keep its JSON output uncompressed.
- Optionally pipe the JSON into scripts/preview_bib_search.py for a compact human-readable summary.
- Review the results and present the best matches clearly.
- Include LaTeX and/or Typst citation snippets whenever the user asks for them or would benefit from them.
Input expectations
The typical input is:
- one .bib file provided by the user
- a natural-language research query
- optional structured filters such as year range, entry type, author, DOI presence, code availability, or custom field matches
- optional compact filters such as author:cheng year>=2024 has:code type:article
- optional output preferences such as latex, typst, both, or raw BibTeX
If the user gives a natural-language request only, infer a reasonable search spec and say what assumptions you made. If the user writes a compact filter expression directly, preserve it as closely as possible instead of converting it into vague prose.
Search planning
Before running the script, map the request into a search spec.
Common spec fields
- query: free-text topic query
- filters.year_min, filters.year_max, filters.years_in, filters.exclude_years
- filters.author_contains, filters.author_excludes
- filters.type_in, filters.exclude_type_in
- filters.has, filters.exclude_has
- filters.field_contains, filters.field_excludes
- sort: relevance, year_desc, year_asc, or title
- limit: default 5 unless the user asks for more
- return_fields: fields to expose in the answer
- include_raw_bib: true when the user asks for the original entry or when exact export matters
- citation_mode: latex, typst, both, or none
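As a rough illustration, a spec built from the fields above could be assembled as a Python dictionary and serialized for --spec-json. The value shapes (lists for the filters, an integer for year_min) are assumptions; the field names come from the list above, but their exact types are not documented here.

```python
import json

# Hypothetical spec: field names are from the list above,
# value shapes (lists, ints) are assumed rather than confirmed.
spec = {
    "query": "mamba time series forecasting",
    "filters": {
        "year_min": 2024,
        "author_contains": ["Cheng"],
        "type_in": ["article", "misc"],
        "has": ["code"],
        "exclude_has": ["pdf"],
    },
    "sort": "relevance",
    "limit": 5,
    "return_fields": ["key", "title", "year", "doi"],
    "include_raw_bib": False,
    "citation_mode": "both",
}

# Compact JSON string that can be passed as a single shell argument.
spec_json = json.dumps(spec, separators=(",", ":"))
print(spec_json)
```

Writing the same dictionary to a file with json.dump gives a spec file that can be passed through --spec-file.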
Heuristics for natural-language requests
Use these defaults unless the user says otherwise:
- research discovery request -> sort: relevance
- no explicit limit -> limit: 5
- no explicit field list -> return the research-oriented default fields: key, title, shorttitle, author, year, venue, doi, eprint, keywords, annotation, abstract
- asks for "original", "full entry", or "bib" -> include_raw_bib: true
- asks for both LaTeX and Typst, or just says "citation" in a mixed writing workflow -> citation_mode: both
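The sort, limit, and field-list defaults could be applied to a partially filled spec along these lines. This is a minimal sketch for planning a search by hand; apply_defaults is a hypothetical helper, not part of search_bib.py.

```python
# Default fields as listed in the heuristics above.
DEFAULT_FIELDS = [
    "key", "title", "shorttitle", "author", "year", "venue",
    "doi", "eprint", "keywords", "annotation", "abstract",
]

def apply_defaults(spec: dict) -> dict:
    """Return a copy of spec with the documented defaults filled in.

    Keys the caller already set are left untouched.
    """
    merged = dict(spec)
    merged.setdefault("sort", "relevance")
    merged.setdefault("limit", 5)
    merged.setdefault("return_fields", list(DEFAULT_FIELDS))
    return merged

planned = apply_defaults({"query": "state space models", "limit": 10})
print(planned["sort"], planned["limit"], len(planned["return_fields"]))
```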
Compact query language
The script can parse direct query expressions inside --query, and it can also parse them when they appear inside spec.query.
Supported compact operators:
- author:cheng
- year>=2024
- year:2024 or year:2023,2024
- type:article,misc
- -type:misc
- has:code,doi
- -has:pdf
- annotation:CodeAvailable
- keywords:mamba
- sort:year_desc
- limit:10
- fields:key,title,year,doi
- cite:latex, cite:typst, or cite:both
- raw:true
Unstructured tokens that do not match the compact syntax remain part of the topic query.
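To make the split between operators and topic text concrete, here is a small sketch of how such a query might be tokenized. It is not the parser in search_bib.py: the key set is copied from the operators listed in this section, only the >= year comparison is handled, and split_compact_query is a hypothetical name.

```python
import re

# Operator keys taken from this section; the real parser may differ.
KEYS = {"author", "year", "type", "has", "annotation", "keywords",
        "sort", "limit", "fields", "cite", "raw"}
YEAR_GE = re.compile(r"^year>=(\d{4})$")

def split_compact_query(query: str):
    """Split a compact query into (operators, topic_text).

    Each operator is a tuple (key, op, value, negated).
    """
    operators, topic = [], []
    for token in query.split():
        year_match = YEAR_GE.match(token)
        if year_match:
            operators.append(("year", ">=", year_match.group(1), False))
            continue
        negated = token.startswith("-")
        body = token[1:] if negated else token
        key, sep, value = body.partition(":")
        if sep and value and key in KEYS:
            operators.append((key, ":", value, negated))
        else:
            # Anything that is not a recognized operator stays in the topic query.
            topic.append(token)
    return operators, " ".join(topic)

ops, topic = split_compact_query(
    "mamba time series author:Cheng year>=2024 -has:pdf limit:5"
)
print(ops)
print(topic)
```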
Supported has values
The script supports these useful has values:
- doi
- abstract
- keywords
- annotation
- shorttitle
- eprint
- pdf
- code
code is inferred from fields such as url, abstract, keywords, annotation, note, or howpublished that mention GitHub, GitLab, code, repository, or source.
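The code inference can be pictured as a keyword scan over those fields. The sketch below assumes plain case-insensitive substring matching, which is deliberately loose (a word such as "encoder" also contains "code"); the script's actual matching rules may be stricter. seems_to_have_code is a hypothetical name.

```python
# Fields and hint words as described above; substring matching is an assumption.
CODE_FIELDS = ("url", "abstract", "keywords", "annotation", "note", "howpublished")
CODE_HINTS = ("github", "gitlab", "code", "repository", "source")

def seems_to_have_code(entry: dict) -> bool:
    """Return True when any inspected field mentions a code-related hint."""
    for field in CODE_FIELDS:
        text = (entry.get(field) or "").lower()
        if any(hint in text for hint in CODE_HINTS):
            return True
    return False

print(seems_to_have_code({"url": "https://github.com/example/repo"}))
print(seems_to_have_code({"title": "A survey", "note": "Preprint"}))
```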
For more examples, see references/query-syntax.md.
Running the script
Run the script with a JSON spec, a spec file, or a compact query.
RTK Fast Path
If rtk is available, prefer it only for model-facing exploration:
- locate bibliography files with rtk find . -name "*.bib"
- inspect a representative slice with rtk read /path/to/library.bib -l aggressive -m 80
- confirm whether fields such as DOI, keywords, annotation, or eprint are present with rtk grep "doi|keywords|annotation|eprint" /path/to/library.bib
Keep machine-readable search results on the raw script path:
- use raw python scripts/search_bib.py ... whenever another tool or script needs JSON
- do not wrap search_bib.py output with RTK compression
- use python scripts/preview_bib_search.py only after JSON has already been produced
Inline JSON example
python scripts/search_bib.py \
--bib /path/to/library.bib \
--spec-json '{
"query": "mamba time series forecasting author:Cheng year>=2024 has:code",
"sort": "relevance",
"limit": 5,
"citation_mode": "both",
"include_raw_bib": false
}'
Compact query example
python scripts/search_bib.py \
--bib /path/to/library.bib \
--query 'mamba time series forecasting author:Cheng year>=2024 has:code type:article,misc cite:both limit:5'
Spec file example
python scripts/search_bib.py --bib /path/to/library.bib --spec-file /path/to/spec.json
Human-readable preview example
python scripts/search_bib.py \
--bib /path/to/library.bib \
--query 'mamba time series forecasting author:Cheng year>=2024 has:code type:article,misc cite:both limit:5' \
| python scripts/preview_bib_search.py
If the user uploads a .bib file into the conversation, first make sure you know its local path in the execution environment, then run the script against that file.
Output expectations
When presenting results to the user, prefer this order:
- brief summary of how many strong matches were found
- top matches with the requested research fields
- citation snippets in the requested format
- raw BibTeX only when requested or materially useful
For each selected entry, usually include:
- citation key
- title and optional shorttitle
- authors
- year and venue
- DOI and/or eprint when present
- the most relevant supporting fields for the query, such as keywords, annotation, or a short abstract excerpt
If the user asked for a compact query, it is helpful to echo the interpreted filters briefly, especially when negation or multiple field filters are involved.
When using the preview helper:
- treat it as a compact rendering of the JSON, not as a separate search engine
- keep search_bib.py as the source of truth for filtering, scoring, sorting, and citations
- do not rely on the preview output when exact raw BibTeX preservation matters
Citation formatting rules
LaTeX
When citation_mode includes latex, expose:
- \cite{key}
- \parencite{key}
- \textcite{key}
These are intended for biblatex workflows. If the user only wants one form, show only that form.
Typst
When citation_mode includes typst, expose:
- @key when the key is simple enough for shorthand usage
- #cite(<key>) when shorthand is fine
- #cite(label("key")) when the key contains characters that make shorthand fragile
If the script reports typst.needs_label = true, prefer the explicit label("...") form instead of shorthand.
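The citation rules for both formats can be summarized in a short sketch. The test for a "simple" key (letters, digits, underscores, and hyphens only) is an assumption made for illustration; search_bib.py decides needs_label by its own criteria, and citation_snippets is a hypothetical helper.

```python
import re

# Assumption: keys made only of these characters are safe for Typst shorthand.
SIMPLE_KEY = re.compile(r"^[A-Za-z0-9_-]+$")

def citation_snippets(key: str, mode: str = "both") -> dict:
    """Build citation snippets for one key; mode is latex, typst, both, or none."""
    out = {}
    if mode in ("latex", "both"):
        out["latex"] = [
            f"\\cite{{{key}}}",
            f"\\parencite{{{key}}}",
            f"\\textcite{{{key}}}",
        ]
    if mode in ("typst", "both"):
        needs_label = SIMPLE_KEY.match(key) is None
        if needs_label:
            # Fragile key: fall back to the explicit label form.
            forms = [f'#cite(label("{key}"))']
        else:
            forms = [f"@{key}", f"#cite(<{key}>)"]
        out["typst"] = {"needs_label": needs_label, "forms": forms}
    return out

print(citation_snippets("cheng2024mamba"))
print(citation_snippets("cheng:2024/mamba", mode="typst"))
```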
Result quality checks
Before answering:
- make sure the returned entries satisfy the user's explicit filters
- do not overclaim topic relevance; if results are only approximate, say so
- when several entries are similar, explain the difference briefly
- preserve raw BibTeX exactly when quoting the original entry
Error handling
Parse errors
If the .bib file contains malformed entries (unbalanced braces, encoding issues, truncated fields), the script skips those entries silently and processes the rest. When a file fails to parse entirely, check the encoding (the script assumes UTF-8) and look for obvious structural corruption such as missing closing braces.
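When a file fails to parse entirely, a quick pre-check for the two causes named above (encoding and brace balance) can narrow things down. This is a standalone sketch, not part of search_bib.py; quick_bib_check is a hypothetical helper, and it counts every brace, including escaped ones such as \{.

```python
def quick_bib_check(raw: bytes) -> list:
    """Return a list of human-readable problems found in raw .bib bytes."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 at byte {exc.start}"]

    problems = []
    depth = 0
    for line_no, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth < 0:
                    problems.append(f"unexpected closing brace on line {line_no}")
                    depth = 0  # reset so later lines are still checked
    if depth > 0:
        problems.append(f"{depth} unclosed brace(s) at end of file")
    return problems

print(quick_bib_check(b"@article{k,\n  title = {A {B} C}\n}"))
print(quick_bib_check(b"@article{k,\n  title = {A"))
```

To check a real file, pass the bytes from pathlib.Path(path).read_bytes().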
Empty result sets
When zero entries match, suggest broadening the search:
- remove has: constraints (e.g. has:code excludes many entries)
- widen the year range or drop it entirely
- use fewer or shorter topic keywords
- check author name spelling or try partial matches
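One way to apply these suggestions systematically is to relax a spec one constraint at a time and retry after each step. The sketch below assumes the spec layout from the Common spec fields section; broaden_steps is a hypothetical helper, and keeping only the first two topic words is an arbitrary choice for illustration.

```python
import copy

def broaden_steps(spec: dict):
    """Yield (description, relaxed_spec) pairs, each looser than the last."""
    current = copy.deepcopy(spec)  # never mutate the caller's spec
    filters = current.setdefault("filters", {})

    if filters.pop("has", None):
        yield "dropped has: constraints", copy.deepcopy(current)

    removed = [filters.pop(k, None) for k in ("year_min", "year_max", "years_in")]
    if any(value is not None for value in removed):
        yield "dropped year range", copy.deepcopy(current)

    words = current.get("query", "").split()
    if len(words) > 2:
        current["query"] = " ".join(words[:2])
        yield "shortened topic query", copy.deepcopy(current)

strict = {
    "query": "mamba time series forecasting",
    "filters": {"has": ["code"], "year_min": 2024},
}
for description, relaxed in broaden_steps(strict):
    print(description, relaxed)
```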
Large file performance
The script is pure Python with a linear scan and no external dependencies. For typical academic libraries (up to ~10,000 entries) it completes in seconds. For very large files (50,000+ entries), expect proportionally longer runtimes but no functional issues.
Resources
- scripts/search_bib.py: parses .bib files, applies filters, ranks results, and formats citation snippets
- scripts/preview_bib_search.py: renders search_bib.py JSON into a compact human-readable summary
- references/query-syntax.md: examples for mapping user requests into structured search specs and compact expressions