literature-pdf-ocr-library

Pass

Audited by Gen Agent Trust Hub on Apr 19, 2026

Risk Level: SAFE
Findings: COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/ingest_literature_library.py orchestrates the internal workflow by calling subprocess.run() with a list of arguments to execute other Python scripts within the skill directory. Passing arguments as a list is a safe practice that prevents shell injection.
  • [EXTERNAL_DOWNLOADS]: The skill fetches metadata and PDF documents from established academic sources including arxiv.org, api.semanticscholar.org, api.openalex.org, and huggingface.co.
  • [EXTERNAL_DOWNLOADS]: The scripts/paddleocr_layout_to_markdown.py script sends PDF data to an external API at aistudio-app.com for layout analysis and OCR processing. This is a legitimate data flow associated with the skill's primary purpose.
  • [PROMPT_INJECTION]: The skill ingests untrusted external content from scientific papers. This creates a surface for indirect prompt injection: malicious instructions embedded in a paper's text could influence the agent when it reads the converted Markdown output.
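
To illustrate the COMMAND_EXECUTION finding above, here is a minimal sketch of why a list of arguments passed to subprocess.run() resists shell injection. It is not the skill's actual code: the helper names build_command and echo_argument, and the script path in the usage note, are hypothetical and chosen only for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script: Path, *args: str) -> list[str]:
    """Build an argv list that runs `script` with the current interpreter.

    Hypothetical helper: each argument stays a separate list element,
    so no shell ever parses it.
    """
    return [sys.executable, str(script), *args]


def echo_argument(value: str) -> str:
    """Run a child Python process that prints its first argument back.

    With a list and the default shell=False, `value` reaches the child
    as exactly one argv entry, even if it contains shell metacharacters.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.rstrip("\r\n")
```

For example, build_command(Path("scripts/some_step.py"), "paper.pdf; echo injected") yields a three-element list whose last element is the literal string "paper.pdf; echo injected", and echo_argument returns that same string unchanged. The "; echo injected" part is never interpreted as a second command, which is the property the audit credits.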
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 19, 2026, 01:26 PM