literature-pdf-ocr-library

Warn

Audited by Snyk on Apr 19, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 0.90). This skill explicitly searches and downloads content from public third‑party sources (arXiv, Semantic Scholar, OpenAlex, Hugging Face) via discover_records/search_* in literature_lib.py and scripts/search_and_download_papers.py, then OCRs and converts those remote PDFs into Markdown (paddleocr_layout_to_markdown.py) which the agent is expected to read/ingest (scripts/build_library_index.py and ingest_literature_library.py), so untrusted external content can directly influence agent behavior.

Issues (1)

W011
MEDIUM

Third-party content exposure detected (indirect prompt injection risk).

Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 19, 2026, 01:26 PM
Issues
1