pdf

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: HIGHEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [Indirect Prompt Injection] (HIGH): The skill is designed to extract text, tables, and metadata from PDF documents using pypdf and pdfplumber, which exposes the agent to instructions embedded in untrusted documents.
  • Ingestion points: PDF file reading occurs in SKILL.md, scripts/extract_form_field_info.py, and scripts/fill_fillable_fields.py. User-controlled JSON is ingested in scripts/check_bounding_boxes.py and scripts/fill_pdf_form_with_annotations.py.
  • Boundary markers: No explicit delimiters or instructions are provided to the agent to ignore or isolate content extracted from the PDFs.
  • Capability inventory: The skill can write files (pypdf, PIL, pdfplumber), perform OCR (pytesseract), and convert documents to images (pdf2image). These capabilities, combined with untrusted input, create a high-risk surface.
  • Sanitization: Extracted text and data are processed and outputted to the agent without sanitization or escaping.
  • [Unverifiable Dependencies] (MEDIUM): Documentation in SKILL.md and ocr.md encourages the installation of multiple third-party Python packages (pytesseract, pdf2image, pdfplumber) and system-level binaries (tesseract-ocr, poppler-utils). While these are common tools, they expand the attack surface and rely on external maintainers.
  • [Dynamic Execution] (MEDIUM): The script scripts/fill_fillable_fields.py performs runtime monkeypatching of the pypdf library (DictionaryObject.get_inherited). While intended to fix a specific bug (pypdf #2084), modifying library behavior at runtime can lead to unpredictable side effects or be exploited if the patch logic is flawed.
  • [Privilege Escalation] (LOW): The ocr.md file suggests using sudo apt-get for dependency installation. While common for setup, instructions for AI agents involving elevated privileges should be handled with caution.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 15, 2026, 11:47 PM