Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGHEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill is designed to extract text, tables, and metadata from PDF documents using
pypdfandpdfplumber, which exposes the agent to instructions embedded in untrusted documents. - Ingestion points: PDF file reading occurs in
SKILL.md,scripts/extract_form_field_info.py, andscripts/fill_fillable_fields.py. User-controlled JSON is ingested inscripts/check_bounding_boxes.pyandscripts/fill_pdf_form_with_annotations.py. - Boundary markers: No explicit delimiters or instructions are provided to the agent to ignore or isolate content extracted from the PDFs.
- Capability inventory: The skill can write files (
pypdf,PIL,pdfplumber), perform OCR (pytesseract), and convert documents to images (pdf2image). These capabilities, combined with untrusted input, create a high-risk surface. - Sanitization: Extracted text and data are processed and outputted to the agent without sanitization or escaping.
- [Unverifiable Dependencies] (MEDIUM): Documentation in
SKILL.mdandocr.mdencourages the installation of multiple third-party Python packages (pytesseract,pdf2image,pdfplumber) and system-level binaries (tesseract-ocr,poppler-utils). While these are common tools, they expand the attack surface and rely on external maintainers. - [Dynamic Execution] (MEDIUM): The script
scripts/fill_fillable_fields.pyperforms runtime monkeypatching of thepypdflibrary (DictionaryObject.get_inherited). While intended to fix a specific bug (pypdf #2084), modifying library behavior at runtime can lead to unpredictable side effects or be exploited if the patch logic is flawed. - [Privilege Escalation] (LOW): The
ocr.mdfile suggests usingsudo apt-getfor dependency installation. While common for setup, instructions for AI agents involving elevated privileges should be handled with caution.
Recommendations
- AI detected serious security threats
Audit Metadata