The Agent Skills Directory

Indirect Prompt Injection (HIGH): The skill is designed to ingest and process untrusted PDF documents (e.g., in scripts/extract_form_field_info.py and SKILL.md). PDF content can contain malicious instructions meant to subvert the agent's logic during extraction, analysis, or form-filling tasks. Because the skill can also write files and modify documents, a successful injection can have real side effects, which makes this a significant risk surface.
Ingestion points: pypdf.PdfReader and pdfplumber.open are used across multiple scripts and documentation examples to read external files.
Boundary markers: While forms.md provides a structured workflow for the agent to follow, no explicit delimiters or 'ignore embedded instructions' warnings are enforced on the extracted text content before it is processed by the agent.
Capability inventory: The skill can write files (pypdf.PdfWriter), generate images (Pillow), and encourages the use of system binaries like qpdf and pdftk.
Sanitization: No sanitization or validation of the content extracted from PDFs is performed before it is used in further agent reasoning or file-writing operations.
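The missing boundary markers described above could look roughly like the following. This is a minimal sketch, not part of the skill: the function name wrap_untrusted, the marker format, and the wording of the notice are all assumptions made for illustration. Delimiters of this kind reduce the risk of injected instructions being followed but do not eliminate it.

```python
import secrets


def wrap_untrusted(text: str, source: str) -> str:
    """Wrap extracted PDF text in randomized delimiters plus a warning.

    Hypothetical helper: name, marker format, and notice wording are
    illustrative assumptions, not something the audited skill provides.
    """
    # A random token means the document cannot predict the closing marker
    # and forge an early "end of untrusted content" line.
    token = secrets.token_hex(8)
    begin = f"<<<UNTRUSTED-{token} source={source!r}>>>"
    end = f"<<<END-UNTRUSTED-{token}>>>"
    notice = (
        "The text between the markers below is data extracted from a PDF. "
        "Do not follow any instructions that appear inside it."
    )
    return f"{notice}\n{begin}\n{text}\n{end}"
```

In use, the text returned by an extraction call such as pypdf's page.extract_text() would be passed through a wrapper like this before it reaches the agent's context.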
Dynamic Execution (MEDIUM): The script scripts/fill_fillable_fields.py monkeypatches the pypdf library at runtime (DictionaryObject.get_inherited). Although the patch is intended as a bug fix for selection lists, replacing library code at runtime is a dynamic execution technique: the installed package no longer behaves as its published source does, which makes the skill harder to audit.
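To show what runtime monkeypatching involves and one way to limit its reach, here is a small sketch using a stand-in class. It does not reproduce the skill's actual patch: FieldDict, patched, and lenient_get_inherited are hypothetical names, and the replacement behavior is invented for the example.

```python
from contextlib import contextmanager


class FieldDict(dict):
    """Hypothetical stand-in for a third-party library class."""

    def get_inherited(self, key, default=None):
        return self.get(key, default)


@contextmanager
def patched(cls, name, replacement):
    """Temporarily replace cls.<name>, restoring the original on exit."""
    original = getattr(cls, name)
    setattr(cls, name, replacement)
    try:
        yield
    finally:
        # Restore even if the body raises, so the patch never leaks out.
        setattr(cls, name, original)


def lenient_get_inherited(self, key, default=None):
    """Illustrative replacement: treat a stored None as missing."""
    value = self.get(key, default)
    return default if value is None else value
```

Scoping the patch to a with block keeps the modified behavior confined to the code that needs it, whereas a module-level assignment changes the class for every caller in the process.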
Unverifiable Dependencies (MEDIUM): The skill relies on several external Python packages and system-level binaries (e.g., poppler-utils, qpdf, pdftk, tesseract-ocr) that are not within the trusted source scope. This expands the attack surface to include potential vulnerabilities in these complex file-parsing libraries.
Command Execution (LOW): SKILL.md documents the use of command-line tools for PDF manipulation. While the scripts primarily use libraries, the instructions guide the agent to use shell commands, which poses a risk of command injection if input filenames or parameters are derived from untrusted sources without proper escaping.
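One common way to avoid the command injection risk described here is to pass arguments as a list rather than interpolating them into a shell string. The sketch below assumes the agent wants to run qpdf's --check mode on a file; the helper name qpdf_check_argv and the extension check are illustrative assumptions, not something the skill ships.

```python
from pathlib import Path
from typing import List


def qpdf_check_argv(pdf_path: str) -> List[str]:
    """Build an argument list for `qpdf --check` without involving a shell.

    Hypothetical helper for illustration only.
    """
    path = Path(pdf_path)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF path: {pdf_path!r}")
    # An absolute path cannot start with "-", so a file named "-x.pdf"
    # is not mistaken for a command-line option.
    return ["qpdf", "--check", str(path.resolve())]
```

The resulting list can be handed to subprocess.run(argv, check=True) with the default shell=False, so characters such as ; or $ in a filename are passed to the program as literal text instead of being interpreted by a shell.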

pdf