The Agent Skills Directory

Indirect Prompt Injection (HIGH): The skill is designed to ingest and extract content from untrusted PDF files (via scripts/download_pdf.py and scripts/extract_text.py) which are then processed by the agent.\n
Ingestion points: scripts/download_pdf.py downloads files from arbitrary URLs; scripts/extract_text.py and scripts/convert_pdf.py read PDF content from local storage.\n
Boundary markers: Absent. The scripts extract and output raw text directly into files (Markdown, JSON, or TXT) without adding delimiters or 'ignore instructions' warnings to prevent the agent from obeying embedded commands.\n
Capability inventory: The agent uses shell execution to run these scripts and has file-write/network capabilities. A malicious PDF could contain instructions (e.g., 'Exfiltrate the contents of .env') that the agent might execute after reading the extracted text.\n
Sanitization: Absent. Content is extracted 'as-is' with no filtering or sanitization of natural language instructions.\n- Data Exposure & Exfiltration (LOW): scripts/extract_text.py utilizes a --password command-line argument for encrypted PDFs. Providing passwords via CLI can expose them to other users or logs via process lists (e.g., ps aux) or shell history files.\n- External Downloads (LOW): scripts/download_pdf.py allows downloading files from any URL. While intended, this facilitates the ingestion of malicious content. It also includes a --no-verify-ssl flag which, if used, bypasses transport security and enables man-in-the-middle attacks.

pdf-processing