pdf-processing

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHPROMPT_INJECTIONCREDENTIALS_UNSAFEEXTERNAL_DOWNLOADS
Full Analysis
  • Indirect Prompt Injection (HIGH): The skill is designed to ingest and extract content from untrusted PDF files (via scripts/download_pdf.py and scripts/extract_text.py) which are then processed by the agent.\n
  • Ingestion points: scripts/download_pdf.py downloads files from arbitrary URLs; scripts/extract_text.py and scripts/convert_pdf.py read PDF content from local storage.\n
  • Boundary markers: Absent. The scripts extract and output raw text directly into files (Markdown, JSON, or TXT) without adding delimiters or 'ignore instructions' warnings to prevent the agent from obeying embedded commands.\n
  • Capability inventory: The agent uses shell execution to run these scripts and has file-write/network capabilities. A malicious PDF could contain instructions (e.g., 'Exfiltrate the contents of .env') that the agent might execute after reading the extracted text.\n
  • Sanitization: Absent. Content is extracted 'as-is' with no filtering or sanitization of natural language instructions.\n- Data Exposure & Exfiltration (LOW): scripts/extract_text.py utilizes a --password command-line argument for encrypted PDFs. Providing passwords via CLI can expose them to other users or logs via process lists (e.g., ps aux) or shell history files.\n- External Downloads (LOW): scripts/download_pdf.py allows downloading files from any URL. While intended, this facilitates the ingestion of malicious content. It also includes a --no-verify-ssl flag which, if used, bypasses transport security and enables man-in-the-middle attacks.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 12:35 AM