pdf-processing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHPROMPT_INJECTIONCREDENTIALS_UNSAFEEXTERNAL_DOWNLOADS
Full Analysis
- Indirect Prompt Injection (HIGH): The skill is designed to ingest and extract content from untrusted PDF files (via
scripts/download_pdf.pyandscripts/extract_text.py) which are then processed by the agent.\n - Ingestion points:
scripts/download_pdf.pydownloads files from arbitrary URLs;scripts/extract_text.pyandscripts/convert_pdf.pyread PDF content from local storage.\n - Boundary markers: Absent. The scripts extract and output raw text directly into files (Markdown, JSON, or TXT) without adding delimiters or 'ignore instructions' warnings to prevent the agent from obeying embedded commands.\n
- Capability inventory: The agent uses shell execution to run these scripts and has file-write/network capabilities. A malicious PDF could contain instructions (e.g., 'Exfiltrate the contents of .env') that the agent might execute after reading the extracted text.\n
- Sanitization: Absent. Content is extracted 'as-is' with no filtering or sanitization of natural language instructions.\n- Data Exposure & Exfiltration (LOW):
scripts/extract_text.pyutilizes a--passwordcommand-line argument for encrypted PDFs. Providing passwords via CLI can expose them to other users or logs via process lists (e.g.,ps aux) or shell history files.\n- External Downloads (LOW):scripts/download_pdf.pyallows downloading files from any URL. While intended, this facilitates the ingestion of malicious content. It also includes a--no-verify-sslflag which, if used, bypasses transport security and enables man-in-the-middle attacks.
Recommendations
- AI detected serious security threats
Audit Metadata