pdf-toolkit

Pass

Audited by Gen Agent Trust Hub on Mar 7, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The script scripts/ocr-pdf.ts uses the tesseract.js library. By default, this library fetches OCR workers and language data (e.g., traineddata files) from external CDNs like unpkg.com or jsdelivr.net at runtime.
  • [EXTERNAL_DOWNLOADS]: Several core dependencies used in the scripts are missing from the package.json manifest. Specifically, pdf-lib (used in create-pdf.ts, merge-pdf.ts, and split-pdf.ts) and tesseract.js (used in ocr-pdf.ts) are not declared as dependencies. This results in an unverifiable environment where scripts rely on pre-existing global packages.
  • [PROMPT_INJECTION]: The skill is vulnerable to Indirect Prompt Injection (Category 8).
  • Ingestion points: scripts/extract-text.ts, scripts/ocr-pdf.ts, and scripts/extract-tables.ts ingest untrusted data from PDF files provided by the user or external sources.
  • Boundary markers: The skill does not use any delimiters (e.g., XML tags or triple quotes) or 'ignore' instructions when outputting extracted text to the agent.
  • Capability inventory: The skill possesses extensive file-writing capabilities across multiple scripts (create-pdf.ts, merge-pdf.ts, split-pdf.ts, extract-images.ts) and facilitates the execution of local TypeScript files via bun run.
  • Sanitization: No sanitization or filtering of extracted text is performed; the content is passed directly from the document to the agent's context, allowing an attacker to embed malicious instructions inside a PDF that the agent might execute.
  • [COMMAND_EXECUTION]: The skill is designed to execute local filesystem operations and generate new files based on user-controlled input paths and glob patterns in scripts/create-pdf.ts.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 7, 2026, 11:10 PM