The Agent Skills Directory

[PROMPT_INJECTION]: The skill is designed to extract content from PDF documents, which constitutes an indirect prompt injection surface. Content within a processed PDF could contain malicious instructions intended to bypass agent safeguards or influence downstream actions.
Ingestion points: scripts/extract-text.ts, scripts/extract-tables.ts, and scripts/ocr-pdf.ts read and output text from user-provided PDF files.
Boundary markers: The scripts do not encapsulate extracted content in delimiters or provide explicit instructions to the agent to treat the data as untrusted.
Capability inventory: The skill provides tools for file system manipulation and script execution via the Bun runtime.
Sanitization: No sanitization or filtering is performed on the text extracted from the PDF files.
[EXTERNAL_DOWNLOADS]: The scripts/ocr-pdf.ts script utilizes the tesseract.js library for OCR tasks. This library dynamically fetches its OCR worker and language models from public CDNs like unpkg.com during runtime. This is a standard and well-documented behavior for the library, utilizing established technology infrastructure.

pdf-toolkit