The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: Installs several standard Python libraries for document processing (pdfplumber, python-docx, beautifulsoup4, lxml, openpyxl, pandas, pytesseract, Pillow). These are widely used and appropriate for the skill's purpose.
[COMMAND_EXECUTION]: Uses the system file command to identify document formats. This is a common and safe utility for file type detection. It also executes pip install to manage dependencies.
[PROMPT_INJECTION]: The skill processes untrusted external content from various document formats (PDF, DOCX, HTML), which presents an indirect prompt injection surface.
Ingestion points: Processes data from user-provided documents like document.pdf, document.docx, and document.html.
Boundary markers: No specific delimiters or "ignore embedded instructions" warnings are shown when processing extracted text in the provided instructions.
Capability inventory: The skill can execute subprocesses (pip install, file) and perform file system read/write operations.
Sanitization: No explicit sanitization, escaping, or validation of the extracted external content is demonstrated before it is output or potentially fed back into the agent context.

data-extractor