data-extractor
Pass
Audited by Gen Agent Trust Hub on Mar 13, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: Installs several standard Python libraries for document processing (pdfplumber, python-docx, beautifulsoup4, lxml, openpyxl, pandas, pytesseract, Pillow). These are widely used and appropriate for the skill's purpose.
- [COMMAND_EXECUTION]: Uses the system
filecommand to identify document formats. This is a common and safe utility for file type detection. It also executes pip install to manage dependencies. - [PROMPT_INJECTION]: The skill processes untrusted external content from various document formats (PDF, DOCX, HTML), which presents an indirect prompt injection surface.
- Ingestion points: Processes data from user-provided documents like document.pdf, document.docx, and document.html.
- Boundary markers: No specific delimiters or "ignore embedded instructions" warnings are shown when processing extracted text in the provided instructions.
- Capability inventory: The skill can execute subprocesses (pip install, file) and perform file system read/write operations.
- Sanitization: No explicit sanitization, escaping, or validation of the extracted external content is demonstrated before it is output or potentially fed back into the agent context.
Audit Metadata