data-extractor

Pass

Audited by Gen Agent Trust Hub on Mar 13, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: Installs several standard Python libraries for document processing (pdfplumber, python-docx, beautifulsoup4, lxml, openpyxl, pandas, pytesseract, Pillow). These are widely used and appropriate for the skill's purpose.
  • [COMMAND_EXECUTION]: Uses the system file command to identify document formats. This is a common and safe utility for file type detection. It also executes pip install to manage dependencies.
  • [PROMPT_INJECTION]: The skill processes untrusted external content from various document formats (PDF, DOCX, HTML), which presents an indirect prompt injection surface.
  • Ingestion points: Processes data from user-provided documents like document.pdf, document.docx, and document.html.
  • Boundary markers: No specific delimiters or "ignore embedded instructions" warnings are shown when processing extracted text in the provided instructions.
  • Capability inventory: The skill can execute subprocesses (pip install, file) and perform file system read/write operations.
  • Sanitization: No explicit sanitization, escaping, or validation of the extracted external content is demonstrated before it is output or potentially fed back into the agent context.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 13, 2026, 09:15 PM