doc-to-markdown
Pass
Audited by Gen Agent Trust Hub on Apr 5, 2026
Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The orchestration scripts
scripts/convert.pyandscripts/validate_output.pyutilizesubprocess.runto execute external command-line utilities such aspandoc,markitdown, andpdftotext. These calls use argument lists which mitigate shell injection risks, although they operate on file paths derived from user input. - [EXTERNAL_DOWNLOADS]: The skill instructions and scripts recommend using
uv run --withorpip installto fetch well-known packages likepymupdf4llm,markitdown, andpython-docxfrom the public PyPI registry. This is consistent with standard development practices for managing dependencies from trusted services. - [PROMPT_INJECTION]: The skill exhibits a surface for indirect prompt injection (Category 8) because it converts untrusted document files (PDF, DOCX) into Markdown content that the agent then processes as part of its context.
- Ingestion points:
scripts/convert.pyandscripts/extract_pdf_images.pyingest and process external document files. - Boundary markers: The skill does not implement explicit boundary markers or 'ignore' instructions in the output Markdown to isolate document content from agent instructions.
- Capability inventory: The skill environment has the capability to execute shell commands and trigger subsequent tool chains (e.g.,
/docs-cleaner). - Sanitization: Content is cleaned for formatting artifacts (e.g., fixing CJK spacing, removing Pandoc attributes), but there is no sanitization or filtering for natural language instructions embedded within the source documents.
Audit Metadata