doc-to-markdown

Pass

Audited by Gen Agent Trust Hub on Apr 5, 2026

Risk Level: SAFE
Categories: COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The orchestration scripts scripts/convert.py and scripts/validate_output.py use subprocess.run to execute external command-line utilities such as pandoc, markitdown, and pdftotext. These calls pass argument lists rather than shell strings, which mitigates shell injection, although the file paths they operate on are derived from user input.
  • [EXTERNAL_DOWNLOADS]: The skill instructions and scripts recommend using uv run --with or pip install to fetch well-known packages like pymupdf4llm, markitdown, and python-docx from the public PyPI registry. This is consistent with standard development practices for managing dependencies from trusted services.
  • [PROMPT_INJECTION]: The skill exhibits a surface for indirect prompt injection (Category 8) because it converts untrusted document files (PDF, DOCX) into Markdown content that the agent then processes as part of its context.
      • Ingestion points: scripts/convert.py and scripts/extract_pdf_images.py ingest and process external document files.
      • Boundary markers: The skill does not implement explicit boundary markers or 'ignore' instructions in the output Markdown to isolate document content from agent instructions.
      • Capability inventory: The skill environment has the capability to execute shell commands and trigger subsequent tool chains (e.g., /docs-cleaner).
      • Sanitization: Content is cleaned for formatting artifacts (e.g., fixing CJK spacing, removing Pandoc attributes), but there is no sanitization or filtering for natural language instructions embedded within the source documents.
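Two of the findings above lend themselves to a short illustration: running a converter through an argument list instead of a shell string, and wrapping converted text in boundary markers so a downstream prompt can treat it as data. The following is a minimal sketch, not the skill's actual code. The function names (build_command, run_tool, wrap_untrusted) and the marker strings are hypothetical and do not appear in the audited scripts.

```python
import subprocess
from pathlib import Path

# Hypothetical marker strings; the audited skill defines no such markers.
BEGIN_MARK = "<<<BEGIN UNTRUSTED DOCUMENT CONTENT>>>"
END_MARK = "<<<END UNTRUSTED DOCUMENT CONTENT>>>"


def build_command(tool: str, source: Path, *options: str) -> list[str]:
    """Build an argument list (never a shell string) for a converter CLI."""
    # resolve() yields an absolute path, so a file named "-rf.docx" cannot be
    # mistaken for a command-line option by the tool.
    return [tool, *options, str(source.resolve())]


def run_tool(argv: list[str], timeout: float = 60.0) -> str:
    """Run argv without a shell and return its stdout as text."""
    # With a list and the default shell=False, characters such as ';' or '|'
    # in an argument are passed through literally, not interpreted.
    result = subprocess.run(
        argv, capture_output=True, text=True, check=True, timeout=timeout
    )
    return result.stdout


def wrap_untrusted(markdown: str) -> str:
    """Fence converted text so downstream prompts can treat it as data."""
    # Replace marker look-alikes inside the document so that its content
    # cannot close the fence early.
    body = markdown.replace(BEGIN_MARK, "[marker removed]")
    body = body.replace(END_MARK, "[marker removed]")
    return f"{BEGIN_MARK}\n{body}\n{END_MARK}"
```

A caller might combine these as wrap_untrusted(run_tool(build_command("pandoc", Path("report.docx"), "-t", "markdown"))), assuming pandoc is installed. Boundary markers only help if the agent's instructions tell it to treat everything between them as untrusted data; they reduce the injection surface but are not a substitute for content filtering.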
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 5, 2026, 03:21 PM