docx-processing-anthropic

Warn

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: MEDIUMPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: Misleading author metadata. The SKILL.md YAML frontmatter identifies Anthropic as the author, which conflicts with the identified author lawvable. This discrepancy falls under Category 7 (Metadata Poisoning) as it may cause misjudgment of the skill's provenance.
  • [PROMPT_INJECTION]: Indirect prompt injection attack surface (Category 8). The skill is specifically designed to ingest and process untrusted .docx files from external sources.
  • Ingestion points: office files are processed via ooxml/scripts/unpack.py and pandoc for text extraction.
  • Boundary markers: No explicit delimiters or instructions are provided to the agent to ignore instructions embedded within the extracted document content.
  • Capability inventory: Subprocess calls to soffice, pdftoppm, and git, as well as file system writing capabilities across scripts.
  • Sanitization: Partial. While core scripts use defusedxml to prevent XXE, validation scripts (ooxml/scripts/validation/redlining.py and docx.py) use standard xml.etree.ElementTree and lxml parsers.
  • [COMMAND_EXECUTION]: Local command execution for document conversion and validation. The skill utilizes subprocess.run to invoke system utilities including soffice (LibreOffice), git, and pdftoppm.
  • Evidence: pack.py and redlining.py. These commands are implemented using list-based arguments, which is a secure practice that reduces the risk of shell injection.
  • [EXTERNAL_DOWNLOADS]: Installation of well-known dependencies. SKILL.md documents the installation of pandoc, the docx npm package, and defusedxml. These are standard tools for the skill's intended purpose and are documented neutrally.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 5, 2026, 06:55 AM