docx-processing-anthropic
Warn
Audited by Gen Agent Trust Hub on Mar 5, 2026
Risk Level: MEDIUMPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
- [PROMPT_INJECTION]: Misleading author metadata. The SKILL.md YAML frontmatter identifies Anthropic as the author, which conflicts with the identified author lawvable. This discrepancy falls under Category 7 (Metadata Poisoning) as it may cause misjudgment of the skill's provenance.
- [PROMPT_INJECTION]: Indirect prompt injection attack surface (Category 8). The skill is specifically designed to ingest and process untrusted .docx files from external sources.
- Ingestion points: office files are processed via ooxml/scripts/unpack.py and pandoc for text extraction.
- Boundary markers: No explicit delimiters or instructions are provided to the agent to ignore instructions embedded within the extracted document content.
- Capability inventory: Subprocess calls to soffice, pdftoppm, and git, as well as file system writing capabilities across scripts.
- Sanitization: Partial. While core scripts use defusedxml to prevent XXE, validation scripts (ooxml/scripts/validation/redlining.py and docx.py) use standard xml.etree.ElementTree and lxml parsers.
- [COMMAND_EXECUTION]: Local command execution for document conversion and validation. The skill utilizes subprocess.run to invoke system utilities including soffice (LibreOffice), git, and pdftoppm.
- Evidence: pack.py and redlining.py. These commands are implemented using list-based arguments, which is a secure practice that reduces the risk of shell injection.
- [EXTERNAL_DOWNLOADS]: Installation of well-known dependencies. SKILL.md documents the installation of pandoc, the docx npm package, and defusedxml. These are standard tools for the skill's intended purpose and are documented neutrally.
Audit Metadata