docx-official
Warn
Audited by Gen Agent Trust Hub on Feb 28, 2026
Risk Level: MEDIUMPROMPT_INJECTIONCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
- [PROMPT_INJECTION]: The skill processes untrusted user-provided .docx files and presents extracted content to the agent, creating an Indirect Prompt Injection surface. Malicious documents could contain instructions intended to hijack agent behavior or extract sensitive information. Mandatory evidence chain: 1. Ingestion points: ooxml/scripts/unpack.py and the pandoc markdown conversion workflow. 2. Boundary markers: Absent; text is extracted and presented without explicit delimiters. 3. Capability inventory: Execution of subprocesses (soffice, pandoc, git, pdftoppm) and file system read/write access. 4. Sanitization: defusedxml is used for some XML parsing, but extracted natural language content is not sanitized before being processed by the agent.
- [DATA_EXFILTRATION]: Validation scripts in the ooxml/scripts/validation/ directory (base.py, docx.py, pptx.py) utilize the lxml library to parse XML files from unpacked documents without disabling external entity resolution. This creates a potential XML External Entity (XXE) vulnerability that could be leveraged to read sensitive local files if a malicious document is processed. While the main document library correctly uses defusedxml, the validation logic remains an exposure point.
- [COMMAND_EXECUTION]: The skill invokes external system binaries such as LibreOffice (soffice), pandoc, and git via subprocess.run to perform document conversions and comparisons. Although these operations are core to the skill's functionality, executing these tools on untrusted user data presents a security risk if the tools themselves contain unpatched vulnerabilities.
Audit Metadata