mineru-pdf-converter

Warn

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: MEDIUM
DATA_EXFILTRATION, EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, CREDENTIALS_UNSAFE, REMOTE_CODE_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [DATA_EXFILTRATION]: The skill sends local document content to the MinerU cloud API (mineru.net) for processing. While this is the core functionality, it requires transferring potentially sensitive data to a third-party service.
  • [EXTERNAL_DOWNLOADS]: Conversion results are downloaded as ZIP archives from remote URLs provided by the MinerU API.
  • [CREDENTIALS_UNSAFE]: Users are instructed to store their MinerU API token in a plain-text Markdown file (references/mineru-token.md). Storing credentials in unencrypted local files exposes them to other processes with read access to the user's home directory.
  • [REMOTE_CODE_EXECUTION]: The script scripts/mineru_convert.py uses zipfile.extractall() on the ZIP archive downloaded from the external API. This function is vulnerable to 'Zip Slip' path traversal attacks. If the remote service were compromised and returned a malicious archive containing files with path traversal names (e.g., ../../evil.sh), it could overwrite sensitive files outside the intended output directory.
  • [COMMAND_EXECUTION]: The skill executes local Python scripts to orchestrate the conversion process. These scripts utilize requests for networking and pymupdf for PDF splitting and manipulation.
  • [PROMPT_INJECTION]: The skill processes untrusted external data from PDFs and remote URLs, which may contain malicious instructions designed to influence the agent when the converted Markdown is read.
      • Ingestion points: Input files (--input) and remote URLs (--url) are processed and their contents returned to the agent.
      • Boundary markers: Absent. The converted text is not wrapped in markers to distinguish it from instructions.
      • Capability inventory: The skill has access to Bash, Read, and Write tools, and executes Python scripts.
      • Sanitization: No sanitization or filtering of the output Markdown was detected before it enters the agent's context.
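For the CREDENTIALS_UNSAFE finding, one possible alternative to a plain-text token file is to read the token from the process environment. The sketch below is illustrative only: the function name load_token and the variable name MINERU_TOKEN are assumptions, not names used by the skill.

```python
import os


def load_token(env_var="MINERU_TOKEN"):
    """Return the API token from the environment (hypothetical variable name)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail loudly rather than falling back to a token stored on disk.
        raise RuntimeError(f"{env_var} is not set")
    return token
```

An environment variable is still readable by the owning user's processes, so this narrows exposure rather than eliminating it; an OS keychain would be a stronger option.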
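For the REMOTE_CODE_EXECUTION finding, a minimal sketch of validating archive members before extraction follows. The helper name safe_extract is hypothetical and does not reflect what scripts/mineru_convert.py currently does. Note that the CPython documentation states that zipfile strips leading slashes and ".." components from member names during extraction, so the explicit check here is best read as defense in depth that rejects a suspicious archive outright instead of silently rewriting its paths.

```python
import zipfile
from pathlib import Path


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing members that resolve outside it."""
    dest = Path(dest_dir).resolve()
    with zipfile.ZipFile(zip_path) as archive:
        # Check every member first so nothing is written if any path is unsafe.
        for member in archive.infolist():
            target = (dest / member.filename).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.filename!r}")
        archive.extractall(dest)
    return dest
```

Validating all members before writing any of them means a malicious archive leaves the output directory untouched.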
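For the PROMPT_INJECTION finding, the missing boundary markers could be added along the following lines. This is a sketch under stated assumptions: the function name wrap_untrusted and the marker format are invented for illustration, and markers only help if the agent is also instructed to treat the enclosed text as data.

```python
import secrets


def wrap_untrusted(markdown, source):
    """Wrap converted text in boundary markers (hypothetical marker format)."""
    # A random tag per call, so the document cannot predict and forge the closing marker.
    tag = secrets.token_hex(8)
    return (
        f"<<<UNTRUSTED_DOCUMENT {tag} source={source!r}>>>\n"
        f"{markdown}\n"
        f"<<<END_UNTRUSTED_DOCUMENT {tag}>>>"
    )
```

Boundary markers reduce, but do not remove, the risk that instructions embedded in a PDF influence the agent.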
Audit Metadata
Risk Level: MEDIUM
Analyzed: Apr 14, 2026, 03:08 PM