mineru-extract

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: HIGHPROMPT_INJECTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATION
Full Analysis
  • [Indirect Prompt Injection] (HIGH): The skill is designed to fetch content from external, attacker-controllable URLs (like WeChat articles or PDFs) and convert them into Markdown for the agent to process.
  • Ingestion points: Untrusted data enters via URLs provided to the file-sources argument in scripts/mineru_parse_documents.py.
  • Boundary markers: There are no specific boundary markers or instructions mentioned to isolate the external content from the agent's instructions.
  • Capability inventory: The parsed Markdown is returned to the agent, which can then be used to influence downstream actions, especially if the agent has tool-execution capabilities.
  • Sanitization: The documentation does not mention any sanitization or filtering of the extracted content before it is presented to the agent.
  • [External Downloads] (MEDIUM): The skill downloads a result.zip from the MinerU API and automatically extracts its contents to the local workspace.
  • Evidence: SKILL.md states: 'The script always downloads + extracts the MinerU result zip to: /home/node/.openclaw/workspace/mineru/<task_id>/'.
  • Risk: Without seeing the extraction logic in scripts/mineru_extract.py, there is a risk of Path Traversal (Zip Slip) if the archive contains malicious file paths targeting locations outside the intended directory.
  • [Data Exfiltration] (LOW): The skill sends user-provided URLs and the content of those URLs to the MinerU third-party service (https://mineru.net) for processing.
  • Evidence: The configuration requires a MINERU_TOKEN and sets MINERU_API_BASE=https://mineru.net.
  • Risk: Sensitive information contained within the target URLs or the user's parsing requests is exposed to the service provider.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 15, 2026, 01:26 PM