mineru-extract
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGHPROMPT_INJECTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill is designed to fetch content from external, attacker-controllable URLs (like WeChat articles or PDFs) and convert them into Markdown for the agent to process.
- Ingestion points: Untrusted data enters via URLs provided to the
file-sourcesargument inscripts/mineru_parse_documents.py. - Boundary markers: There are no specific boundary markers or instructions mentioned to isolate the external content from the agent's instructions.
- Capability inventory: The parsed Markdown is returned to the agent, which can then be used to influence downstream actions, especially if the agent has tool-execution capabilities.
- Sanitization: The documentation does not mention any sanitization or filtering of the extracted content before it is presented to the agent.
- [External Downloads] (MEDIUM): The skill downloads a
result.zipfrom the MinerU API and automatically extracts its contents to the local workspace. - Evidence:
SKILL.mdstates: 'The script always downloads + extracts the MinerU result zip to: /home/node/.openclaw/workspace/mineru/<task_id>/'. - Risk: Without seeing the extraction logic in
scripts/mineru_extract.py, there is a risk of Path Traversal (Zip Slip) if the archive contains malicious file paths targeting locations outside the intended directory. - [Data Exfiltration] (LOW): The skill sends user-provided URLs and the content of those URLs to the MinerU third-party service (
https://mineru.net) for processing. - Evidence: The configuration requires a
MINERU_TOKENand setsMINERU_API_BASE=https://mineru.net. - Risk: Sensitive information contained within the target URLs or the user's parsing requests is exposed to the service provider.
Recommendations
- AI detected serious security threats
Audit Metadata