content-extract
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill's primary function is to ingest untrusted data from external URLs and convert it into Markdown for the agent to process.
- Ingestion points: The
--urlargument inscripts/content_extract.pyand subsequent fetching of web content. - Boundary markers: Absent. The extracted content is returned as a raw string in a JSON field without delimiters or instructions to the agent to treat it as untrusted data.
- Capability inventory: The agent uses this skill to inform its reasoning and subsequent actions based on the content of external websites.
- Sanitization: No sanitization or filtering is performed on the extracted Markdown content to prevent embedded instructions from being interpreted by the LLM.
- [Dynamic Execution] (MEDIUM): The script
scripts/content_extract.pydynamically determines the path of the executable script it runs. - Evidence: The
_find_mineru_wrapperfunction checks theMINERU_WRAPPER_PATHenvironment variable. If an attacker can influence the environment, they can redirect the subprocess call to an arbitrary executable. - [Command Execution] (LOW): The skill uses
subprocess.runto execute local scripts. - Evidence:
subprocess.run(cmd, ...)is used inscripts/content_extract.py. While it uses a list format (preventing shell injection), it relies on the presence and integrity of a sibling skill (mineru-extract).
Recommendations
- AI detected serious security threats
Audit Metadata