research-agent
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill processes external content with high-privilege capabilities.
- Ingestion points: The agent ingests untrusted data from the
Research questionandContextinputs, and more critically, from external web content retrieved viaFirecrawl,Perplexity, and library documentation viaNia. - Boundary markers: None present. The agent processes raw external text without delimiters or instructions to ignore embedded commands.
- Capability inventory: The skill executes shell commands via
uv run pythonand performs file write operations to thehandoff directory. - Sanitization: No sanitization or escaping of external content is documented before it is processed or written to findings.
- [Command Execution] (MEDIUM): The skill executes shell commands using
uv run pythonwhere user-controlled or externally-sourced strings (query, library, URL) are interpolated as arguments. Without rigorous shell escaping, this could lead to command injection if the agent is manipulated into including shell metacharacters in its tool calls.
Recommendations
- AI detected serious security threats
Audit Metadata