research-agent

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [Indirect Prompt Injection] (HIGH): The skill processes external content with high-privilege capabilities.
  • Ingestion points: The agent ingests untrusted data from the Research question and Context inputs, and more critically, from external web content retrieved via Firecrawl, Perplexity, and library documentation via Nia.
  • Boundary markers: None present. The agent processes raw external text without delimiters or instructions to ignore embedded commands.
  • Capability inventory: The skill executes shell commands via uv run python and performs file write operations to the handoff directory.
  • Sanitization: No sanitization or escaping of external content is documented before it is processed or written to findings.
  • [Command Execution] (MEDIUM): The skill executes shell commands using uv run python where user-controlled or externally-sourced strings (query, library, URL) are interpolated as arguments. Without rigorous shell escaping, this could lead to command injection if the agent is manipulated into including shell metacharacters in its tool calls.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 08:32 AM