deep-research
Warn
Audited by Gen Agent Trust Hub on Apr 8, 2026
Risk Level: MEDIUM
Findings: EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill exhibits a command injection vulnerability in the way it dispatches background tasks. In Phase 2, the Lead Researcher is instructed to launch the `gemini-research.sh` script by inlining the research query directly into a Bash command inside double quotes. A query containing shell metacharacters (semicolons, backticks, `$(...)` command substitutions, or an unbalanced double quote that closes the quoted string early) is parsed by the shell rather than handed to the script as data, so a crafted query could execute arbitrary commands in the agent's environment.
- [EXTERNAL_DOWNLOADS]: The skill downloads and executes the `chromux` browser tool from the vendor's NPM registry at runtime via `npx`. It also relies on Google's `gemini-cli`, a well-known tool. Although both are vendor-provided or widely used, each involves fetching and running external code at runtime; unless the package version is pinned, the skill runs whatever is published under that name at the moment it executes.
- [PROMPT_INJECTION]: The skill has an indirect prompt injection surface because it ingests large amounts of untrusted data from the web via search results and browser snapshots.
  - Ingestion points: web results from `WebSearch`, full page content from `WebFetch`, and browser state snapshots from `chromux`.
  - Boundary markers: the instructions define no delimiters around fetched content and give the subagents no explicit directive to treat instructions found inside that content as data to be ignored.
  - Capability inventory: the skill has extensive capabilities, including executing local shell scripts (the `Bash` tool) and spawning new subagents (the `Agent` tool), so instructions planted in fetched content can reach tools that act on the local system.
  - Sanitization: no sanitization or filtering of fetched web content is performed before it is processed by the agent or subagents.
- [PROMPT_INJECTION]: The skill's 'Autopilot mode' (the `--auto` flag) is a significant autonomy risk. When enabled, it instructs the agent to 'Skip ALL user confirmations' and run the entire research pipeline end-to-end. This removes user oversight at exactly the points where it matters most, tool invocation and file writing, so an injected instruction in fetched content could be acted on before anyone reviews it.
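The command-string dispatch described under [COMMAND_EXECUTION] above, as an added example that is not part of this audit or of the skill's own code. A stub script stands in for `gemini-research.sh` (an assumption; the real script is not reproduced here) and only echoes the query it receives. The example splices a constructed query into a double-quoted command string the way the skill's Phase 2 instruction does, shows the query breaking out of the quotes, and contrasts that with passing the query as its own argv element. The `has_shell_metachars` filter is an added defence-in-depth check, not something the skill provides.

```shell
#!/usr/bin/env bash
# Added example (not the skill's code): inlining an untrusted research query
# into a shell command string vs. passing it as a separate argument.
set -u

WORKDIR="$(mktemp -d)"
trap 'rm -rf "$WORKDIR"' EXIT

# Stand-in for gemini-research.sh (assumption: the real script is not
# reproduced here). It only reports the query it received as $1.
STUB="$WORKDIR/gemini-research.sh"
cat > "$STUB" <<'EOF'
#!/bin/sh
printf 'QUERY=<%s>\n' "$1"
EOF
chmod +x "$STUB"

# Constructed attacker-controlled query: a plausible topic, then a closing
# double quote, a command separator, an extra command, and a reopening quote
# so that the surrounding command string still parses cleanly.
EVIL_QUERY='zero-day trends 2026"; echo INJECTED; echo "'

# Vulnerable pattern: the query is spliced into a command string between
# double quotes and the whole string is handed to a shell, which is what an
# agent does when it composes a Bash command from the user's words. The
# shell, not the script, interprets the metacharacters. (sh -c is used here
# for portability; the skill's Bash tool behaves the same way.)
dispatch_unsafe() {
    local cmd="\"$STUB\" \"$1\""
    sh -c "$cmd"
}

# Hardened pattern: the query is exactly one argv element. No shell parses
# its contents, so the metacharacters reach the script as plain bytes.
dispatch_safe() {
    "$STUB" "$1"
}

# Defence in depth: refuse queries carrying shell metacharacters or newlines
# before they go anywhere near a command line. Returns 0 (true) if found.
has_shell_metachars() {
    nl='
'
    case "$1" in
        *[\;\|\&\$\`\"\'\<\>\(\)\\]*|*"$nl"*) return 0 ;;
        *) return 1 ;;
    esac
}

main() {
    echo "--- unsafe dispatch (query spliced into a command string) ---"
    dispatch_unsafe "$EVIL_QUERY"
    echo "--- safe dispatch (query passed as one argv element) ---"
    dispatch_safe "$EVIL_QUERY"
    if has_shell_metachars "$EVIL_QUERY"; then
        echo "--- filter: query rejected before dispatch ---"
    fi
}
main
```

In the unsafe run the stub sees only `zero-day trends 2026` and the shell then runs the attacker's `echo INJECTED`; in the safe run the stub receives the entire query, quotes and semicolons included, as a single argument. Passing the query through an environment variable or a file the script reads would work equally well; what matters is that no shell ever re-parses text the user or a fetched page controlled.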