e2e-test

Pass

Audited by Gen Agent Trust Hub on Mar 8, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill automatically installs the 'agent-browser' package globally using 'npm install -g' and downloads browser-related system dependencies. While attributed to Vercel in the description, the package source is not restricted to a specific trusted registry URL.
  • [COMMAND_EXECUTION]: The skill executes multiple shell commands to check the operating system, manage background processes ('npm run dev &'), and interact with databases using command-line tools like 'psql' and 'sqlite3'.
  • [REMOTE_CODE_EXECUTION]: For databases other than Postgres or SQLite, the instructions direct the agent to 'write a small ad hoc script in the application's language, run it, then delete it', which constitutes the generation and execution of dynamic code at runtime.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect injection as it analyzes and acts upon untrusted codebase content. * Ingestion points: Codebase structure, database schemas, and application logic are researched by three parallel sub-agents. * Boundary markers: There are no explicit delimiters or specific instructions provided to the sub-agents to ignore or isolate potentially malicious instructions embedded in the code or comments of the application being tested. * Capability inventory: The agent can execute shell commands, perform database queries, and directly modify source files ('Fix the code — make the correction directly'). * Sanitization: The skill does not perform sanitization or validation of the code content before using it to derive testing steps or logic.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 8, 2026, 06:47 PM