e2e-test
Warn
Audited by Gen Agent Trust Hub on Feb 26, 2026
Risk Level: MEDIUMEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONDATA_EXFILTRATIONSAFEPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill automatically installs the
agent-browserpackage globally usingnpm install -g agent-browser. It also executesagent-browser install --with-deps, which downloads system-level dependencies (Chromium) for browser automation. These resources are associated with Vercel, a well-known and trusted organization. - [DYNAMIC_EXECUTION]: In Phase 4b, the skill is instructed to 'write a small ad hoc script in the application's language, run it, then delete it' for database validation. This represents dynamic code generation and execution based on the agent's interpretation of the codebase.
- [COMMAND_EXECUTION]: The skill executes several powerful system commands including
npm run devin the background,psqlfor Postgres interactions, andsqlite3for SQLite databases. It also uses theagent-browserCLI to interact with the local environment. - [DATA_EXFILTRATION]: The skill accesses sensitive information such as database schemas and connection strings (via
.env.example). While primarily used for local validation, the ability to read and query the database is a high-privilege operation. - [PROMPT_INJECTION]: The skill uses sub-agents (Sub-agent 1, 2, and 3) to research the codebase for 'logic errors', 'security concerns', and 'user journeys'. This creates a surface for indirect prompt injection if the codebase being analyzed contains malicious instructions designed to influence the agent's testing logic.
- Ingestion points: Sub-agents read the entire codebase, including configuration files, source code, and READMEs.
- Boundary markers: Absent; there are no specific delimiters or instructions to ignore embedded commands within the analyzed code.
- Capability inventory: The skill can install software, run background processes, execute database queries, and generate/run ad-hoc scripts.
- Sanitization: Absent; the sub-agent outputs are used directly to create the task list and execution steps.
Audit Metadata