dev-browser

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCREDENTIALS_UNSAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill makes extensive use of execSync in scripts/start-server.ts to manage processes and install software. More critically, SKILL.md instructs the agent to execute arbitrary TypeScript code by piping heredocs into npx tsx, which bypasses standard script constraints.
  • CREDENTIALS_UNSAFE (HIGH): The 'Extension Mode' described in SKILL.md and the scraping guide in references/scraping.md explicitly target the user's authenticated browser sessions. The skill provides instructions for capturing and replaying authentication headers (e.g., cookies, bearer tokens) from intercepted network requests.
  • DATA_EXFILTRATION (HIGH): By connecting to a user's active browser session and providing methods to capture auth headers and page content, the skill facilitates the exfiltration of sensitive user data to external endpoints under the guise of 'scraping'.
  • EXTERNAL_DOWNLOADS (MEDIUM): The server.sh script and scripts/start-server.ts perform runtime installations of Node.js packages and Playwright browser binaries. While these use standard registries, they occur automatically during setup without integrity verification.
  • INDIRECT_PROMPT_INJECTION (HIGH): The skill lacks boundaries when processing external web data. It ingests untrusted HTML and API responses which are then used to drive automated actions (clicking, typing, file writing), creating a high-risk vector for adversarial instructions embedded in web pages to take control of the agent's browser session.
  • Ingestion points: page.goto(), getAISnapshot(), and page.evaluate() are used to pull content from arbitrary URLs into the agent's context.
  • Boundary markers: None identified; untrusted data is processed directly as source for subsequent actions.
  • Capability inventory: File system access (fs.writeFileSync, screenshot), network access (fetch, page.on('request')), and arbitrary shell command execution.
  • Sanitization: No evidence of sanitization or validation of the ingested web content before it influences agent decision-making.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 12:36 PM