Playwright Browser Automation

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: CRITICAL (REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, DATA_EXFILTRATION)
Full Analysis
  • [REMOTE_CODE_EXECUTION] (CRITICAL): The file run.js uses eval() to execute arbitrary code strings passed as command-line arguments. This allows any agent using this skill to execute arbitrary JavaScript in the host environment without any sandboxing or validation.
  • [COMMAND_EXECUTION] (HIGH): The file run.js allows loading and executing arbitrary local files via require(absolutePath) where the path is provided by the user/agent. This can be used to execute malicious scripts previously written to the filesystem.
  • [DATA_EXFILTRATION] (MEDIUM): The detectDevServers function in lib/helpers.js performs port scanning on 127.0.0.1 for common development ports (3000, 8080, etc.). This provides a reconnaissance capability to map services running on the local host.
  • [DATA_EXFILTRATION] (LOW): The takeScreenshot function in lib/helpers.js saves browser screenshots to /tmp/. While this is a functional feature, writing sensitive data to a shared temporary directory can lead to data exposure.
  • [INDIRECT_PROMPT_INJECTION] (HIGH): Mandatory Evidence Chain for run.js and lib/helpers.js:
      • Ingestion points: The skill ingests data from external websites via Playwright and accepts raw code strings via process.argv.
      • Boundary markers: None. There are no delimiters or instructions to ignore embedded commands in processed web content.
      • Capability inventory: Full browser control (Playwright), local filesystem access (fs), network socket access (net), and arbitrary JS execution (eval).
      • Sanitization: None. External data and input strings are processed and executed without any escaping or validation.
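The findings on eval() and require(absolutePath) both come down to missing input validation. As a rough sketch only, and not code taken from run.js, the kind of check that could confine script loading to a single allowed directory might look like the following. The helper name resolveInside and the example directory are hypothetical, and the sketch does not resolve symlinks, so a real implementation would likely also need fs.realpathSync.

```javascript
const path = require("node:path");

// Hypothetical helper (not part of the audited skill): resolve a requested
// script path and reject anything that falls outside the allowed base directory.
function resolveInside(baseDir, requested) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, requested);
  const rel = path.relative(base, target);

  // rel is "" when the target is the base directory itself, begins with ".."
  // when the target escapes the base, and is absolute when the target sits
  // on a different drive (Windows).
  const escapes =
    rel === "" ||
    rel === ".." ||
    rel.startsWith(".." + path.sep) ||
    path.isAbsolute(rel);

  if (escapes) {
    throw new Error(`Refusing to load script outside ${base}: ${requested}`);
  }
  return target;
}
```

A caller would pass the validated result to require() instead of the raw argument, for example require(resolveInside("/srv/scripts", process.argv[2])), where /srv/scripts is an assumed location. This narrows which files can be loaded but does not make executing them safe; it only illustrates the missing path check.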
Recommendations
  • The AI audit detected serious security threats in this skill; see the findings under Full Analysis.
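For the lower-severity finding about screenshots written to /tmp/, one possible mitigation is to write into a per-run directory created with fs.mkdtempSync, which on POSIX systems is created accessible only to the owner. The sketch below is an assumption about how that could be done, not the skill's actual takeScreenshot code; the helper name saveScreenshotPrivately and the "pw-shots-" prefix are hypothetical.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper (not part of lib/helpers.js): write screenshot bytes
// into a freshly created per-run directory instead of directly into /tmp/.
function saveScreenshotPrivately(buffer, name = "screenshot.png") {
  // mkdtempSync appends random characters to the prefix and creates the directory.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "pw-shots-"));

  // basename() drops any directory components a caller might pass in the name.
  const file = path.join(dir, path.basename(name));

  // Request owner-only read/write permissions on the file (POSIX).
  fs.writeFileSync(file, buffer, { mode: 0o600 });
  return file;
}
```

With Playwright, the bytes could come from await page.screenshot(), which returns a Buffer when no path option is given. The caller remains responsible for deleting the directory once the screenshot is no longer needed.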
Audit Metadata
Risk Level: CRITICAL
Analyzed: Feb 16, 2026, 12:21 PM