web-tests

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: CRITICAL (REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, EXTERNAL_DOWNLOADS)
Full Analysis
  • [REMOTE_CODE_EXECUTION] (CRITICAL): The skill implements a 'Universal Executor' pattern that reads arbitrary code from external sources and executes it.
  • Ingestion Points: In run.js, the function getCodeToExecute() reads input from process.argv (CLI arguments) and fs.readFileSync(0) (stdin).
  • Execution Sink: The main() function writes this raw input to a temporary file (e.g., .temp-execution-1712345678.js) and executes it using the Node.js require() function.
  • Sanitization: None. The script only wraps the code in a basic async wrapper if missing, providing zero protection against malicious logic.
  • Capability Inventory: The executed code has full access to the Node.js runtime, including the child_process module for shell access and the fs module for file system modification.
  • [COMMAND_EXECUTION] (HIGH): The skill executes shell commands during its initialization phase.
  • Evidence: The installPlaywright() function in run.js uses execSync to run npm install and npx playwright install chromium. These commands are needed for the skill to work, but because execSync runs them through a shell, they become a command-injection vector if environment variables or paths are manipulated.
  • [EXTERNAL_DOWNLOADS] (MEDIUM): The skill's setup routine downloads external binaries and packages at runtime.
  • Evidence: package.json depends on playwright, and run.js triggers the download of the Chromium browser binary via npx playwright install. While Playwright is a reputable package, the combination with the RCE vulnerability makes this a high-risk vector for downloading and executing malicious tools.
Recommendations
  • AI detected serious security threats
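The audit does not spell out a remediation. As one possible direction for the COMMAND_EXECUTION finding, the sketch below narrows the environment passed to the install step and avoids a shell. Everything here is an assumption rather than the skill's code: the names ALLOWED_ENV, buildMinimalEnv, and installChromium are hypothetical, and the allow-list is a guess at what the install step needs.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical allow-list; the variables the skill really needs are unknown.
const ALLOWED_ENV = ['PATH', 'HOME', 'USERPROFILE', 'SystemRoot'];

// Copy only allow-listed string variables from the source environment.
function buildMinimalEnv(source) {
  const env = {};
  for (const key of ALLOWED_ENV) {
    if (typeof source[key] === 'string') env[key] = source[key];
  }
  return env;
}

// Defined for illustration and not called here, since it downloads a browser.
// execFileSync takes arguments as an array and does not spawn a shell by default.
// On Windows, npx is a .cmd shim and may need different handling.
function installChromium(cwd) {
  execFileSync('npx', ['playwright', 'install', 'chromium'], {
    cwd,
    env: buildMinimalEnv(process.env),
    stdio: 'inherit',
  });
}

// Variables such as NODE_OPTIONS are dropped before the child process starts.
const sample = buildMinimalEnv({
  PATH: '/usr/bin',
  NODE_OPTIONS: '--require ./evil.js',
});
console.log(sample);
```

This narrows the environment but still trusts PATH, and it does nothing about the REMOTE_CODE_EXECUTION finding, which stems from executing caller-supplied code in the first place.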
Audit Metadata
Risk Level: CRITICAL
Analyzed: Feb 16, 2026, 09:56 AM