webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): The file scripts/with_server.py uses subprocess.Popen with shell=True to execute arbitrary strings provided via the --server argument. It also uses subprocess.run to execute the final command payload. This allows for arbitrary command execution on the host system.
  • [PROMPT_INJECTION] (HIGH): The SKILL.md file contains instructions that actively discourage the agent from reading the source code of the provided scripts ('DO NOT read the source until you try running the script first... They exist to be called directly as black-box scripts'). This is a deceptive pattern designed to prevent the AI from auditing the commands it is instructed to execute.
  • [INDIRECT_PROMPT_INJECTION] (HIGH): The 'Reconnaissance-Then-Action' pattern (Cat 8) described in SKILL.md has the agent navigate to potentially untrusted local or remote URLs, capture the DOM content (page.content()), and take screenshots. This untrusted data enters the agent's context and can contain malicious instructions that the agent may inadvertently follow, leading to data exfiltration or further unauthorized command execution.
  • [DATA_EXFILTRATION] (MEDIUM): The example examples/static_html_automation.py demonstrates the use of file:// URLs with Playwright. This capability allows the skill to read sensitive local files, which could be exfiltrated if the agent is subsequently directed to a malicious external site or forced to output the data.
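
To illustrate the COMMAND_EXECUTION finding, the following is a minimal sketch of how the same command string behaves when it is split into an argument list and run without a shell. It is not the code in scripts/with_server.py; the function name run_without_shell and the payload string are hypothetical and chosen only for illustration. It assumes a POSIX-style command line.

```python
import shlex
import subprocess
import sys


def run_without_shell(command: str) -> subprocess.CompletedProcess:
    """Run a command string as an argument list, with no shell involved.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    # shlex.split tokenizes the string the way a POSIX shell would,
    # but nothing interprets ';', '&&', '|' or '$(...)' afterwards.
    argv = shlex.split(command)
    return subprocess.run(
        argv, shell=False, capture_output=True, text=True, check=False
    )


# Hypothetical payload: a harmless command followed by an injected one.
# With shell=True the shell would run `echo injected` as a second command.
# Without a shell, ';', 'echo' and 'injected' are just extra arguments
# passed to the first program.
payload = (
    f"{shlex.quote(sys.executable)} "
    "-c 'import sys; print(sys.argv[1:])' ; echo injected"
)

if __name__ == "__main__":
    result = run_without_shell(payload)
    print(result.stdout.strip())
```

Splitting the string does not make an untrusted command safe to run; it only removes shell metacharacter interpretation, so chained or substituted commands are not executed as separate commands.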
Recommendations
  • AI detected serious security threats
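
As one possible mitigation for the file:// and untrusted-URL findings above, a caller could check each URL before handing it to the browser. The sketch below is an assumption, not something the skill or the audit provides: the names ALLOWED_SCHEMES, ALLOWED_HOSTS, and is_allowed_url are hypothetical, and the allowlist values are placeholders for whatever a given test setup actually needs.

```python
from urllib.parse import urlsplit

# Hypothetical allowlists; adjust to the servers under test.
ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}


def is_allowed_url(url: str) -> bool:
    """Return True only for http(s) URLs that point at an allowlisted host."""
    parts = urlsplit(url)
    # urlsplit lowercases the scheme, and .hostname is lowercased and
    # stripped of any port; it is None for URLs such as file:///etc/passwd.
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in ALLOWED_HOSTS


if __name__ == "__main__":
    for candidate in (
        "http://localhost:5173/",
        "file:///etc/passwd",
        "https://example.com/",
    ):
        print(candidate, is_allowed_url(candidate))
```

A check like this narrows where the agent can navigate, but it does not address instructions embedded in pages served from an allowlisted host.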
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 01:34 PM