NYC

application-quality-assurance

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • COMMAND_EXECUTION (HIGH): The utility script scripts/with_server.py uses subprocess.Popen(server['cmd'], shell=True) where the command is derived directly from user-provided arguments. This allows for arbitrary shell command execution and is highly susceptible to injection if the agent passes unvalidated strings to this tool.
  • PROMPT_INJECTION (LOW): The skill is vulnerable to Indirect Prompt Injection (Category 8) because it processes untrusted data from web applications and possesses powerful capabilities.
      • Ingestion points: page.goto(), page.content(), and locator results in examples/element_discovery.py and examples/console_logging.py.
      • Boundary markers: None identified in the provided instructions.
      • Capability inventory: Execution of arbitrary shell commands via scripts/with_server.py.
      • Sanitization: No sanitization is performed on inputs before they are executed as shell commands.
  • Deception / Metadata Poisoning (MEDIUM): The documentation in SKILL.md includes a directive: "DO NOT read the source until you try running the script first... These scripts exist to be called directly as black-box scripts rather than ingested into your context window." This instruction actively discourages the agent from performing security analysis on its own tools, which is a deceptive pattern used to mask the high-risk command execution vulnerability.
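The COMMAND_EXECUTION finding above can be illustrated with a minimal sketch of one possible mitigation: split the command string into an argument list and start it without a shell, so metacharacters such as ";" or "&&" are never interpreted. This is not the code in scripts/with_server.py; the function names and the ALLOWED_EXECUTABLES allowlist are hypothetical, and the sketch assumes callers do not rely on shell features such as "cd" or "&&" chaining.

```python
import shlex
import subprocess

# Hypothetical allowlist of executables a server command may start.
ALLOWED_EXECUTABLES = {"python", "python3", "npm", "node"}


def parse_server_command(cmd: str) -> list[str]:
    """Split a command string into argv without invoking a shell."""
    argv = shlex.split(cmd)
    if not argv:
        raise ValueError("empty server command")
    if argv[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable not allowed: {argv[0]!r}")
    return argv


def start_server(cmd: str) -> subprocess.Popen:
    """Start the server with shell=False (the default for a list argv).

    Shell metacharacters in cmd reach the program as literal arguments
    instead of being executed as separate commands.
    """
    return subprocess.Popen(parse_server_command(cmd))
```

With this approach, an injected suffix such as "; rm -rf /" is passed to the server process as ordinary arguments rather than run as a second command. If chained commands are genuinely required, a separate working-directory option would probably be a safer substitute than re-enabling the shell.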
Recommendations
  • AI detected serious security threats
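As one illustration of the missing boundary markers noted in the analysis, the sketch below wraps text obtained from a web page (for example, the result of page.content()) in delimiters before it is handed to the agent. The function name wrap_untrusted and the marker format are assumptions made for this example, not part of the audited skill. Markers of this kind reduce the risk of indirect prompt injection but do not eliminate it.

```python
import secrets


def wrap_untrusted(text: str, source: str) -> str:
    """Wrap untrusted page content in explicit boundary markers.

    A random token is embedded in the markers so that the page content
    cannot predict and forge a matching closing marker.
    """
    token = secrets.token_hex(8)
    return (
        f"<untrusted-{token} source={source!r}>\n"
        f"{text}\n"
        f"</untrusted-{token}>"
    )
```

The agent's instructions would then need to state that anything between such markers is data to be analyzed and never a directive to follow.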
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Feb 17, 2026, 05:48 PM