NYC

webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The script scripts/with_server.py uses subprocess.Popen with shell=True to execute strings provided via the --server command-line argument. This allows for arbitrary command execution and is highly vulnerable to injection.
  • PROMPT_INJECTION (MEDIUM): The SKILL.md file includes instructions designed to prevent security review, specifically telling the agent to treat scripts as black boxes and 'DO NOT read the source'. This is a deceptive pattern intended to hide behavior from the AI's internal reasoning.
  • PROMPT_INJECTION (LOW): The skill is vulnerable to indirect prompt injection due to its use of Playwright to ingest untrusted web data. Evidence: 1. Ingestion points: page.content() and page.locator() in SKILL.md and examples/element_discovery.py. 2. Boundary markers: None present. 3. Capability inventory: Shell command execution via scripts/with_server.py. 4. Sanitization: None present.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 06:23 PM