The Agent Skills Directory

[COMMAND_EXECUTION] (HIGH): The script scripts/with_server.py starts server commands with subprocess.Popen and shell=True, and runs the final automation command with subprocess.run. Because the agent constructs these command strings, any shell metacharacters in the server commands are interpreted by the shell, which allows arbitrary shell command execution.
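A minimal sketch of one possible mitigation, assuming the server commands do not rely on shell features such as `&&` or `cd`: tokenize the string with shlex.split and pass the resulting list to subprocess.Popen without a shell. The helper names `start_server` and `run_and_capture` are hypothetical and are not taken from with_server.py.

```python
import shlex
import subprocess
import sys


def start_server(command: str) -> subprocess.Popen:
    """Start a command without handing the string to a shell (hypothetical helper)."""
    # shlex.split tokenizes the string; characters such as ';' or '|' stay
    # literal arguments instead of being interpreted by a shell.
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")
    return subprocess.Popen(
        argv,
        shell=False,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )


def run_and_capture(command: str, timeout: float = 30.0) -> str:
    """Run the command to completion and return its combined output."""
    proc = start_server(command)
    out, _ = proc.communicate(timeout=timeout)
    return out
```

With this approach, a string such as `python -c "print('hi')" ; echo injected` runs only the Python interpreter; the `; echo injected` part arrives as ordinary arguments rather than as a second command. The trade-off is that commands which genuinely need shell syntax would have to be restructured.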
[INDIRECT_PROMPT_INJECTION] (HIGH): The skill's core purpose involves ingesting untrusted data from web applications.
Ingestion points: Data enters the context via page.content(), page.on("console", ...) (captured in console_logging.py), and page.locator().inner_text() (used in element_discovery.py).
Boundary markers: None. Nothing delimits application content from agent instructions when that content enters the context.
Capability inventory: The agent can write and execute arbitrary Python code and run shell commands via with_server.py.
Sanitization: There is no evidence of sanitization for the content read from the web application before it is processed by the agent's logic.
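The missing boundary markers noted above could look roughly like the following sketch. The helper `wrap_untrusted` and the marker format are assumptions for illustration only; nothing like this exists in the skill. Markers of this kind make the boundary explicit to the agent but do not by themselves prevent injection.

```python
import secrets


def wrap_untrusted(text: str, source: str) -> str:
    """Wrap page-derived text in a per-call boundary marker (hypothetical helper)."""
    # A random token makes it impractical for page content to forge
    # the closing marker and "escape" the untrusted region.
    token = secrets.token_hex(8)
    return (
        f"<<UNTRUSTED {source} {token}>>\n"
        f"{text}\n"
        f"<<END UNTRUSTED {token}>>"
    )
```

A caller would pass the result of page.content() or page.locator().inner_text() through such a wrapper before adding it to the agent's context, with `source` naming the ingestion point.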
[PROMPT_INJECTION] (MEDIUM): The SKILL.md file explicitly instructs the agent: 'DO NOT read the source until you try running the script first'. This encourages the agent to execute complex logic as a black box without inspecting it first, a pattern that can be used to bypass review and conceal malicious behavior.
[DATA_EXPOSURE] (LOW): The examples (console_logging.py, element_discovery.py) write application data and screenshots to /mnt/user-data/outputs/ and /tmp/. These are local paths, but if the application under test handles sensitive information, the saved data could be read by other processes or by subsequent agent sessions.
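One way to narrow this exposure, sketched here under the assumption of a POSIX system, is to write artifacts into a freshly created private directory instead of a shared path. The helper `write_private` is hypothetical and does not appear in the examples.

```python
import os
import stat
import tempfile


def write_private(name: str, data: bytes) -> str:
    """Write data to a new file that only the current user can read (hypothetical helper)."""
    # tempfile.mkdtemp creates a directory accessible only by the creating user.
    out_dir = tempfile.mkdtemp(prefix="webapp-test-")
    path = os.path.join(out_dir, name)
    # O_EXCL refuses to reuse an existing file; 0o600 limits access to the owner.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return path
```

This does not protect against processes running as the same user, and the caller remains responsible for deleting the directory once the session ends.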

webapp-testing