The Agent Skills Directory

[COMMAND_EXECUTION] (HIGH): The skill defines a Bash tool (BetaToolBash20241022) that allows the agent to execute arbitrary shell commands. This is a high-privilege capability that can be abused to modify the environment or interact with network resources.
[REMOTE_CODE_EXECUTION] (HIGH): By combining vision capabilities with a shell tool, the agent can be instructed (via indirect injection) to download and execute malicious payloads from the internet. The Dockerfile provides some isolation, but the capability remains a significant risk.
[DATA_EXFILTRATION] (MEDIUM): The core functionality involves capturing screenshots (pyautogui.screenshot, scrot) which may contain sensitive user data, credentials, or private information displayed on the screen. The skill processes this data as base64 strings.
[PROMPT_INJECTION] (HIGH): (Category 8: Indirect Prompt Injection) The skill is highly vulnerable to instructions embedded in the data it processes.
Ingestion points: Screen captures (screenshots) of potentially untrusted content like web pages or documents (SKILL.md, capture_screenshot method).
Boundary markers: None identified. There are no delimiters or explicit instructions to ignore commands found within screenshots.
Capability inventory: Full GUI control (click, type, scroll) and arbitrary shell command execution (subprocess.run, BetaToolBash20241022).
Sanitization: None. The vision model directly interprets the pixels/text from the captured screen as context for its next action.

computer-use-agents