Desktop Computer Automation
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONDATA_EXFILTRATIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [REMOTE_CODE_EXECUTION] (HIGH): The skill utilizes
npx @midscene/computer@1, which downloads and executes code directly from the npm registry at runtime. This introduces a supply chain risk where a compromised package or registry could lead to arbitrary code execution on the host system. - [COMMAND_EXECUTION] (HIGH): The skill requires the
Bashtool and uses it to invoke complex system-level interactions. It encourages modifying the systemPATHand potentially bypassing security prompts (like Accessibility permissions) to allow the agent full control over the operating system's UI and shell. - [DATA_EXFILTRATION] (MEDIUM): The
take_screenshotcommand captures the entire desktop. This visual data, which may contain credentials, private documents, or session tokens, is sent to external AI model providers (Google, OpenRouter, Volcengine) for processing. While necessary for the skill's function, it constitutes a high-risk data exposure path. - [PROMPT_INJECTION] (LOW): The skill is vulnerable to Indirect Prompt Injection. Because the agent 'sees' the screen to decide its next action, malicious text on a website or in a document could manipulate the agent's behavior.
- Ingestion points: Screen state captured via
take_screenshotand interpreted byactcommands. - Boundary markers: None. There are no instructions to the vision model to ignore text-based commands found within images.
- Capability inventory: Full GUI control (click, type), file system access via
Bash, and network access via CLI tools. - Sanitization: None. Raw visual input is passed directly to the AI inference engine.
Recommendations
- AI detected serious security threats
Audit Metadata