ios-device-automation
Pass
Audited by Gen Agent Trust Hub on Apr 1, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill uses
npxto fetch and execute the@midscene/iospackage from the npm registry. It also provides links to external documentation and model configuration resources. - [REMOTE_CODE_EXECUTION]: The use of
npx @midscene/ios@1represents the execution of remote code downloaded at runtime. While these originate from the skill's own vendor, this mechanism allows for the execution of arbitrary logic retrieved from the npm registry. - [COMMAND_EXECUTION]: The skill relies on the Bash tool to execute CLI commands for connecting to devices, interacting with WebDriverAgent, and running the Midscene automation engine.
- [DATA_EXFILTRATION]: The skill captures and processes screenshots of the iOS device. Depending on the applications being automated, these screenshots can contain highly sensitive information, including Personal Identifiable Information (PII), private messages, or credentials.
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8) because it uses vision-based AI to interpret and act upon screen content.
- Ingestion points: Screen content captured via
take_screenshotand theactcommand inSKILL.mdserves as input for the agent's decision-making process. - Boundary markers: None identified. There are no instructions or delimiters designed to help the agent distinguish between intended instructions and text visible on the device screen.
- Capability inventory: The
acttool provides high-privilege capabilities, including tapping, typing, scrolling, dragging, and system navigation (Home, App Switcher). - Sanitization: There is no evidence of sanitization or filtering of the visual data before it is processed by the AI model, allowing an attacker to potentially control the agent via text or elements displayed within an app or website.
Audit Metadata