NYC

Android Device Automation

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: HIGHEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [Remote Code Execution / External Downloads] (HIGH): The skill utilizes npx @midscene/android@1 to download and execute code from the npm registry during execution. This introduces a significant supply chain risk as the dependency is not pinned to a specific hash and originates from an untrusted source outside the defined scope of verified providers.
  • [Indirect Prompt Injection] (HIGH): The skill possesses a high-risk profile for indirect prompt injection due to its core operational loop. 1. Ingestion points: Visual screenshots of the Android device screen are processed by an LLM via Midscene. 2. Boundary markers: There are no instructions or delimiters defined to separate the agent's task from untrusted text or elements found within the screenshots. 3. Capability inventory: The skill has the ability to perform UI actions (taps, swipes, text input) and interact with the device via ADB. 4. Sanitization: There is no evidence of sanitization or validation of the screen content before it influences the LLM's next action. An attacker could display malicious instructions on the device screen to hijack the automation workflow.
  • [Command Execution] (MEDIUM): The skill requires the Bash tool to execute npx and ADB commands. While functional, the lack of restriction on the commands passed to the shell increases the impact of any successful prompt injection attack.
  • [Credentials Unsafe] (LOW): The skill requires the configuration of MIDSCENE_MODEL_API_KEY. While necessary for the LLM-driven vision features, it creates a surface for potential credential exposure if environment variables are not managed securely within the host environment.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 15, 2026, 10:54 PM