AutoGLM Phone Agent Skill
This skill lets Codex drive an Android device through the AutoGLM Phone Agent SDK: tap, type, swipe, scroll, launch apps, take screenshots, and read UI text. It is aimed at automation tasks such as end-to-end testing, data collection, or reproducing user journeys.
Prerequisites
- An Android device or emulator with developer mode and USB debugging enabled.
- `adb` available on the PATH and the device showing up in `adb devices`.
- AutoGLM Phone Agent SDK installed (see upstream docs: https://github.com/zai-org/Open-AutoGLM).
- A running Phone Agent backend (start the agent service provided by the SDK before using the skill).
Setup
- Connect the device and verify connectivity: `adb devices` should list at least one device in the `device` state.
- Follow the SDK guide to start the Phone Agent service (it typically binds to a host/port on your machine). Note the service URL.
- Expose the service URL to the agent runtime, for example by setting `PHONE_AGENT_ENDPOINT=http://127.0.0.1:5000` (adapt to your actual host/port).
- Grant the device the needed permissions (overlay/accessibility) when prompted by the SDK so that taps and text entry succeed.
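The connectivity check in the first setup step can be scripted. The following is a minimal sketch, not part of the AutoGLM SDK: the helper names `parse_adb_devices` and `ready_devices` are hypothetical, and the parsing assumes the usual `adb devices` output of one `<serial> <state>` row per device after a header line.

```python
import shutil
import subprocess


def parse_adb_devices(output: str) -> list[str]:
    """Return serials that `adb devices` reports in the `device` state."""
    ready = []
    for line in output.splitlines():
        parts = line.split()
        # Device rows look like "<serial> <state>"; the header line,
        # daemon notices, and blank lines never have "device" as the
        # second field, so they are skipped.
        if len(parts) >= 2 and parts[1] == "device":
            ready.append(parts[0])
    return ready


def ready_devices() -> list[str]:
    """Run `adb devices` and return the serials that are ready for use."""
    if shutil.which("adb") is None:
        raise RuntimeError("adb was not found on PATH")
    result = subprocess.run(
        ["adb", "devices"],
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return parse_adb_devices(result.stdout)
```

Devices listed as `unauthorized` or `offline` are deliberately excluded, since the setup step asks for at least one entry in the `device` state. An empty list from `ready_devices()` means the device is not yet usable.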
How to Use
- Describe high-level goals; the agent decomposes them into UI steps.
- Include app names or on-screen text to anchor actions (e.g., "open Settings, search for 'Wi‑Fi', toggle it off").
- Ask for confirmation screenshots when changes are risky.
Example prompts the skill handles well:
- "Open the Play Store, search for 'Signal', and share the first result link back."
- "In the Twitter app, open settings → Privacy and turn off location precision, then send me a screenshot of the toggle state."
- "Launch our test app, log in with the provided test account, and capture the purchase confirmation screen."
Outputs
- Action logs (tap/swipe/type), screenshots, and structured observations returned by the SDK.
- Errors from the backend are surfaced directly so you can troubleshoot quickly.
Troubleshooting
- If commands hang, confirm the Phone Agent service is reachable at `PHONE_AGENT_ENDPOINT` and that the port is not firewalled.
- If taps land in the wrong place, recalibrate the device resolution in the SDK or restart the accessibility service.
- If no device is detected, reconnect USB, ensure `adb` has permission, and rerun `adb devices`.
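For the first item, one way to tell whether the service is reachable is a plain TCP connection attempt to the host and port in `PHONE_AGENT_ENDPOINT`. This is a rough sketch under stated assumptions: the function names are hypothetical, the default URL is only the example value from the Setup section, and a successful connection shows that something is listening on the port, not that the Phone Agent backend is healthy.

```python
import os
import socket
from urllib.parse import urlsplit


def configured_endpoint() -> str:
    """Read PHONE_AGENT_ENDPOINT, falling back to the example value from Setup."""
    return os.environ.get("PHONE_AGENT_ENDPOINT", "http://127.0.0.1:5000")


def endpoint_host_port(url: str) -> tuple[str, int]:
    """Split an endpoint URL into (host, port), defaulting the port by scheme."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no host in endpoint URL: {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def endpoint_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint's host/port succeeds."""
    host, port = endpoint_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `endpoint_reachable(configured_endpoint())` before sending a task separates "service not running or port blocked" from problems on the device side. The URL must include a scheme such as `http://`; otherwise `endpoint_host_port` raises `ValueError`.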
Safety and Limits
- The skill executes real UI actions, so use it only on test devices or test accounts when possible.
- Avoid tasks that require biometric auth; the SDK cannot bypass hardware prompts.
- Network-dependent steps may vary by region or app version; provide explicit fallbacks when reliability matters.
Changelog
- 1.0.0: Initial publication with setup, usage guidance, and troubleshooting notes for the AutoGLM Phone Agent.
Repository: gaojizhou/skills