computer-use-remote
Computer Use Remote
When to Use
Load this skill before using computer_use_remote for local desktop and native UI tasks on the connected machine.
If the task is browser-only and the user is flexible, prefer direct browser tooling because it is usually more reliable and token-efficient than screenshot-driven desktop control.
Core Loop
- Call `start_session` first.
- Decide from the latest screenshot, not from memory.
- Interactive actions (`move`, `click`, `scroll`, `key`, `type`) already attach a fresh screenshot after they run.
- Use `status` for state without starting a session.
- Use `capture` only when you need another screenshot without taking an action.
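The loop above can be sketched roughly as follows. `FakeRemote`, its method signatures, and the screenshot strings are hypothetical stand-ins invented for illustration; the real `computer_use_remote` call shape may differ, so treat this as a sketch of the control flow only.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRemote:
    """Hypothetical stand-in for the tool; NOT the real computer_use_remote API."""
    active: bool = False
    frame: int = 0
    log: list = field(default_factory=list)

    def status(self) -> dict:
        # Reports state without starting a session.
        return {"active": self.active}

    def start_session(self) -> str:
        self.active = True
        return self._shot("start_session")

    def act(self, name: str, **args) -> str:
        # Interactive actions hand back a fresh screenshot on their own.
        if not self.active:
            raise RuntimeError("call start_session first")
        return self._shot(name)

    def capture(self) -> str:
        # Extra screenshot with no action taken.
        return self._shot("capture")

    def _shot(self, source: str) -> str:
        self.frame += 1
        self.log.append(source)
        return f"screenshot-{self.frame}"


def run(remote: FakeRemote, plan: list) -> str:
    """Drive the loop, always carrying forward the newest screenshot."""
    latest = remote.start_session()
    for name, args in plan:
        # A real agent would inspect `latest` here before choosing the action.
        latest = remote.act(name, **args)
    return latest


remote = FakeRemote()
final = run(remote, [("key", {"keys": "ctrl+l"}), ("type", {"text": "example.org"})])
```

Note that `capture` is never called in this run: each action already supplies the screenshot the next decision is based on.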
Operating Rules
- Only the latest screenshot or a definitive tool result counts as evidence.
- The current API uses normalized global screen coordinates; do not assume window ids, element indexes, background-safe input, or semantic click targets unless the runtime explicitly advertises them.
- Prefer accessibility and semantic UI paths first: shortcuts, command palettes, menu accelerators, address/search bars, focus traversal, and other keyboard-accessible controls.
- Prefer `key` and `type` over pointer actions whenever a reliable keyboard path exists.
- When a menu or popup is open, treat it as the active UI and prefer keyboard navigation over clicking small transient rows by coordinate.
- If a click dismisses a menu or popup without producing the expected next UI, treat that attempt as failed.
- If the same approach has already failed twice without visible progress, switch strategy instead of repeating it.
- Do not infer focus or task completion from chat logs, sidebars, tool summaries, or status text.
- For browser-navigation tasks done through this tool, only claim success if the browser content area visibly shows the destination page or result.
- If the attached screenshot appears unchanged after a state-changing action, use one explicit `capture` to verify before repeating the same action.
- Use `type(..., submit=true)` only for URL or navigation-style entry where Enter should fire immediately after typing.
- Do not use `submit=true` for ordinary text fields. Type first, then send `enter` separately if needed.
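Two of the rules above lend themselves to a small sketch: switching strategy after two failed attempts, and keeping `submit=true` for navigation-style entry only. The helper names, the action dictionaries, and the `keys` field are assumptions made for illustration, not the tool's actual schema.

```python
from typing import Callable, Optional

MAX_ATTEMPTS_PER_STRATEGY = 2  # mirrors "failed twice" in the rules above


def run_with_fallback(strategies: list[tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Try each strategy at most twice, then move on to the next one.

    Each callable is hypothetical: it performs one attempt and returns True
    only when the latest screenshot shows the expected result.
    """
    for name, attempt in strategies:
        for _ in range(MAX_ATTEMPTS_PER_STRATEGY):
            if attempt():
                return name
    return None


def text_entry_actions(text: str, navigation: bool, press_enter: bool = False) -> list[dict]:
    """Build the action list for typing text (assumed action shape)."""
    if navigation:
        # URL or navigation-style entry: Enter fires right after typing.
        return [{"action": "type", "text": text, "submit": True}]
    actions = [{"action": "type", "text": text, "submit": False}]
    if press_enter:
        # Ordinary field: Enter is a separate, deliberate key action.
        actions.append({"action": "key", "keys": "enter"})
    return actions


calls = {"shortcut": 0, "command_palette": 0}


def try_shortcut() -> bool:
    calls["shortcut"] += 1
    return False  # simulated: no visible progress


def try_command_palette() -> bool:
    calls["command_palette"] += 1
    return True  # simulated: expected UI appeared


winner = run_with_fallback([
    ("shortcut", try_shortcut),
    ("command_palette", try_command_palette),
])
```

Here the simulated shortcut is abandoned after its second failure, and the command palette succeeds on its first attempt.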
Pointer And Scrolling
- Try keyboard scrolling first: `page_down`, `page_up`, `space`, `shift+space`, arrows, `home`, or `end`.
- Use `scroll` when the desired pane is already active or keyboard scrolling cannot target it.
- Treat `move` and `click` as last-resort actions for controls that cannot be reached through keyboard, accessibility, browser, or app-native tooling.
- Before clicking, make sure the latest screenshot makes the target unambiguous. Use one deliberate click, then reassess from the fresh screenshot.
Control Signals
- Treat user interventions as high-priority control signals.
- If the user says `stop`, `pause`, `abort`, `hold`, `don't continue`, or equivalent, halt immediately and do not use computer-use tools again until the user explicitly resumes.
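One way to picture the halt rule is a small gate checked before every tool call. This is a keyword-matching sketch only: the `ControlGate` class and the resume phrases are assumptions, and real intent detection should be more careful than this. The sketch deliberately errs toward halting.

```python
import re

HALT_PHRASES = ("stop", "pause", "abort", "hold", "don't continue")
RESUME_PHRASES = ("resume", "continue", "go ahead")  # assumed; not from the skill text


def _matches(text: str, phrases: tuple) -> bool:
    # Whole-word match so "stopwatch" does not count as "stop".
    return any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases)


class ControlGate:
    """Hypothetical gate: blocks tool use after a halt until an explicit resume."""

    def __init__(self) -> None:
        self.halted = False

    def observe(self, user_message: str) -> None:
        text = user_message.lower().replace("\u2019", "'")
        # Halt phrases are checked first, so "don't continue" never resumes.
        if _matches(text, HALT_PHRASES):
            self.halted = True
        elif self.halted and _matches(text, RESUME_PHRASES):
            self.halted = False

    def may_use_tools(self) -> bool:
        return not self.halted


gate = ControlGate()
gate.observe("Please stop.")
blocked_after_stop = not gate.may_use_tools()
gate.observe("What is on the screen right now?")
still_blocked = not gate.may_use_tools()
gate.observe("OK, resume.")
allowed_after_resume = gate.may_use_tools()
```

An unrelated question while halted leaves the gate closed; only an explicit resume reopens it.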