macos-computer-use
macOS Computer Use
Use this skill when you need to control a macOS app by looking at screenshots and sending real input events like a user.
Before You Start
- If something does not work, ask the user to check the main app that is running the agent.
- Do not guess that the main app is a terminal.
- The main app might be Cursor, an IDE, a plugin host, a desktop app, or something else.
- That main app needs Accessibility permission for mouse moves, clicks, typing, and scrolling.
- That same app may also need Screen & System Audio Recording permission for screenshots.
Simple rule:
- If clicking or typing fails, ask the user to check Accessibility for the main app.
- If screenshots fail, ask the user to check Screen & System Audio Recording for the main app.
Main Idea
Do not rely on a normal full-screen screenshot as your first view of the app.
Why:
- A full-screen screenshot only shows what is visible right now.
- The app you need might be on another monitor or another Space.
- A real window screenshot can still work even when the normal screen screenshot is misleading.
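As a rough illustration of the difference, the sketch below builds the two kinds of screencapture command without running them. The function names are hypothetical helpers, not part of this skill, and the copyable commands the skill actually recommends live in references/examples.md. It assumes you already have a numeric window id.

```python
from typing import List


def window_capture_command(window_id: int, out_path: str) -> List[str]:
    """Build a screencapture command that captures one window by its id.

    Hypothetical helper for illustration; it only builds the argument list.
    """
    if window_id <= 0:
        raise ValueError("window_id must be a positive integer")
    return [
        "screencapture",
        "-x",               # do not play the capture sound
        "-o",               # leave the window shadow out of the image
        f"-l{window_id}",   # capture the window with this window id
        out_path,
    ]


def full_display_capture_command(out_path: str) -> List[str]:
    """Build the plain full-display capture, shown only for comparison."""
    return ["screencapture", "-x", out_path]
```

On macOS, the returned list could be passed to `subprocess.run(..., check=True)`. The window form targets one window by id, while the full-display form captures whatever is currently on screen.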
Fast Workflow
- Find the right window id.
  - See references/examples.md.
- Bring that window to the front.
  - Do this before any click, typing, or scroll.
- Take a screenshot of that exact window.
  - See references/examples.md.
- Find the target spot.
  - Use the window screenshot.
  - Think like a user: look at the window, find the thing, click it.
- Send the input.
  - Click, type, scroll, or drag.
- Check what changed.
  - Take another window screenshot or read app state again.
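The "find the target spot" step usually means turning a pixel in the window screenshot into a point on the screen. The sketch below shows one way to do that arithmetic. It is a hypothetical helper with three assumptions: the screenshot was taken without the window shadow, so pixel (0, 0) is the window's top-left corner; the window origin is given in global screen points with a top-left origin; and the screenshot scale is known (2.0 is a common Retina value, but check the actual image size against the window size).

```python
from typing import Tuple


def window_pixel_to_screen_point(
    pixel: Tuple[float, float],
    window_origin: Tuple[float, float],
    scale: float = 2.0,
) -> Tuple[float, float]:
    """Convert a pixel in a window screenshot to a global screen point.

    pixel:         (x, y) in screenshot pixels, measured from the top-left.
    window_origin: (x, y) of the window's top-left corner in screen points.
    scale:         screenshot pixels per screen point (assumed, not detected).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    px, py = pixel
    ox, oy = window_origin
    # Divide by the scale to go from pixels to points, then offset by the window.
    return (ox + px / scale, oy + py / scale)


# A button seen at pixel (200, 100) in a 2x screenshot of a window whose
# top-left corner sits at (50, 80) maps to the screen point (150.0, 130.0).
print(window_pixel_to_screen_point((200, 100), (50, 80), scale=2.0))
```

If the window moves or resizes between the screenshot and the click, the result is stale, which is one more reason to re-check with a fresh window screenshot.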
What To Read
- Read references/basics.md first for the simple rules.
- Read references/examples.md for copyable command examples.
- Read references/apis.md if you need the Apple API background.
Important Rules
- Prefer a real window screenshot over a full-display screenshot.
- Bring the target window to the front before every click, type, or scroll.
- Use small steps when scrolling.
- For text, paste is usually safer than fake letter-by-letter typing.
- Verify each important step with another screenshot.
- Act like a user. Do not inspect app internals unless the user explicitly asks for that.
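Two of the rules above, small scroll steps and paste over typing, can be sketched as follows. This is a minimal illustration, not the skill's own implementation. `scroll_steps`, `paste_text`, and the default step size of 3 are hypothetical. The paste helper assumes the target window is already frontmost, overwrites the clipboard, and needs the same Accessibility permission as any other synthetic key press.

```python
import subprocess
from typing import List


def scroll_steps(total_lines: int, max_step: int = 3) -> List[int]:
    """Split one large scroll into small signed steps of at most max_step."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    sign = 1 if total_lines >= 0 else -1
    remaining = abs(total_lines)
    steps: List[int] = []
    while remaining > 0:
        step = min(max_step, remaining)
        steps.append(sign * step)
        remaining -= step
    return steps


# AppleScript that presses Cmd+V in whichever app is frontmost.
PASTE_KEYSTROKE = [
    "osascript",
    "-e",
    'tell application "System Events" to keystroke "v" using command down',
]


def paste_text(text: str) -> None:
    """Put text on the clipboard with pbcopy, then press Cmd+V (macOS only)."""
    subprocess.run(["pbcopy"], input=text.encode("utf-8"), check=True)
    subprocess.run(PASTE_KEYSTROKE, check=True)
```

Taking a window screenshot between scroll steps makes it easier to notice when the target has come into view, rather than overshooting it.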
Common Problems
- If screencapture says it cannot create an image from the display, ask the user to check Screen & System Audio Recording permission for the main app.
- If a click or key press does nothing, ask the user to check Accessibility permission for the main app.
- If the screen screenshot shows the wrong thing, switch to a window-specific screenshot.
- If the target window is not frontmost, bring it to the front and then try again.
- If a click is risky, take another screenshot to confirm the result before clicking the next thing.
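The problems listed above amount to a small lookup from symptom to next step. The sketch below writes that lookup down. The symptom labels and the `next_step` function are hypothetical names chosen for illustration; only the advice strings come from this document.

```python
# Hypothetical symptom labels mapped to the advice given in this document.
NEXT_STEPS = {
    "screenshot_failed": (
        "Ask the user to check Screen & System Audio Recording "
        "permission for the main app."
    ),
    "input_ignored": (
        "Ask the user to check Accessibility permission for the main app."
    ),
    "wrong_screen_content": "Switch to a window-specific screenshot.",
    "window_not_frontmost": "Bring the window to the front, then try again.",
}


def next_step(symptom: str) -> str:
    """Return the suggested next step for a known symptom label."""
    try:
        return NEXT_STEPS[symptom]
    except KeyError:
        raise ValueError(f"unknown symptom: {symptom!r}") from None
```

For example, `next_step("input_ignored")` returns the Accessibility advice. The permission always belongs to the main app running the agent, which may not be a terminal.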