screenshot
Screenshot Capture
Follow these save-location rules every time:
- If the user specifies a path, save there.
- If the user asks for a screenshot without a path, save to the OS default screenshot location.
- If Codex needs a screenshot for its own inspection, save to the temp directory.
Tool priority
- Prefer tool-specific screenshot capabilities when available (for example: a Figma MCP/skill for Figma files, or Playwright/agent-browser tools for browsers and Electron apps).
- Use this skill when explicitly asked, for whole-system desktop captures, or when a tool-specific capture cannot get what you need.
- Otherwise, treat this skill as the default for desktop apps without a better-integrated capture tool.
macOS permission preflight (reduce repeated prompts)
On macOS, run the preflight helper once before window/app capture. It checks Screen Recording permission, explains why it is needed, and requests it in one place.
The helpers route Swift's module cache to $TMPDIR/codex-swift-module-cache
to avoid extra sandbox module-cache prompts.
bash <path-to-skill>/scripts/ensure_macos_permissions.sh
To avoid multiple sandbox approval prompts, combine preflight + capture in one command when possible:
bash <path-to-skill>/scripts/ensure_macos_permissions.sh && \
python3 <path-to-skill>/scripts/take_screenshot.py --app "Codex"
For Codex inspection runs, keep the output in temp:
bash <path-to-skill>/scripts/ensure_macos_permissions.sh && \
python3 <path-to-skill>/scripts/take_screenshot.py --app "<App>" --mode temp
Use the bundled scripts to avoid re-deriving OS-specific commands.
macOS and Linux (Python helper)
Run the helper from the repo root:
python3 <path-to-skill>/scripts/take_screenshot.py
Common patterns:
- Default location (user asked for "a screenshot"):
python3 <path-to-skill>/scripts/take_screenshot.py
- Temp location (Codex visual check):
python3 <path-to-skill>/scripts/take_screenshot.py --mode temp
- Explicit location (user provided a path or filename):
python3 <path-to-skill>/scripts/take_screenshot.py --path output/screen.png
- App/window capture by app name (macOS only; substring match is OK; captures all matching windows):
python3 <path-to-skill>/scripts/take_screenshot.py --app "Codex"
- Specific window title within an app (macOS only):
python3 <path-to-skill>/scripts/take_screenshot.py --app "Codex" --window-name "Settings"
- List matching window ids before capturing (macOS only):
python3 <path-to-skill>/scripts/take_screenshot.py --list-windows --app "Codex"
- Pixel region (x,y,w,h):
python3 <path-to-skill>/scripts/take_screenshot.py --mode temp --region 100,200,800,600
- Focused/active window (captures only the frontmost window; use
--appto capture all windows):
python3 <path-to-skill>/scripts/take_screenshot.py --mode temp --active-window
- Specific window id (use --list-windows on macOS to discover ids):
python3 <path-to-skill>/scripts/take_screenshot.py --window-id 12345
The script prints one path per capture. When multiple windows or displays match, it prints multiple paths (one per line) and adds suffixes like -w<windowId> or -d<display>. View each path sequentially with the image viewer tool, and only manipulate images if needed or requested.
Workflow examples
- "Take a look at and tell me what you see": capture to temp, then view each printed path in order.
bash <path-to-skill>/scripts/ensure_macos_permissions.sh && \
python3 <path-to-skill>/scripts/take_screenshot.py --app "<App>" --mode temp
- "The design from Figma is not matching what is implemented": use a Figma MCP/skill to capture the design first, then capture the running app with this skill (typically to temp) and compare the raw screenshots before any manipulation.
Multi-display behavior
- On macOS, full-screen captures save one file per display when multiple monitors are connected.
- On Linux and Windows, full-screen captures use the virtual desktop (all monitors in one image); use
--regionto isolate a single display when needed.
Linux prerequisites and selection logic
The helper automatically selects the first available tool:
scrotgnome-screenshot- ImageMagick
import
If none are available, ask the user to install one of them and retry.
Coordinate regions require scrot or ImageMagick import.
--app, --window-name, and --list-windows are macOS-only. On Linux, use
--active-window or provide --window-id when available.
Windows (PowerShell helper)
Run the PowerShell helper:
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1
Common patterns:
- Default location:
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1
- Temp location (Codex visual check):
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1 -Mode temp
- Explicit path:
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1 -Path "C:\Temp\screen.png"
- Pixel region (x,y,w,h):
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1 -Mode temp -Region 100,200,800,600
- Active window (ask the user to focus it first):
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1 -Mode temp -ActiveWindow
- Specific window handle (only when provided):
powershell -ExecutionPolicy Bypass -File <path-to-skill>/scripts/take_screenshot.ps1 -WindowHandle 123456
Direct OS commands (fallbacks)
Use these when you cannot run the helpers.
macOS
- Full screen to a specific path:
screencapture -x output/screen.png
- Pixel region:
screencapture -x -R100,200,800,600 output/region.png
- Specific window id:
screencapture -x -l12345 output/window.png
- Interactive selection or window pick:
screencapture -x -i output/interactive.png
Linux
- Full screen:
scrot output/screen.png
gnome-screenshot -f output/screen.png
import -window root output/screen.png
- Pixel region:
scrot -a 100,200,800,600 output/region.png
import -window root -crop 800x600+100+200 output/region.png
- Active window:
scrot -u output/window.png
gnome-screenshot -w -f output/window.png
Error handling
- On macOS, run
bash <path-to-skill>/scripts/ensure_macos_permissions.shfirst to request Screen Recording in one place. - If you see "screen capture checks are blocked in the sandbox", "could not create image from display", or Swift
ModuleCachepermission errors in a sandboxed run, rerun the command with escalated permissions. - If macOS app/window capture returns no matches, run
--list-windows --app "AppName"and retry with--window-id, and make sure the app is visible on screen. - If Linux region/window capture fails, check tool availability with
command -v scrot,command -v gnome-screenshot, andcommand -v import. - If saving to the OS default location fails with permission errors in a sandbox, rerun the command with escalated permissions.
- Always report the saved file path in the response.
More from firecrawl/skills
firecrawl-build-scrape
Integrate Firecrawl `/scrape` into product code for single-page extraction. Use when an app already has a URL and needs markdown, HTML, links, screenshots, metadata, or structured page output. Prefer this skill over broader crawl patterns when the feature is page-level.
19.8Kfirecrawl-build-search
Integrate Firecrawl `/search` into product code and agent workflows. Use when an app needs discovery before extraction, when the feature starts with a query instead of a URL, or when the system should search the web and optionally hydrate result content.
19.7Kfirecrawl-build-interact
Integrate Firecrawl `/interact` into product code for dynamic pages and browser actions after scraping. Use when a feature needs clicks, form fills, pagination, authentication-aware flows, or other multi-step interactions that plain `/scrape` cannot complete.
19.7Kfirecrawl-build-onboarding
Get Firecrawl credentials and SDK setup into a project. Use when an application needs `FIRECRAWL_API_KEY`, when an agent should add Firecrawl to `.env`, when the user wants to authenticate Firecrawl for app code, or when choosing the first SDK and docs for a new Firecrawl integration. This skill includes its own browser auth flow, so it does not depend on the website onboarding skill.
19.7Kfirecrawl-build-map
Integrate Firecrawl `/map` into product code for URL discovery on a known site. Use when a feature needs to find pages before scraping or crawling, especially on large docs sites, blogs, or help centers where the exact target URLs are not known yet.
554firecrawl-build-crawl
Integrate Firecrawl `/crawl` into product code for bulk extraction across a site or site section. Use when a feature needs many related pages, such as documentation sets, help centers, or blogs, and page-by-page `/scrape` would be too manual.
554