actionbook

Summary

Pre-verified page actions and selectors for website automation without runtime discovery.

  • Search a library of documented page interactions by task intent, then retrieve structured DOM details with tested CSS selectors ready for browser commands
  • Browser commands cover navigation, form filling, clicking, text extraction, screenshots, and waiting for page changes
  • Handles login walls by pausing automation and asking the user to complete authentication manually in the same session
  • Daemon mode (Unix/CDP) maintains a persistent WebSocket connection per profile, eliminating per-command connection overhead
  • Falls back to live accessibility tree snapshots when selectors become outdated due to website changes
SKILL.md

When to Use This Skill

Activate when the user:

  • Needs to do anything on a website ("Send a LinkedIn message", "Book an Airbnb", "Search Google for...")
  • Asks how to interact with a site ("How do I post a tweet?", "How to apply on LinkedIn?")
  • Wants to fill out forms, click buttons, navigate, search, filter, or browse on a specific site
  • Wants to take a screenshot of a web page or monitor changes
  • Builds browser-based AI agents, web scrapers, or E2E tests for external websites
  • Automates repetitive web tasks (data entry, form submission, content posting)
  • Needs to operate multiple websites or tabs concurrently

How It Works

Actionbook provides up-to-date action manuals for the modern web. Action manuals tell agents exactly what to do on a page — no parsing, no guessing.

Why this matters:

  • 10x faster — action manuals provide selectors and page structure upfront. No snapshot-per-step loop needed.
  • Accurate — handles SPAs, streaming components, dropdowns, date pickers, and dynamic content reliably.
  • Concurrent — stateless architecture with explicit --session/--tab. Operate dozens of tabs in parallel.

The workflow:

  1. Start a browser session
  2. Navigate to the target page
  3. Snapshot to get the page structure with element refs
  4. Automate using refs from the snapshot

Run actionbook <command> --help for full usage and examples of any command.

Browser Automation

Every browser command is stateless: pass --session and --tab explicitly. There is no implicit "current tab", so commands can target any session/tab in parallel.
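Because each command names its session and tab, the repetition can be wrapped in a small shell helper, and several tabs can be driven at once with ordinary background jobs. The following is a minimal sketch, not part of Actionbook: the function names ab and ab_parallel are hypothetical, and it assumes the listed tabs already exist in the session.

```shell
# Hypothetical helper: run one browser command against an explicit session/tab.
# Usage: ab <session> <tab> <subcommand> [args...]
ab() {
  session=$1
  tab=$2
  shift 2
  actionbook browser "$@" --session "$session" --tab "$tab"
}

# Hypothetical helper: run the same subcommand on several tabs in parallel.
# Usage: ab_parallel <session> <subcommand> <tab>...
ab_parallel() {
  session=$1
  subcommand=$2
  shift 2
  for tab in "$@"; do
    # Each call runs as a background job, so the tabs do not block each other.
    ab "$session" "$tab" "$subcommand" &
  done
  wait  # block until every background command has finished
}
```

For example, ab s1 t1 click @e7 expands to the same click command shown in the core workflow, and ab_parallel s1 snapshot t1 t2 snapshots two tabs concurrently (t2 is an assumed tab id).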

Start a session

actionbook browser start --set-session-id s1

Core workflow: snapshot, act, wait

actionbook browser goto <url> --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1          # Get page structure with refs
actionbook browser fill @e3 "text" --session s1 --tab t1   # Use refs from snapshot
actionbook browser click @e7 --session s1 --tab t1
actionbook browser wait navigation --session s1 --tab t1   # Wait for page load

Snapshot refs

The snapshot command labels every element with a ref (e.g. @e3, @e7). Use these refs as selectors in any command; they are the recommended way to target elements.

Refs are stable across snapshots — if the element stays the same, the ref stays the same. This lets you chain multiple commands without re-snapshotting after every step.
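Since refs survive between steps, one snapshot can feed several fill commands in a row. A minimal sketch of that chaining, assuming the refs were read from a snapshot taken just before; the helper name fill_many is hypothetical, and the refs in the usage note are placeholders:

```shell
# Hypothetical helper: fill several fields using refs from one earlier snapshot.
# Usage: fill_many <session> <tab> <ref> <value> [<ref> <value>...]
fill_many() {
  session=$1
  tab=$2
  shift 2
  while [ "$#" -ge 2 ]; do
    # Stop at the first failing fill so later steps do not run on a bad page state.
    actionbook browser fill "$1" "$2" --session "$session" --tab "$tab" || return 1
    shift 2
  done
  # A leftover argument means a ref was given without a value.
  [ "$#" -eq 0 ]
}
```

A call such as fill_many s1 t1 @e3 "San Francisco" @e5 "2 guests" issues two fill commands without re-snapshotting in between. If the page changes substantially, take a new snapshot before continuing.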

Command categories

All commands support --help for full usage and examples.

Category     Key commands                                                           Help
Session      start, close, restart, list-sessions, status                           actionbook browser start --help
Tab          new-tab, close-tab, list-tabs                                          actionbook browser new-tab --help
Navigation   goto, back, forward, reload                                            actionbook browser goto --help
Observation  snapshot, text, html, value, screenshot, title, url                    actionbook browser snapshot --help
Interaction  click, fill, type, press, select, hover, scroll                        actionbook browser click --help
Wait         wait element, wait navigation, wait network-idle, wait condition       actionbook browser wait element --help
Cookies      cookies list, cookies get, cookies set, cookies delete, cookies clear  actionbook browser cookies list --help
Storage      local-storage list|get|set|delete|clear, session-storage ...           actionbook browser local-storage get --help
Logs         logs console, logs errors                                              actionbook browser logs console --help
Query        query one|all|nth|count                                                actionbook browser query --help

Full command reference: command-reference.md

Example: End-to-End

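User request: "Find a room next week in SF on Airbnb" (the commands below cover the opening search steps only; the refs shown are illustrative and must be read from the snapshot output)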

actionbook browser start --set-session-id s1
actionbook browser goto "https://airbnb.com" --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1
actionbook browser fill @e3 "San Francisco" --session s1 --tab t1
actionbook browser click @e7 --session s1 --tab t1
actionbook browser wait navigation --session s1 --tab t1

Selectors

Selectors should come from actionbook browser snapshot — not from prior knowledge or memory. Always snapshot first to get current refs, then use those refs to interact with the page.
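One way to enforce the snapshot-first rule in a script is to reject anything that does not look like a snapshot ref before it reaches a command. This is a sketch under an assumption: the "@e" plus digits shape is inferred only from the examples in this document (@e3, @e7), and both function names are hypothetical.

```shell
# Return success if the argument looks like a snapshot ref such as @e3.
# Assumption: refs are "@e" followed by one or more digits.
is_snapshot_ref() {
  case $1 in
    @e | @e*[!0-9]*) return 1 ;;  # bare "@e", or a non-digit after "@e"
    @e[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical guard: click only when the selector is a snapshot ref.
# Usage: click_ref <ref> <session> <tab>
click_ref() {
  if ! is_snapshot_ref "$1"; then
    echo "not a snapshot ref: $1 (run snapshot first)" >&2
    return 2
  fi
  actionbook browser click "$1" --session "$2" --tab "$3"
}
```

With this guard, a selector recalled from memory such as ".search-button" is refused, while click_ref @e7 s1 t1 runs the normal click command.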

Login Page Handling

When you hit a login/auth wall (sign-in page, password prompt, MFA/OTP, CAPTCHA, account chooser):

  1. Pause automation and keep the current browser session open (same tab/profile/cookies).
  2. Ask the user to complete login manually in that same browser window.
  3. After the user confirms login is done, continue in the same session.
  4. If the post-login page is different, run actionbook browser snapshot to get the new page structure before continuing.

Do not switch tools just because a login page appears.
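The pause-and-resume steps above can be sketched in shell as follows. Both function names are hypothetical, and the URL patterns are an illustrative heuristic rather than an Actionbook rule; the only Actionbook command used is the snapshot call already shown in this document.

```shell
# Heuristic only: guess whether a URL points at a login page.
# The path fragments are illustrative assumptions; adjust them per site.
looks_like_login() {
  case $1 in
    */login* | */signin* | */sign-in*) return 0 ;;
    *) return 1 ;;
  esac
}

# Pause until the user confirms login, then re-snapshot the same session/tab.
# Usage: pause_for_login <session> <tab>
pause_for_login() {
  printf 'Login required. Sign in using the open browser window, then press Enter: ' >&2
  read -r _reply
  # Same session and tab, so cookies and profile state are preserved.
  actionbook browser snapshot --session "$1" --tab "$2"
}
```

The URL to test could come from actionbook browser url --session s1 --tab t1, assuming that command prints the bare page URL; check its --help output before relying on that.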

References

Reference             Description
command-reference.md  Complete command reference with all flags and options
authentication.md     Login flows, OAuth, 2FA handling, session persistence
First Seen
Jan 23, 2026