cli-anything-browser
A command-line interface for browser automation using DOMShell's MCP server. Navigate web pages using filesystem commands: ls, cd, cat, grep, click.
Installation
Prerequisites
- Node.js and npx (for DOMShell MCP server):
  # Install Node.js from https://nodejs.org/
  npx --version
- Chrome/Chromium with DOMShell extension:
  - Install extension in Chrome
  - Ensure Chrome is running before using CLI
- Python 3.10+
Install CLI
cd browser/agent-harness
pip install -e .
Command Groups
page — Page Navigation
- page open <url>: Navigate to URL
- page reload: Reload current page
- page back: Navigate back in history
- page forward: Navigate forward in history
- page info: Show current page info
fs — Filesystem Commands (Accessibility Tree)
- fs ls [path]: List elements at path
- fs cd <path>: Change directory
- fs cat [path]: Read element content
- fs grep <pattern> [path]: Search for text pattern
- fs pwd: Print working directory
act — Action Commands
- act click <path>: Click an element
- act type <path> <text>: Type text into input
session — Session Management
- session status: Show session state
- session daemon-start: Start persistent daemon mode
- session daemon-stop: Stop daemon mode
Usage Examples
Basic Navigation
# Open a page
cli-anything-browser page open https://example.com
# Explore structure
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
cli-anything-browser fs ls
# Go back to root
cli-anything-browser fs cd /
Search and Click
cli-anything-browser fs grep "Login"
cli-anything-browser act click /main/button[0]
Form Fill
cli-anything-browser act type /main/input[0] "user@example.com"
cli-anything-browser act click /main/button[0]
JSON Output
cli-anything-browser --json fs ls /
Daemon Mode (Faster Interactive Use)
# Start persistent connection
cli-anything-browser session daemon-start
# Run commands (uses persistent connection)
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
# Stop daemon when done
cli-anything-browser session daemon-stop
Interactive REPL
cli-anything-browser
Path Syntax
DOMShell uses a filesystem-like path for the Accessibility Tree:
/ — Root (document)
/main — Main landmark
/main/div[0] — First div in main
/main/div[0]/button[2] — Third button in first div
- Array indices are 0-based: button[0] is the first button
- Use .. to go up one level
- Use / for root
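The path rules above can be illustrated with a small parser. This is a minimal sketch, not DOMShell's own implementation: `Segment` and `parse_path` are hypothetical names introduced here, and the handling of `..` at the root (stay at root) is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    index: Optional[int]  # None when the segment has no [n] suffix


# A segment is a name optionally followed by a 0-based index, e.g. button[2].
_SEGMENT_RE = re.compile(r"^([^\[\]/]+)(?:\[(\d+)\])?$")


def parse_path(path: str) -> list[Segment]:
    """Split an absolute path such as /main/div[0]/button[2] into segments."""
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {path!r}")
    segments: list[Segment] = []
    for part in path.split("/"):
        if part in ("", "."):
            continue  # skip empty parts from leading or doubled slashes
        if part == "..":
            if segments:
                segments.pop()  # go up one level; assumed to stay at root
            continue
        match = _SEGMENT_RE.match(part)
        if match is None:
            raise ValueError(f"malformed segment: {part!r}")
        name, index = match.group(1), match.group(2)
        segments.append(Segment(name, int(index) if index is not None else None))
    return segments
```

With this sketch, `/` parses to an empty list (the root), and `/main/div[0]/..` resolves to the single segment `main`.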
Agent-Specific Guidance
JSON Output for Parsing
All commands support the --json flag for machine-readable output:
cli-anything-browser --json fs ls /
Returns:
{
"path": "/",
"entries": [
{"name": "main", "role": "landmark", "path": "/main"}
]
}
Error Handling
The CLI provides clear error messages for common issues:
- npx not found: Install Node.js from https://nodejs.org/
- DOMShell not found: Run npx @apireno/domshell --version
- MCP call failed: Install DOMShell Chrome extension
Check is_available() return value before running commands.
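The source does not say which module exposes is_available(), so the sketch below does not call it. Instead it shows a comparable preflight check an agent could run itself: confirm that `npx` is on the PATH before invoking the CLI. `missing_prerequisites` is a hypothetical helper.

```python
import shutil


def missing_prerequisites(required=("npx",), which=shutil.which) -> list[str]:
    """Return the names of required executables that are not found on PATH."""
    # `which` is injectable so the check can be exercised without touching PATH.
    return [tool for tool in required if which(tool) is None]
```

An empty result means the listed tools were found; it does not prove that the DOMShell extension is installed or that Chrome is running.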
Daemon Mode for Efficiency
For agent workflows with multiple commands, use daemon mode:
- Start daemon: cli-anything-browser session daemon-start
- Run commands: Each command reuses the MCP connection
- Stop daemon: cli-anything-browser session daemon-stop
This avoids the 1-3 second cold start overhead for each command.
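One way to make sure the daemon is always stopped, even when a command fails midway, is to wrap the start and stop calls in a context manager. This is a sketch under the assumption that both session commands exit with status 0 on success; `daemon` is a hypothetical helper, and the `run` parameter exists only so the wrapper can be tested without a browser.

```python
import subprocess
from contextlib import contextmanager

CLI = "cli-anything-browser"


@contextmanager
def daemon(run=subprocess.run):
    """Start daemon mode on entry and stop it on exit, including on errors."""
    run([CLI, "session", "daemon-start"], check=True)
    try:
        yield
    finally:
        # check=False: a failed stop should not mask an error from the body.
        run([CLI, "session", "daemon-stop"], check=False)
```

Inside a `with daemon():` block, each `subprocess.run([CLI, "--json", "fs", "ls", "/"], check=True)` call would then reuse the persistent connection.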
Security Considerations
IMPORTANT: When using this CLI with AI agents, be aware of the following security considerations:
URL Restrictions
The browser harness validates all URLs before navigation:
- Explicit scheme required: URLs must include http:// or https:// scheme (scheme-less URLs like example.com are rejected)
- Blocked schemes: file://, javascript://, data://, vbscript://, about://, chrome://, and browser-internal schemes
- Allowed schemes: http:// and https:// only (configurable via CLI_ANYTHING_BROWSER_ALLOWED_SCHEMES)
- Private network blocking: Optional via CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true (disabled by default)
DOM Content Risks
The Accessibility Tree includes all visible and hidden elements on a page. Malicious websites could:
- Craft ARIA labels with manipulative text (e.g., "Ignore previous instructions")
- Use aria-hidden elements to inject content not visible to users
- Create confusing DOM structures that mislead navigation
Mitigation: When interacting with untrusted websites, consider:
- Using the --json flag for structured output that's easier to parse safely
- Sanitizing or filtering DOM content before including it in prompts
- Limiting navigation to trusted domains
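As one illustration of the sanitizing step, the sketch below normalizes a label taken from the Accessibility Tree before it is placed in a prompt. `sanitize_label` is a hypothetical helper, and the 200-character limit is an arbitrary assumption. It only removes invisible characters and bounds the length; it does not detect manipulative wording.

```python
import unicodedata


def sanitize_label(text: str, max_len: int = 200) -> str:
    """Normalize untrusted element text: strip invisible characters, bound length."""
    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue  # drop invisible format characters such as zero-width spaces
        # Replace control characters (newlines, tabs, etc.) with a plain space.
        kept.append(" " if category == "Cc" else ch)
    # Collapse runs of whitespace, then truncate.
    return " ".join("".join(kept).split())[:max_len]
```

Treating the result as data (for example, quoting it or passing it in a separate field) is still advisable, since cleaned text can remain adversarial.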
Private Network Access
By default, the browser can access localhost and private networks (192.168.x.x, 10.x.x.x, etc.). To block:
export CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true
cli-anything-browser page open http://localhost:8080 # Will be blocked
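The source does not describe how the harness decides that a host is private. A rough sketch of such a check, using Python's standard `ipaddress` module, might look like the following. `is_private_host` is a hypothetical helper, and it does not resolve DNS names, so a public-looking hostname that points at a private address would pass.

```python
import ipaddress


def is_private_host(host: str) -> bool:
    """Return True for localhost names and private, loopback, or link-local IPs."""
    lowered = host.lower()
    if lowered == "localhost" or lowered.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

The host to test can be taken from `urllib.parse.urlsplit(url).hostname`, which strips the port and any brackets around IPv6 literals.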
Session Isolation
Multiple browser sessions share the same Chrome instance. Cookies and authentication state may persist across sessions. For sensitive operations, consider:
- Using Chrome's guest mode or incognito
- Clearing cookies between sessions
- Using separate Chrome profiles for different security contexts