Browser
Customization
Before executing, check for user customizations at:
~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Browser/
If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.
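The check above can be sketched as a small shell function. The directory path and the PREFERENCES.md filename come from this document; the function name `customization_status` and the `CUSTOM_DIR` override variable are hypothetical, introduced only for illustration.

```shell
# Sketch: report whether user customizations exist for the Browser skill.
# CUSTOM_DIR and customization_status are illustrative names, not part of the skill.
custom_dir="${CUSTOM_DIR:-$HOME/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Browser}"

customization_status() {
  local dir="$1"
  if [ -d "$dir" ]; then
    if [ -f "$dir/PREFERENCES.md" ]; then
      # Directory and preferences file both present: overrides apply.
      echo "custom:preferences"
    else
      # Directory present but no PREFERENCES.md: other resources may still apply.
      echo "custom:no-preferences"
    fi
  else
    # No customization directory: proceed with skill defaults.
    echo "defaults"
  fi
}

customization_status "$custom_dir"
```

The function only reports what it finds; loading and applying the files is left to whatever invokes the skill.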
MANDATORY: Voice Notification (REQUIRED BEFORE ANY ACTION)
You MUST send this notification BEFORE doing anything else when this skill is invoked.
- Send voice notification:
curl -s -X POST http://localhost:31337/notify \
  -H "Content-Type: application/json" \
  -d '{"message": "Running the WORKFLOWNAME workflow in the Browser skill to ACTION"}' \
  > /dev/null 2>&1 &
- Output text notification:
Running the **WorkflowName** workflow in the **Browser** skill to ACTION...
This is not optional. Execute this curl command immediately upon skill invocation.
Browser v10.0.0 — Browser Automation
Tool: agent-browser — headless Rust CLI daemon with persistent auth profiles.
If agent-browser isn't working or a site has bot detection, use the Interceptor skill instead. Interceptor is a Chrome extension with zero CDP fingerprint — passes all major bot detection checks.
Does the site need auth?
Use --profile ~/.agent-browser/profiles/<site>. If the profile exists, auth is automatic. If it does not, run with --headed once to log in; every later run can be headless.
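The decision above can be sketched as a helper that picks the flags for a site. The `--headed` and `--profile` flags and the profile path are taken from this document; `PROFILE_ROOT` and `browser_flags` are hypothetical names, and the sketch assumes that an existing profile directory means a completed login.

```shell
# Sketch: choose agent-browser flags based on whether a site profile exists.
# PROFILE_ROOT and browser_flags are illustrative names, not part of the CLI.
PROFILE_ROOT="${PROFILE_ROOT:-$HOME/.agent-browser/profiles}"

browser_flags() {
  local site="$1"
  local profile="$PROFILE_ROOT/$site"
  if [ -d "$profile" ]; then
    # Profile directory exists: reuse it headlessly.
    echo "--profile $profile"
  else
    # No profile yet: open a visible window once so the user can log in.
    echo "--headed --profile $profile"
  fi
}

# Example usage (unquoted on purpose so the flags split into separate words;
# this breaks if the profile path contains spaces):
# agent-browser $(browser_flags example) open https://example.com
```

Before a first headed run, close any running daemon with agent-browser close --all, as described in the first-time setup steps.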
agent-browser
Native Rust daemon. Persistent profiles for auth. Headless by default.
Quick One-Shot Commands
agent-browser open https://example.com && agent-browser screenshot /tmp/shot.png
agent-browser open https://example.com && agent-browser screenshot --full /tmp/full.png
agent-browser open https://example.com && agent-browser pdf /tmp/page.pdf
Session-Based Interaction
# 1. OPEN
agent-browser open https://example.com
# 2. WORK
agent-browser snapshot # a11y tree with @eN refs (for AI)
agent-browser click @e12 # click by ref
agent-browser fill @e15 "hello" # fill input by ref
agent-browser screenshot /tmp/shot.png # screenshot
agent-browser eval "document.title" # run JS
# 3. CLOSE — when done
agent-browser close
Authenticated Browsing (Per-Site Profiles)
First-time setup (headed, one-time):
# Close any running daemon first
agent-browser close --all
# Launch headed with persistent profile — log in manually
agent-browser --headed --profile ~/.agent-browser/profiles/<site> open https://example.com
# After login completes, all future runs reuse the profile headlessly
Subsequent runs (headless, automatic):
agent-browser --profile ~/.agent-browser/profiles/<site> open https://example.com
# Auth is automatic — cookies, IndexedDB, cache all persist
To add a new site: Close daemon, run --headed --profile ~/.agent-browser/profiles/<name> once, log in, done.
Auth Vault (Alternative)
agent-browser auth save mysite --url https://example.com --username user --password-stdin
agent-browser auth login mysite # auto-fills login form
agent-browser auth list # show saved profiles
Batch Execution
# Send multiple commands in one shot (fewer tool calls = fewer tokens)
echo '[["open","https://example.com"],["snapshot"],["click","@e12"]]' | agent-browser batch
Advanced Features
# Connect to already-running Chrome
agent-browser --auto-connect snapshot
# Network interception
agent-browser route "**/*.{png,jpg}" abort # block images
agent-browser route "https://api.com/*" mock '{"data":"test"}'
# Device emulation
agent-browser --device "iPhone 15" open https://example.com
# Session persistence (cookies + localStorage by name)
agent-browser --session-name myapp open https://example.com
agent-browser Rules
- Daemon model — first command starts daemon, subsequent commands connect instantly.
- Refs use @eN syntax: @e12, not e12.
- Profiles persist everything: cookies, IndexedDB, cache, localStorage.
- Close with agent-browser close or close --all to kill the daemon.
Delegating Browser Work to Agents
When you need parallel or background browser work (scraping multiple pages, monitoring), spawn general-purpose agents with browser instructions. No dedicated browser agent type needed — this skill IS the expertise.
Agent(subagent_type="general-purpose", prompt="
Use agent-browser CLI for all browser work.
Commands: open <url>, snapshot, click @eN, fill @eN 'text', screenshot /path.
For authenticated sites: --profile ~/.agent-browser/profiles/<site>
Refs use @eN syntax from snapshots.
[your specific task instructions here]
")
For parallel isolation, each agent uses --session <name>:
Agent 1: agent-browser --session scrape1 open https://site-a.com
Agent 2: agent-browser --session scrape2 open https://site-b.com
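When a single script drives several sessions, rather than separate agents, the same isolation can be sketched as a loop that opens each URL in its own named session. Only `--session <name>` and `open <url>` come from this document; `open_in_sessions`, the `scrapeN` naming scheme, and the `AGENT_BROWSER` override are hypothetical, and the sketch assumes the daemon accepts concurrent commands.

```shell
# Sketch: open each URL in its own isolated session, in parallel.
# AGENT_BROWSER lets you substitute another binary (for example, echo for a dry run).
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

open_in_sessions() {
  local i=1
  local url
  for url in "$@"; do
    # One background job per URL, each with a distinct session name.
    "$AGENT_BROWSER" --session "scrape$i" open "$url" &
    i=$((i + 1))
  done
  # Block until every background open has finished.
  wait
}

# Example usage:
# open_in_sessions https://site-a.com https://site-b.com
```

Setting AGENT_BROWSER=echo prints the commands instead of running them, which is a quick way to check the session names before touching a real site.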
Fallback: If agent-browser fails or the site has bot detection, use the Interceptor skill instead.
Legacy built-in agents — DEPRECATED, do not invoke. BrowserAgent and UIReviewer are Claude Code built-ins whose internals cannot be modified; they run browser automation that PAI no longer uses. Route all browser work through the Interceptor skill (verification, authenticated flows) or agent-browser (headless scraping).
Workflow Routing
| Trigger Words | Workflow | What It Does |
|---|---|---|
| "review stories", "run stories", "ui review", "validate stories" | Workflows/ReviewStories.md | Fan out YAML stories to parallel UIReviewers |
| "automate", "recipe", "template", or a recipe name | Workflows/Automate.md | Load and execute a parameterized recipe template |
| "update", "check version" | Workflows/Update.md | Verify browser tools are current and working |
Stories — YAML User Story Validation
Define user stories in YAML and validate them in parallel with UIReviewer agents.
Directory: skills/Browser/Stories/
name: App Name
url: https://example.com
stories:
  - name: Story name
    steps:
      - action: click
        target: "LLM-readable description"
    assertions:
      - type: snapshot_contains
        text: "expected text"
Run with: "review stories" or "run stories in HackerNews.yaml"
Recipes — Parameterized Templates
Reusable Markdown templates with {PROMPT} injection.
Directory: skills/Browser/Recipes/
| Recipe | Description | Tool |
|---|---|---|
| SummarizePage.md | Extract content summary | BrowserAgent |
| ScreenshotCompare.md | Before/after comparison | agent-browser |
| FormFill.md | Fill form fields | agent-browser |
Run with: "automate SummarizePage for https://example.com"
Execution Log
After completing any workflow, append a single JSONL entry:
echo '{"ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","skill":"Browser","workflow":"WORKFLOW_USED","input":"8_WORD_SUMMARY","status":"ok|error","duration_s":SECONDS}' >> ~/.claude/PAI/MEMORY/SKILLS/execution.jsonl
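The log line above can be wrapped in a function so the placeholders become arguments. The field names and the default log path come from this document; `log_execution` and the `LOG_FILE` override are hypothetical names. This is a minimal sketch that does not escape quotes or backslashes in its arguments, so keep the summary to plain words.

```shell
# Sketch: append one JSONL entry per completed workflow.
# log_execution and LOG_FILE are illustrative names, not part of the skill.
LOG_FILE="${LOG_FILE:-$HOME/.claude/PAI/MEMORY/SKILLS/execution.jsonl}"

log_execution() {
  local workflow="$1" summary="$2" status="$3" duration="$4"
  # Create the log directory on first use.
  mkdir -p "$(dirname "$LOG_FILE")"
  # duration is written unquoted so it stays a JSON number.
  printf '{"ts":"%s","skill":"Browser","workflow":"%s","input":"%s","status":"%s","duration_s":%s}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$workflow" "$summary" "$status" "$duration" >> "$LOG_FILE"
}

# Example usage (status is "ok" or "error"):
# log_execution "Update" "check browser tool versions" "ok" 3
```

In bash, the duration can be measured by saving the built-in SECONDS counter before the workflow starts and passing the difference when it ends.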