tauri-mcp-bridge
Tauri MCP Bridge — AI Agent ↔ Tauri App
Architecture: Two-Part System
The MCP bridge consists of two parts that MUST both be installed:
Part 1: Node.js MCP Server (runs on the AI client side)
- npm package: @hypothesi/tauri-mcp-server v0.9.0
- Exposes 20 tools + 3 slash commands to the AI client
- Communicates with the Tauri plugin via WebSocket
Part 2: Rust Plugin (runs inside the Tauri app)
- Crate: tauri-plugin-mcp-bridge v0.9.0
- Starts a WebSocket server on port 9223 (scans up to 9322 if busy)
- Handles commands: IPC monitoring, JS execution, screenshots, window management
┌─────────────────────┐ WebSocket ┌─────────────────────┐
│ AI Client │ (port 9223) │ Tauri App │
│ (Claude/Cursor/ │◄───────────────────────────►│ (cargo tauri dev) │
│ Windsurf/etc.) │ │ │
│ │ │ Rust Plugin: │
│ MCP Server: │ ┌─────────────────┐ │ tauri-plugin- │
│ @hypothesi/ │ │ Commands: │ │ mcp-bridge │
│ tauri-mcp-server │──►│ screenshot │──────►│ │
│ │ │ execute_js │ │ WebSocket Server │
│ 20 tools │ │ ipc_monitor │ │ ├─ dispatch_cmd │
│ 3 slash commands │ │ find_element │ │ ├─ IPC monitor │
│ │ │ ... │ │ └─ script registry │
└─────────────────────┘ └─────────────────┘ └─────────────────────┘
Neither side works alone. The MCP server without the plugin has no app to connect to. The plugin without the MCP server has no AI client to serve.
Slash Commands
The MCP server provides 3 slash commands for quick agent interaction:
| Command | Description |
|---|---|
| /setup | Runs setup diagnostics: checks whether the Tauri app is running, verifies the WebSocket connection, reports config issues, and offers to fix problems |
| /fix-webview-errors | Diagnoses and fixes common WebView errors: missing withGlobalTauri, permission failures, script injection issues |
| /select | Activates the visual element picker: the user clicks an element and the agent receives full metadata (tag, id, classes, attributes, bounding rect, CSS selector, XPath, computed styles, parent chain, screenshot) |
Selector Strategy Parameter
Many webview tools accept a strategy parameter that controls how selector is interpreted:
| Strategy | Value | Example Selector | Description |
|---|---|---|---|
| CSS (default) | "css" | "#login-btn", ".card > h2" | Standard CSS selectors |
| XPath | "xpath" | "//button[@type='submit']" | XPath 1.0 expressions |
| Text content | "text" | "Sign In" | Matches elements containing exact text |
| ARIA label | "aria-label" | "Close dialog" | Matches aria-label attribute value |
| Ref ID | "ref" | "ref-42" | Uses ref IDs from webview_dom_snapshot accessibility output |
Tools supporting strategy: webview_dom_snapshot, webview_get_styles, webview_find_element, webview_interact, webview_wait_for
The ref strategy is especially useful: call webview_dom_snapshot with type: "accessibility", get ref IDs from the YAML output, then use strategy: "ref" with those IDs to target specific elements without fragile CSS selectors.
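As a rough illustration of how a selector and its strategy travel together, the sketch below builds the argument object for a selector-based tool. The `Strategy` type mirrors the table above; `SelectorArgs` and `selectorArgs` are hypothetical helper names, not part of the MCP server.

```typescript
// Selector strategies listed in the table above.
type Strategy = "css" | "xpath" | "text" | "aria-label" | "ref";

// Hypothetical argument shape for tools that accept a selector.
interface SelectorArgs {
  selector: string;
  strategy?: Strategy;
}

// Build selector arguments, omitting `strategy` when it is the default ("css").
function selectorArgs(selector: string, strategy: Strategy = "css"): SelectorArgs {
  return strategy === "css" ? { selector } : { selector, strategy };
}

// A ref ID taken from an accessibility snapshot, and a plain CSS selector.
const byRef = selectorArgs("ref-42", "ref");
const byCss = selectorArgs("#login-btn");
```

The resulting objects could be passed as the arguments of webview_find_element or webview_interact (the latter also needs an action).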
Tool Categories — Complete Schema Reference
SETUP & CONFIG (3 tools)
get_setup_instructions
Returns setup instructions for the MCP bridge plugin.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| (none) | — | — | — | No parameters |
Returns: Text with installation steps covering Cargo.toml, lib.rs, tauri.conf.json, and capabilities setup.
The returned text covers:
- Adding the crate dependency
- Registering the plugin in the Tauri builder
- Enabling withGlobalTauri in tauri.conf.json
- Adding mcp-bridge:default permission to capabilities
- Running cargo tauri dev and verifying the WebSocket server starts
driver_session
Manages bridge session lifecycle. MUST be called with action: "start" before using any other tool.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| action | string | ✅ | — | "start" \| "stop" \| "status" | Session lifecycle action |
| host | string | ❌ | "127.0.0.1" | Valid hostname/IP | WebSocket host to connect to |
| port | number | ❌ | 9223 | 1–65535 | WebSocket port |
| appIdentifier | string \| number | ❌ | — | Port number or bundle ID | Target a specific running app |
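The constraints in this table can be checked before a call is made. The following is a minimal sketch, assuming a client-side helper (`checkDriverSessionArgs`, a hypothetical name) rather than anything the server itself exposes.

```typescript
type SessionAction = "start" | "stop" | "status";

// Mirrors the driver_session parameter table.
interface DriverSessionArgs {
  action: SessionAction;
  host?: string;
  port?: number;
  appIdentifier?: string | number;
}

// Returns a list of problems; an empty list means the arguments fit the table.
function checkDriverSessionArgs(args: DriverSessionArgs): string[] {
  const problems: string[] = [];
  const port = args.port;
  if (port !== undefined && (Math.floor(port) !== port || port < 1 || port > 65535)) {
    problems.push("port must be an integer in 1-65535, got " + port);
  }
  if (args.host !== undefined && args.host.trim() === "") {
    problems.push("host must not be empty");
  }
  return problems;
}

// host and port are omitted, so the documented defaults (127.0.0.1, 9223) apply.
const startLocal: DriverSessionArgs = { action: "start" };
```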
list_devices
Lists connected Android/iOS devices and emulators.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| (none) | — | — | — | No parameters |
UI AUTOMATION & WEBVIEW (10 tools)
webview_screenshot
Captures a screenshot of the current WebView state.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| maxWidth | number | ❌ | — | Positive integer | Resize to max width (preserves aspect ratio) |
| filePath | string | ❌ | — | Valid file path | Save screenshot to disk |
| format | string | ❌ | "png" | "png" \| "jpeg" | Image format |
| quality | number | ❌ | — | 1–100 | JPEG quality (only for jpeg format) |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
Annotations: readOnlyHint: true, destructiveHint: false, openWorldHint: false
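The interplay between format and quality is easy to get wrong, so here is a small sketch of a pre-flight check. `checkScreenshotArgs` is a hypothetical client-side lint; whether the real tool rejects or silently ignores a quality value for PNG is not stated in this document.

```typescript
// Mirrors the webview_screenshot parameter table.
interface ScreenshotArgs {
  maxWidth?: number;
  filePath?: string;
  format?: "png" | "jpeg";
  quality?: number;
  windowId?: string;
  appIdentifier?: string | number;
}

// Returns a list of problems; an empty list means the arguments fit the table.
function checkScreenshotArgs(args: ScreenshotArgs): string[] {
  const problems: string[] = [];
  const format = args.format === undefined ? "png" : args.format; // documented default
  if (args.quality !== undefined) {
    if (format !== "jpeg") problems.push("quality only applies to the jpeg format");
    if (args.quality < 1 || args.quality > 100) problems.push("quality must be in 1-100");
  }
  if (args.maxWidth !== undefined && (args.maxWidth <= 0 || Math.floor(args.maxWidth) !== args.maxWidth)) {
    problems.push("maxWidth must be a positive integer");
  }
  return problems;
}

const smallJpeg: ScreenshotArgs = { format: "jpeg", quality: 80, maxWidth: 1024 };
```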
webview_dom_snapshot
Returns YAML accessibility tree or DOM structure snapshot.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| type | string | ❌ | "accessibility" | "accessibility" \| "structure" | Snapshot format |
| selector | string | ❌ | — | Depends on strategy | Root element to snapshot from |
| strategy | string | ❌ | "css" | "css" \| "xpath" \| "text" \| "aria-label" \| "ref" | Selector interpretation |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
Accessibility output includes: roles, names, states, and ref IDs usable with strategy: "ref" in other tools.
webview_get_styles
Gets computed CSS styles for a specific element.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| selector | string | ✅ | — | Depends on strategy | Target element |
| properties | string[] | ❌ | all | CSS property names | Specific properties to return |
| strategy | string | ❌ | "css" | "css" \| "xpath" \| "text" \| "aria-label" \| "ref" | Selector interpretation |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
webview_find_element
Finds elements matching a selector.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| selector | string | ✅ | — | Depends on strategy | Elements to find |
| strategy | string | ❌ | "css" | "css" \| "xpath" \| "text" \| "aria-label" \| "ref" | Selector interpretation |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
webview_interact
Performs UI interactions — click, type, scroll, hover, clear, select, focus.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| selector | string | ✅ | — | Depends on strategy | Target element |
| action | string | ✅ | — | "click" \| "type" \| "scroll" \| "hover" \| "clear" \| "select" \| "focus" | Interaction type |
| value | string | ❌ | — | — | Text for "type" action, option value for "select" action |
| scrollDirection | string | ❌ | — | "up" \| "down" | Direction for "scroll" action |
| scrollAmount | number | ❌ | — | Positive integer | Pixels to scroll |
| strategy | string | ❌ | "css" | "css" \| "xpath" \| "text" \| "aria-label" \| "ref" | Selector interpretation |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
Annotations: readOnlyHint: false, destructiveHint: false, openWorldHint: false
webview_keyboard
Sends keyboard input — individual key presses with modifiers.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| action | string | ✅ | — | "press" \| "down" \| "up" \| "type" | Key event type |
| key | string | ✅ | — | Key name (e.g. "Enter", "a", "Tab", "Escape") | Key to send |
| modifiers | string[] | ❌ | [] | "ctrl" \| "shift" \| "alt" \| "meta" | Modifier keys held during action |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
webview_wait_for
Waits for an element to reach a specified condition.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| selector | string | ✅ | — | Depends on strategy | Element to wait for |
| state | string | ❌ | "visible" | "visible" \| "hidden" \| "attached" \| "detached" | Condition to wait for |
| timeout | number | ❌ | — | Positive integer (ms) | Maximum wait time in milliseconds |
| strategy | string | ❌ | "css" | "css" \| "xpath" \| "text" \| "aria-label" \| "ref" | Selector interpretation |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
webview_execute_js
Executes arbitrary JavaScript in the WebView context.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| script | string | ✅ | — | Valid JavaScript | JS code to execute |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
Script must return a value. Wrap in IIFE if needed: (function() { ...; return result; })()
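The IIFE wrapping described above can be automated with a one-line helper. This is only a sketch; `wrapInIife` is a hypothetical name and the wrapped body is an arbitrary example.

```typescript
// Wrap a sequence of statements in an IIFE so the script evaluates to a value.
function wrapInIife(body: string): string {
  return "(function() { " + body + " })()";
}

// Multi-statement script that ends with an explicit return.
const script = wrapInIife("const t = document.title; return t.length;");
// script is now: (function() { const t = document.title; return t.length; })()
```

The resulting string is what would be passed as the script parameter of webview_execute_js.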
webview_select_element
Visual element picker — shows an overlay, user clicks to select an element.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| timeout | number | ❌ | 60000 | 5000–120000 ms | How long the picker stays active |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
Returns: Rich metadata — tag, id, classes, attributes, text content, bounding rect, CSS selector, XPath, computed styles, parent chain, element screenshot.
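To keep a requested picker timeout inside the documented 5000–120000 ms range, a caller could clamp it first. The sketch below assumes a client-side helper (`pickerTimeout`, a hypothetical name); the document does not say whether the tool itself clamps or rejects out-of-range values.

```typescript
const PICKER_TIMEOUT_DEFAULT_MS = 60000;
const PICKER_TIMEOUT_MIN_MS = 5000;
const PICKER_TIMEOUT_MAX_MS = 120000;

// Use the documented default when nothing is requested, otherwise clamp into range.
function pickerTimeout(requestedMs?: number): number {
  if (requestedMs === undefined) return PICKER_TIMEOUT_DEFAULT_MS;
  return Math.min(PICKER_TIMEOUT_MAX_MS, Math.max(PICKER_TIMEOUT_MIN_MS, requestedMs));
}

const selectArgs = { timeout: pickerTimeout(1000) }; // raised to the 5000 ms minimum
```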
webview_get_pointed_element
Gets element metadata from a previous Alt+Shift+Click by the user.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| windowId | string | ❌ | — | Target specific window |
| appIdentifier | string \| number | ❌ | — | Target specific app |
Returns: Same metadata structure as webview_select_element.
WINDOW MANAGEMENT (1 tool)
manage_window
Lists windows, gets window info, or resizes windows.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| action | string | ✅ | — | "list" \| "info" \| "resize" | Window operation |
| windowId | string | ❌ | — | Window label | Target window (required for "info"/"resize") |
| width | number | ❌ | — | Positive integer | New width for "resize" |
| height | number | ❌ | — | Positive integer | New height for "resize" |
| logical | boolean | ❌ | true | — | Use logical (true) vs physical (false) pixels |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
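The logical flag matters on high-DPI displays, where one logical pixel covers several physical pixels. The sketch below assumes the usual relationship (physical = logical × scale factor); the scale factor of 2 is an illustrative value and `toPhysicalPixels` is a hypothetical helper.

```typescript
// Convert a logical size to physical pixels for a given display scale factor.
function toPhysicalPixels(logical: number, scaleFactor: number): number {
  return Math.round(logical * scaleFactor);
}

// Two resize requests that should describe the same window size on a 2x display.
const logicalResize = { action: "resize", windowId: "main", width: 800, height: 600, logical: true };
const physicalResize = {
  action: "resize",
  windowId: "main",
  width: toPhysicalPixels(800, 2),
  height: toPhysicalPixels(600, 2),
  logical: false,
};
```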
IPC & PLUGIN (5 tools)
ipc_execute_command
Invokes any registered Tauri command as if called from the frontend.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| command | string | ✅ | — | Tauri command name |
| args | unknown | ❌ | — | Command arguments as JSON |
| appIdentifier | string \| number | ❌ | — | Target specific app |
Annotations: readOnlyHint: false, destructiveHint: false, openWorldHint: false
ipc_monitor
Starts or stops IPC call interception.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | ✅ | — | "start" \| "stop" |
| appIdentifier | string \| number | ❌ | — | Target specific app |
ipc_get_captured
Returns captured IPC events since monitoring started.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| filter | string | ❌ | — | Filter captured events by command name substring |
| appIdentifier | string \| number | ❌ | — | Target specific app |
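The filter parameter is described as a substring match on the command name. A minimal model of that behaviour is sketched below; the `CapturedIpcEvent` shape is an assumption for illustration (the real event format is not given here), and case-sensitive matching is also assumed.

```typescript
// Hypothetical shape of a captured IPC event.
interface CapturedIpcEvent {
  command: string;
  args?: unknown;
}

// Keep only events whose command name contains the filter substring.
function filterByCommand(events: CapturedIpcEvent[], filter?: string): CapturedIpcEvent[] {
  if (filter === undefined || filter === "") return events;
  return events.filter((e) => e.command.indexOf(filter) !== -1);
}

const captured: CapturedIpcEvent[] = [
  { command: "get_user", args: { id: 42 } },
  { command: "save_settings" },
  { command: "get_settings" },
];
const settingsOnly = filterByCommand(captured, "settings");
```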
ipc_emit_event
Emits a Tauri event to the application.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| eventName | string | ✅ | — | Event name to emit |
| payload | unknown | ❌ | — | Event payload (any JSON-serializable value) |
| appIdentifier | string \| number | ❌ | — | Target specific app |
ipc_get_backend_state
Gets application metadata, version, environment, and registered plugins.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| appIdentifier | string \| number | ❌ | — | Target specific app |
LOG READING (1 tool)
read_logs
Reads logs from multiple sources: WebView console, Android logcat, iOS device logs, or system logs.
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| source | string | ✅ | — | "console" \| "android" \| "ios" \| "system" | Log source |
| lines | number | ❌ | 50 | Positive integer | Number of log lines to return |
| filter | string | ❌ | — | Substring match | Filter logs by content |
| since | string | ❌ | — | ISO 8601 timestamp | Only return logs after this time |
| windowId | string | ❌ | — | Window label | Target specific window |
| appIdentifier | string \| number | ❌ | — | Port or bundle ID | Target specific app |
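Because since expects an ISO 8601 timestamp, it is convenient to derive it from a relative offset. The following sketch uses the standard `Date.prototype.toISOString`; `logsSince` is a hypothetical helper name.

```typescript
type LogSource = "console" | "android" | "ios" | "system";

// Mirrors the subset of read_logs parameters used here.
interface ReadLogsArgs {
  source: LogSource;
  lines?: number;
  filter?: string;
  since?: string;
}

// Build read_logs arguments covering the last `minutesAgo` minutes before `now`.
function logsSince(source: LogSource, minutesAgo: number, now: Date = new Date()): ReadLogsArgs {
  const since = new Date(now.getTime() - minutesAgo * 60000).toISOString();
  return { source: source, since: since };
}

// A fixed reference time keeps the example deterministic.
const recentConsole = logsSince("console", 5, new Date("2024-01-01T00:10:00.000Z"));
```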
Source types:
- "console": WebView console.log/warn/error output
- "android": Android logcat (requires connected device/emulator)
- "ios": iOS device/simulator logs
- "system": System-level application logs
Installation Phase 1: AI Client Side
Automatic (recommended)
npx -y install-mcp @hypothesi/tauri-mcp-server --client claude-code
Replace claude-code with your client: cursor, windsurf, vscode, cline, roo-cline, claude, zed, goose, warp, codex.
Manual Configuration
Add to your MCP configuration file:
{
"mcpServers": {
"tauri": {
"command": "npx",
"args": ["-y", "@hypothesi/tauri-mcp-server"]
}
}
}
Config file locations by client:
| Client | Config File Location |
|---|---|
| Claude Code | ~/.claude.json or project .mcp.json |
| Cursor | .cursor/mcp.json in project root |
| VS Code | .vscode/mcp.json in project root |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) |
| Claude Desktop | %APPDATA%\Claude\claude_desktop_config.json (Windows) |
CRITICAL: After adding the MCP server config, you must fully quit and relaunch the AI client. A simple window reload is NOT enough — the MCP server process needs to be spawned fresh.
Installation Phase 2: Rust Plugin
Add the crate
cd src-tauri && cargo add tauri-plugin-mcp-bridge
Or manually in src-tauri/Cargo.toml:
[dependencies]
tauri-plugin-mcp-bridge = "0.9"
If the dependency already exists: Check the version. If it's older than 0.9, update it. If it's 0.9+, skip this step.
Register in lib.rs or main.rs
Standard pattern (lib.rs):
pub fn run() {
let mut builder = tauri::Builder::default();
#[cfg(debug_assertions)]
{
builder = builder.plugin(tauri_plugin_mcp_bridge::init());
}
builder
.run(tauri::generate_context!())
.expect("error while running tauri application");
}
Main.rs pattern (some projects use main.rs instead):
fn main() {
let mut builder = tauri::Builder::default();
#[cfg(debug_assertions)]
{
builder = builder.plugin(tauri_plugin_mcp_bridge::init());
}
builder
.run(tauri::generate_context!())
.expect("error while running tauri application");
}
Tauri v2 mobile entry point (projects with src-tauri/src/lib.rs using #[cfg_attr(mobile, ...)]):
#[cfg_attr(mobile, tauri::mobile_entry_point)]
pub fn run() {
let mut builder = tauri::Builder::default();
#[cfg(debug_assertions)]
{
builder = builder.plugin(tauri_plugin_mcp_bridge::init());
}
builder
.run(tauri::generate_context!())
.expect("error while running tauri application");
}
Edge case — builder is chained, not reassigned: If the project uses a chained builder pattern:
// BEFORE (chained):
tauri::Builder::default()
.plugin(some_other_plugin::init())
.run(tauri::generate_context!())
.expect("error");
// AFTER (must break the chain to insert conditional):
let mut builder = tauri::Builder::default()
.plugin(some_other_plugin::init());
#[cfg(debug_assertions)]
{
builder = builder.plugin(tauri_plugin_mcp_bridge::init());
}
builder
.run(tauri::generate_context!())
.expect("error");
Builder Pattern (custom configuration)
use tauri_plugin_mcp_bridge::Builder;
#[cfg(debug_assertions)]
{
let mcp_plugin = Builder::new()
.bind_address("127.0.0.1") // default: "0.0.0.0"
.base_port(9323) // default: 9223
.build();
builder = builder.plugin(mcp_plugin);
}
Feature flag alternative to cfg(debug_assertions)
For more control over when the MCP bridge is active:
# src-tauri/Cargo.toml
[features]
mcp-debug = ["tauri-plugin-mcp-bridge"]
[dependencies]
tauri-plugin-mcp-bridge = { version = "0.9", optional = true }
// lib.rs
#[cfg(feature = "mcp-debug")]
{
builder = builder.plugin(tauri_plugin_mcp_bridge::init());
}
Then run with: cargo tauri dev --features mcp-debug
Security: Default bind address is 0.0.0.0 (all interfaces) to support remote device testing. Use 127.0.0.1 for localhost-only access when not testing on mobile devices.
Installation Phase 3: tauri.conf.json
Add withGlobalTauri: true inside the "app" section:
{
"app": {
"withGlobalTauri": true,
"windows": [
{
"label": "main",
"title": "My App",
"width": 800,
"height": 600
}
]
}
}
What it does: Enables window.__TAURI__ injection into the WebView, which the MCP bridge scripts use to call Tauri APIs from injected JavaScript.
Location anti-patterns:
- ❌ NOT in the "build" section (that's for build commands)
- ❌ NOT in a "tauri" section (that's the Tauri v1 location; it does not exist in v2)
- ✅ ONLY in the "app" section at the top level
If withGlobalTauri is already true: Skip this step — no action needed.
Installation Phase 4: Capabilities
Find the active capabilities file in src-tauri/capabilities/. It's usually default.json but could be named differently (e.g., main.json, desktop.json). Look for the file containing "core:default" in its permissions array.
{
"$schema": "../gen/schemas/desktop-schema.json",
"identifier": "default",
"description": "Capability for the main window",
"windows": ["main"],
"permissions": [
"core:default",
"mcp-bridge:default"
]
}
If the capabilities file doesn't exist
Some projects use TOML capabilities or put permissions in tauri.conf.json, so src-tauri/capabilities/ may contain no JSON file. In that case, create src-tauri/capabilities/default.json:
{
"identifier": "default",
"description": "Default capabilities",
"windows": ["main"],
"permissions": [
"core:default",
"mcp-bridge:default"
]
}
Custom window labels
If the app's main window has a label other than "main" (check tauri.conf.json → app.windows[].label), update the "windows" array to match:
{
"windows": ["app-window"],
"permissions": ["core:default", "mcp-bridge:default"]
}
Granular Permissions (for security-conscious users)
Instead of "mcp-bridge:default", you can specify individual permissions:
"permissions": [
"core:default",
"mcp-bridge:allow-capture-native-screenshot",
"mcp-bridge:allow-execute-js",
"mcp-bridge:allow-get-window-info",
"mcp-bridge:allow-list-windows",
"mcp-bridge:allow-start-ipc-monitor",
"mcp-bridge:allow-stop-ipc-monitor",
"mcp-bridge:allow-get-ipc-events",
"mcp-bridge:allow-emit-event",
"mcp-bridge:allow-execute-command",
"mcp-bridge:allow-get-backend-state",
"mcp-bridge:allow-report-ipc-event",
"mcp-bridge:allow-request-script-injection",
"mcp-bridge:allow-script-result"
]
Multi-Window Scoping
To restrict MCP bridge access to specific windows:
{
"identifier": "mcp-debug",
"description": "MCP bridge for dev window only",
"windows": ["dev-panel"],
"permissions": ["mcp-bridge:default"]
}
Breaking change (v0.7.0): The permission identifier was renamed from "mcp-bridge:allow-all" to "mcp-bridge:default". If upgrading from v0.6.x, update your capabilities file.
Verification Sequence
After completing all 4 installation phases:
Step 1: Start the Tauri app
cargo tauri dev
Expected terminal output:
Compiling tauri-plugin-mcp-bridge v0.9.0
...
Finished `dev` profile [unoptimized + debuginfo] target(s)
MCP Bridge WebSocket server listening on 0.0.0.0:9223
If you see a different port (e.g., 9224), another process was using 9223. The MCP server will auto-detect the correct port.
Step 2: Verify tools are loaded in AI client
Ask the AI client: "What Tauri MCP tools do you have available?"
Expected output: 20 tools listed: get_setup_instructions, driver_session, list_devices, webview_screenshot, webview_dom_snapshot, webview_get_styles, webview_find_element, webview_interact, webview_keyboard, webview_wait_for, webview_execute_js, webview_select_element, webview_get_pointed_element, manage_window, ipc_execute_command, ipc_monitor, ipc_get_captured, ipc_emit_event, ipc_get_backend_state, read_logs.
Step 3: Use /setup slash command
Type /setup in the AI client. The setup tool will:
- Check if a Tauri app is running and reachable
- Verify the WebSocket connection
- Report any configuration issues
- Offer to fix common problems
Expected output on success:
✓ Connected to Tauri app on port 9223
✓ WebSocket connection established
✓ App identifier: com.example.myapp
Step 4: Test with webview_screenshot
Ask the AI: "Take a screenshot of my Tauri app"
- Success: Returns a base64-encoded image of the app's current WebView state
- Failure (blank/black): macOS Screen Recording permission issue (see Troubleshooting)
- Failure (connection error): The app isn't running or the port is blocked
Step 5: Test with webview_dom_snapshot
Ask the AI: "Show me the accessibility tree of my app"
- Success: Returns YAML with elements, roles, names, states, and ref IDs
- Failure: Usually withGlobalTauri not set, or capabilities missing
Agent Decision Tree
Use this flowchart to pick the right tool for any task:
What does the agent need to do?
│
├─► SEE the app's current state
│ ├─► Visual appearance? ──────────► webview_screenshot
│ ├─► Element structure? ──────────► webview_dom_snapshot (type: "structure")
│ ├─► Accessibility info? ─────────► webview_dom_snapshot (type: "accessibility")
│ ├─► Specific element's styles? ──► webview_get_styles
│ └─► Find a specific element? ────► webview_find_element
│
├─► INTERACT with the app
│ ├─► Click a button/link? ────────► webview_interact (action: "click")
│ ├─► Type into an input? ─────────► webview_interact (action: "type")
│ ├─► Scroll the page? ───────────► webview_interact (action: "scroll")
│ ├─► Hover over element? ─────────► webview_interact (action: "hover")
│ ├─► Clear an input field? ───────► webview_interact (action: "clear")
│ ├─► Select dropdown option? ─────► webview_interact (action: "select")
│ ├─► Focus an element? ──────────► webview_interact (action: "focus")
│ ├─► Press a key combo? ─────────► webview_keyboard
│ ├─► Wait for UI update? ────────► webview_wait_for
│ └─► Run custom JS? ─────────────► webview_execute_js
│
├─► IDENTIFY elements
│ ├─► User points at element? ─────► /select or webview_select_element
│ ├─► User already Alt+Shift+Clicked? ► webview_get_pointed_element
│ └─► Agent needs to find by selector? ► webview_find_element
│
├─► CALL backend / Rust commands
│ ├─► Invoke a Tauri command? ─────► ipc_execute_command
│ ├─► Emit an event? ─────────────► ipc_emit_event
│ └─► Get app metadata? ──────────► ipc_get_backend_state
│
├─► MONITOR what's happening
│ ├─► Start capturing IPC? ────────► ipc_monitor (action: "start")
│ ├─► Read captured IPC? ──────────► ipc_get_captured
│ ├─► Stop capturing? ────────────► ipc_monitor (action: "stop")
│ └─► Read logs? ─────────────────► read_logs
│
├─► MANAGE windows
│ ├─► List all windows? ──────────► manage_window (action: "list")
│ ├─► Get window details? ────────► manage_window (action: "info")
│ └─► Resize a window? ──────────► manage_window (action: "resize")
│
└─► SET UP the connection
├─► First time setup? ──────────► get_setup_instructions or /setup
├─► Connect to app? ────────────► driver_session (action: "start")
├─► Check connection? ──────────► driver_session (action: "status")
├─► List mobile devices? ───────► list_devices
└─► Diagnose WebView issues? ──► /fix-webview-errors
Common Agent Workflows
1. Initial App Exploration
driver_session({action: "start"})
→ webview_screenshot
→ webview_dom_snapshot({type: "accessibility"})
→ Now the agent knows what the app looks like AND its element structure
2. Click a Button and Verify Result
webview_find_element({selector: "#submit-btn"})
→ webview_interact({selector: "#submit-btn", action: "click"})
→ webview_wait_for({selector: ".success-message", state: "visible"})
→ webview_screenshot (confirm the result visually)
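The click-and-verify workflow above can also be written down as plain data, which makes it easy to review or replay. This is a sketch only: `ToolCall` and `clickAndVerify` are hypothetical names, and how the calls are actually dispatched depends on the AI client.

```typescript
// One step of a workflow: a tool name plus its arguments.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Build the four steps of the workflow for a given button and result element.
function clickAndVerify(buttonSelector: string, resultSelector: string): ToolCall[] {
  return [
    { tool: "webview_find_element", args: { selector: buttonSelector } },
    { tool: "webview_interact", args: { selector: buttonSelector, action: "click" } },
    { tool: "webview_wait_for", args: { selector: resultSelector, state: "visible" } },
    { tool: "webview_screenshot", args: {} },
  ];
}

const steps = clickAndVerify("#submit-btn", ".success-message");
```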
3. Fill Out a Form
webview_interact({selector: "#email", action: "clear"})
→ webview_interact({selector: "#email", action: "type", value: "test@example.com"})
→ webview_interact({selector: "#password", action: "type", value: "secret123"})
→ webview_interact({selector: "button[type='submit']", action: "click"})
→ webview_wait_for({selector: ".dashboard", state: "visible"})
4. Debug a UI Issue Using Accessibility Tree
webview_dom_snapshot({type: "accessibility"})
→ identify ref IDs from YAML output
→ webview_get_styles({selector: "ref-42", strategy: "ref"})
→ webview_find_element({selector: "ref-42", strategy: "ref"})
→ diagnose style/layout issues
5. Test a Rust Backend Command
ipc_get_backend_state
→ (learn available commands/plugins)
→ ipc_execute_command({command: "get_user", args: {id: 42}})
→ verify response matches expectations
6. Monitor IPC During a Workflow
ipc_monitor({action: "start"})
→ webview_interact({selector: "#save-btn", action: "click"})
→ webview_wait_for({selector: ".saved-indicator", state: "visible", timeout: 5000})
→ ipc_get_captured
→ ipc_monitor({action: "stop"})
→ analyze which commands were called and with what args
7. User Points at Element → Agent Fixes It
/select (user clicks an element in the app)
→ agent receives: tag, id, classes, attributes, CSS selector, XPath, computed styles
→ agent reads the source file containing that component
→ agent makes the fix
→ webview_screenshot (verify the fix)
8. Multi-Window Testing
manage_window({action: "list"})
→ manage_window({action: "info", windowId: "settings"})
→ webview_screenshot({windowId: "settings"})
→ webview_interact({selector: "#theme-toggle", action: "click", windowId: "settings"})
→ webview_screenshot({windowId: "main"}) (verify theme changed in main window too)
9. Mobile Log Debugging
list_devices
→ driver_session({action: "start", host: "192.168.1.100", port: 9223})
→ read_logs({source: "android", lines: 100, filter: "ERROR"})
→ diagnose the error
→ read_logs({source: "console", lines: 50}) (check WebView console too)
10. Responsive Design Testing
manage_window({action: "resize", windowId: "main", width: 375, height: 812})
→ webview_screenshot (mobile layout)
→ manage_window({action: "resize", windowId: "main", width: 1920, height: 1080})
→ webview_screenshot (desktop layout)
→ compare layouts, identify responsive issues
11. Event-Driven Feature Testing
ipc_monitor({action: "start"})
→ ipc_emit_event({eventName: "user:logout", payload: {reason: "session_expired"}})
→ webview_wait_for({selector: ".login-screen", state: "visible", timeout: 3000})
→ webview_screenshot
→ ipc_get_captured (verify the app handled the event correctly)
→ ipc_monitor({action: "stop"})
12. Custom JavaScript Diagnostic
webview_execute_js({script: "JSON.stringify(window.__TAURI__)"})
→ (verify Tauri APIs are available)
webview_execute_js({script: "document.querySelectorAll('[data-testid]').length"})
→ (count test IDs for element targeting)
webview_execute_js({script: "(function(){ return {url: location.href, title: document.title, readyState: document.readyState} })()"})
→ (get page state info)
When NOT to Use This Skill
This skill is specifically for interacting with a running Tauri v2 app via MCP. Do NOT use it when:
| Scenario | Use Instead |
|---|---|
| Editing Tauri source code (Rust, TS, HTML, CSS) | Standard file editing tools |
| Building the Tauri app (cargo tauri build) | Shell commands |
| Reading Tauri config files statically | File reading tools |
| Tauri v1 apps | Not supported — v1 has different plugin architecture |
| Electron, React Native, or Flutter apps | Different MCP bridges exist for those |
| The app is not currently running | Start it first with cargo tauri dev |
| Production/release builds | MCP bridge should ONLY be in debug builds |
| Web-only apps (no Tauri) | Browser DevTools MCP or similar |
| Editing tauri.conf.json structure | File editing tools (but use this skill's knowledge for correct structure) |
| Managing Tauri CLI (cargo tauri subcommands) | Shell commands directly |
Platform-Specific Gotchas
macOS Screen Recording Permission
webview_screenshot uses native screenshot APIs on macOS that require Screen Recording permission.
Which app needs the permission: The terminal application running the AI client (Terminal.app, iTerm2, Warp, VS Code, etc.) — NOT the Tauri app itself.
How to grant:
- Open System Settings → Privacy & Security → Screen Recording
- Enable your terminal application
- Fully quit and relaunch the terminal — the permission change does NOT take effect until restart
- Restart the AI client inside the relaunched terminal
Symptom if missing: webview_screenshot returns a blank or black image with no error. The tool reports success but the image content is empty.
Android Testing
adb forward tcp:9223 tcp:9223
Keep bind_address as "0.0.0.0" (required for the device to reach the WebSocket server on the host).
For devices connected via USB or WiFi, the MCP server connects through the forwarded port.
Windows
No special setup required. Screenshots and all tools work out of the box.
Linux
Native screenshots are NOT yet implemented due to webkit2gtk/glib version conflicts. Screenshots fall back to JavaScript-based capture (html2canvas-pro), which works but may not capture elements outside the WebView viewport.
Multi-App Support
The MCP bridge supports connecting to multiple Tauri apps simultaneously.
- Port range: 9223–9322 (base port 9223, scans up to 100 ports)
- Default app: The most recently connected app is used when no appIdentifier is specified
- Targeting specific app: Pass the appIdentifier parameter (port number or bundle ID) to any tool
- Status: driver_session({action: "status"}) returns an array of all connected sessions
- Stop all: driver_session({action: "stop"}) without an identifier stops all sessions
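The port range described above (base 9223, up to 100 ports) can be modelled as a simple scan. The sketch below is a pure model that takes the busy ports as input; the real plugin binds sockets rather than consulting a list, and `firstFreePort` is a hypothetical name.

```typescript
// Return the first port in [basePort, basePort + maxPorts) that is not busy.
function firstFreePort(busy: number[], basePort: number = 9223, maxPorts: number = 100): number | undefined {
  for (let port = basePort; port < basePort + maxPorts; port++) {
    if (busy.indexOf(port) === -1) return port;
  }
  return undefined; // every port in the range is taken
}

// With 9223 already used by another app, the next app lands on 9224.
const secondAppPort = firstFreePort([9223]);
```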
For remote devices, configure the host:
driver_session({action: "start", host: "192.168.1.100", port: 9223})
Breaking Change Registry
| Version | Change | Migration Required |
|---|---|---|
| v0.7.0 | Permission "mcp-bridge:allow-all" → "mcp-bridge:default" | Update capabilities/default.json |
| v0.8.0 | tauri_ prefix dropped from all tool names (e.g., tauri_driver_session → driver_session) | Update any hardcoded tool name references |
| v0.8.1 | html2canvas replaced with html2canvas-pro | No action: fixes oklch() CSS screenshot failures |
| v0.8.2 | Window timing race condition fix (exponential backoff) | No action: automatic retry on connection |
| v0.9.0 | Added webview_select_element and webview_get_pointed_element | No action: new tools only |
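The two migrations that need manual work (v0.7.0 and v0.8.0) are both simple string rewrites. A minimal sketch, assuming hypothetical helper names `migratePermission` and `migrateToolName`:

```typescript
// v0.7.0: the permission identifier was renamed.
function migratePermission(permission: string): string {
  return permission === "mcp-bridge:allow-all" ? "mcp-bridge:default" : permission;
}

// v0.8.0: the tauri_ prefix was dropped from tool names.
// Note: this strips the prefix from any string that starts with it, so apply it only to tool names.
function migrateToolName(name: string): string {
  const prefix = "tauri_";
  return name.indexOf(prefix) === 0 ? name.slice(prefix.length) : name;
}

const permissions = ["core:default", "mcp-bridge:allow-all"].map(migratePermission);
const toolName = migrateToolName("tauri_driver_session");
```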
Security Considerations
- #[cfg(debug_assertions)] guard: NEVER remove. The MCP bridge opens a WebSocket server that allows full app control. Shipping this in a release build exposes IPC execution, JS injection, and screenshot capabilities to anyone who can reach the port.
- 0.0.0.0 binding exposes the server to the LAN. The default bind address allows connections from any network interface. Use Builder::new().bind_address("127.0.0.1").build() when not testing on mobile devices.
- mcp-bridge:default in dev capabilities ONLY. Create a separate capabilities file for development (dev.json) or use conditional capabilities. Never ship the mcp-bridge:default permission in production capabilities.
- ipc_execute_command can call ANY registered Tauri command. This tool bypasses all frontend validation and calls Rust commands directly with arbitrary arguments. Only use on trusted development machines.
- webview_execute_js runs arbitrary JavaScript. Any code passed to this tool executes in the WebView context with full access to the DOM and window.__TAURI__. This is equivalent to an XSS vulnerability if exposed in production.
- ipc_emit_event can trigger any app event. Events emitted through this tool are indistinguishable from real app events. In development this enables testing; in production it could trigger unintended state changes.
- webview_screenshot captures the entire WebView. If the app displays sensitive user data, screenshots will contain that data. Be mindful when screenshots are sent to AI services.
- Port range is predictable (9223–9322). Any process on the machine (or LAN, if bound to 0.0.0.0) can connect to the WebSocket server. There is no authentication on the WebSocket connection.
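To make the bind-address point concrete, the following standard-library sketch classifies a candidate address by how widely it would expose an unauthenticated server. `Exposure` and `classify_bind_address` are hypothetical names for illustration and do not exist in the plugin.

```rust
use std::net::IpAddr;

/// How far a listening socket on a given address can be reached.
#[derive(Debug, PartialEq)]
enum Exposure {
    /// Reachable only from the same machine (e.g. 127.0.0.1).
    LoopbackOnly,
    /// Reachable through every network interface (e.g. 0.0.0.0).
    AllInterfaces,
    /// Reachable through one specific interface address.
    SingleInterface,
}

/// Hypothetical helper: parse an IP literal and classify its exposure.
/// Returns `None` for anything that is not an IP literal (hostnames included).
fn classify_bind_address(addr: &str) -> Option<Exposure> {
    let ip: IpAddr = addr.parse().ok()?;
    Some(if ip.is_loopback() {
        Exposure::LoopbackOnly
    } else if ip.is_unspecified() {
        Exposure::AllInterfaces
    } else {
        Exposure::SingleInterface
    })
}

fn main() {
    for addr in ["127.0.0.1", "0.0.0.0", "192.168.1.100"] {
        println!("{addr}: {:?}", classify_bind_address(addr));
    }
}
```

A check like this could run in a dev script before launching the app, warning whenever the configured address is anything other than loopback and no mobile device is being tested.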
Troubleshooting Matrix
| # | Symptom | Root Cause | Fix |
|---|---|---|---|
| 1 | "No tools available" in AI client | MCP server not configured | Run npx -y install-mcp @hypothesi/tauri-mcp-server --client <your-client> |
| 2 | Tools listed but "connection refused" | Tauri app not running | Start app with cargo tauri dev |
| 3 | "WebSocket connection failed" | Plugin not registered in lib.rs | Add builder = builder.plugin(tauri_plugin_mcp_bridge::init()) inside #[cfg(debug_assertions)] block |
| 4 | "window.__TAURI__ is undefined" | withGlobalTauri not enabled | Add "withGlobalTauri": true in "app" section of tauri.conf.json |
| 5 | Screenshot returns blank/black (macOS) | Missing Screen Recording permission | Grant to terminal app in System Settings → Privacy → Screen Recording, then fully restart terminal |
| 6 | Screenshot returns blank (Linux) | Native screenshot not implemented | Expected — falls back to html2canvas-pro (JS-based). May not capture off-viewport elements |
| 7 | "Permission denied" errors from IPC tools | Missing capabilities | Add "mcp-bridge:default" to capabilities/default.json permissions array |
| 8 | "mcp-bridge:allow-all" not recognized | Using outdated permission string (pre-v0.7.0) | Replace with "mcp-bridge:default" |
| 9 | Tool names with tauri_ prefix not found | Using outdated tool names (pre-v0.8.0) | Remove tauri_ prefix from tool names |
| 10 | Port 9223 already in use | Another Tauri app or process using the port | The plugin auto-scans ports 9223–9322; or use Builder::new().base_port(9323).build() |
| 11 | oklch() CSS causes screenshot failure | Old html2canvas version (pre-v0.8.1) | Update to v0.8.1+ which uses html2canvas-pro |
| 12 | Window commands fail intermittently | Race condition on app startup (pre-v0.8.2) | Update to v0.8.2+ which adds exponential backoff retry |
| 13 | Android device not reachable | Missing ADB port forward | Run adb forward tcp:9223 tcp:9223 |
| 14 | "Module not found" when MCP server starts | Node.js < 20 | Upgrade to Node.js 20 or later |
| 15 | AI client shows stale tools after update | MCP server process cached | Fully quit and relaunch the AI client (not just window reload) |
| 16 | Multiple apps — wrong app targeted | Default app behavior uses most recent | Pass appIdentifier parameter to specify which app |
| 17 | webview_select_element times out | User didn't click within timeout | Increase timeout parameter (max: 120000ms) or retry with /select |
| 18 | "ref-XX" selector not found | Stale ref IDs from old snapshot | Re-run webview_dom_snapshot to get fresh ref IDs, then retry |
| 19 | ipc_execute_command returns "command not found" | Command name mismatch or not registered | Use ipc_get_backend_state to list registered commands; check for typos |
| 20 | webview_execute_js returns undefined | Script didn't return a value | Wrap in IIFE: (function(){ ...; return result; })() |
| 21 | Connection drops after app hot-reload | WebSocket reconnection needed | Call driver_session({action: "start"}) again to reconnect |
| 22 | read_logs returns empty for Android/iOS | Device not connected or wrong source | Run list_devices to verify; use source: "console" for WebView logs |
| 23 | Monorepo: cargo add fails | Not in correct directory | cd into the src-tauri/ directory before running cargo add |
| 24 | Capabilities file not found | Non-standard project structure | Create src-tauri/capabilities/default.json manually with correct schema |
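For rows 2, 3, and 10, it can help to check whether anything is listening in the plugin's port range before digging into configuration. The sketch below is a hypothetical standalone diagnostic that is not shipped with either package. It only tests TCP reachability on localhost, so an open port shows that some process is listening, not necessarily the MCP bridge.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical diagnostic: return the ports from `ports` that accept a
/// TCP connection on 127.0.0.1 within `timeout` (which must be non-zero).
fn listening_ports(ports: impl IntoIterator<Item = u16>, timeout: Duration) -> Vec<u16> {
    ports
        .into_iter()
        .filter(|&port| {
            let addr = SocketAddr::from((Ipv4Addr::LOCALHOST, port));
            TcpStream::connect_timeout(&addr, timeout).is_ok()
        })
        .collect()
}

fn main() {
    // Port range taken from the troubleshooting table (9223-9322).
    let open = listening_ports(9223..=9322, Duration::from_millis(200));
    if open.is_empty() {
        println!("nothing is listening on 9223-9322; is the app running?");
    } else {
        println!("ports accepting connections: {open:?}");
    }
}
```

If the list is empty while the app is running, the plugin registration (row 3) is the likelier culprit. If several ports respond, passing appIdentifier (row 16) selects the intended app.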