# AWT — Eyes and Hands for Your AI Coding Tool
AWT (AI Watch Tester) gives your AI coding tool the ability to see and interact with web applications. Your AI designs the test strategy; AWT executes it with a real browser — clicking, typing, taking screenshots, and reporting back.
## When to Use This Skill
- User wants to test a web application (E2E, QA, regression)
- User says "test it", "check if it works", "verify the login"
- User needs to detect bugs after code changes
- User wants automated regression testing
- E2E test failed and user wants to find the root cause in source code
## When NOT to Use
- Unit testing or API-only testing (AWT is for UI/E2E)
- Performance/load testing (use k6, Artillery)
- Mobile native app testing (web and desktop only)
## ⛔ CRITICAL RULES — READ BEFORE DOING ANYTHING
- NEVER use `-y` or `--auto-approve` — the user MUST approve before any test runs.
- NEVER use `aat devqa` — it runs the entire pipeline without user checkpoints. Always use the 4-step workflow below.
- NEVER run a test without showing the scenario to the user first and getting explicit approval (e.g. "진행해", "go ahead", "yes", "run it").
- NEVER auto-fix code or scenarios without user permission — always report failures and ask the user what to do.
- NEVER set `headless: true` — users MUST see the browser.
- NEVER guess element names — always scan first and use real data from `scan_result.json`.
## ★ MANDATORY 4-STEP WORKFLOW ★
When the user asks to test anything, you MUST follow these 4 steps in order. Do NOT skip any step. Do NOT combine steps.
### STEP 1: SCAN — Analyze the target site

```shell
aat scan --url <URL>
```
After scanning, read `.aat/scan_result.json` and present a summary to the user:
"I scanned http://localhost:3000 and found 83 interactive elements:
- 5 input fields (email, password, search, ...)
- 12 buttons (Login, Sign Up, Submit, ...)
- 15 links (Home, About, Dashboard, ...)
Should I create a test scenario based on these elements?"
⏸️ WAIT for user response before proceeding to Step 2.
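The element counts in the summary above come from `scan_result.json`. As a minimal sketch of deriving such counts, assuming a flat `{"elements": [...]}` shape, which is a guess at AWT's actual output format and should be checked against the real file:

```shell
# Hypothetical: count element types in a scan result.
# The file contents below are fabricated for illustration; the real
# scan_result.json may use different field names.
mkdir -p /tmp/awt_demo/.aat
cat > /tmp/awt_demo/.aat/scan_result.json <<'EOF'
{"elements": [
  {"type": "input",  "selector": "#email",    "text": "Email"},
  {"type": "input",  "selector": "#password", "text": "Password"},
  {"type": "button", "selector": null,        "text": "Login"}
]}
EOF
# Tally element types: 2 inputs, 1 button in this fabricated file
grep -o '"type": "[a-z]*"' /tmp/awt_demo/.aat/scan_result.json | sort | uniq -c
```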
### STEP 2: GENERATE + PRESENT — Create scenario and show it

Write a YAML scenario using the exact data from `scan_result.json`. Then show the full scenario to the user:
"Here is the test scenario I created (5 steps):
1. 🌐 Navigate to http://localhost:3000
2. 🖱 Type 'test@example.com' into email field
3. 🖱 Type 'password123' into password field
4. 🖱 Click 'Login' button (critical)
5. ✅ Verify URL contains '/dashboard'
Should I run this test? Or would you like me to modify anything?"
⏸️ WAIT for user approval before proceeding to Step 3.
- If user says "modify X" → edit the scenario and show it again
- If user says "go ahead" / "진행해" / "yes" → proceed to Step 3
- If user says "cancel" → stop entirely
### STEP 3: EXECUTE — Run the approved scenario

```shell
aat run --skill-mode --fast scenarios/<scenario>.yaml
```
Monitor the output. If any step fails:
- STOP immediately — do not continue to the next scenario
- Read the failure details from the AWT output
- Read the SCREENSHOT if provided in the `=== AWT SKILL DEVQA ===` block
- Report to the user:
"Test failed at Step 3 (Click Login button):
- Error: Element 'Login' not found in DOM
- Possible cause: The button text might be 'Sign In' instead of 'Login'
- Screenshot: .aat/screenshots/step_003_fail.png
Should I:
(a) Fix the scenario (change 'Login' to 'Sign In') and re-test?
(b) Fix the source code instead?
(c) Skip this step and continue?"
⏸️ WAIT for user instruction before fixing or re-running.
### STEP 4: REPORT — Summarize results
When all steps pass, report to the user:
"✅ Test complete: 5/5 steps passed (37.8 seconds)
All navigation links and login flow are working correctly."
## Auto-Setup

```shell
which aat || (pip install aat-devqa && playwright install chromium)
```
## Scenario YAML Reference

### Writing Scenarios from scan_result.json

Read `.aat/scan_result.json` and write YAML using the exact element data:
```yaml
id: "SC-001"
name: "Login Flow"
steps:
  - step: 1
    action: navigate
    value: "http://localhost:3000/login"
    description: "Go to login page"

  # Use EXACT coordinates/selectors from scan_result.json
  - step: 2
    action: find_and_type
    target:
      selector: "#email"        # from scan_result.json
      text: "Email"
    value: "test@example.com"
    description: "Enter email"

  - step: 3
    action: find_and_click
    target:
      text: "Login"             # from scan_result.json
      region: main              # exclude nav panel
    critical: true              # stop if login fails
    description: "Click login button"

  - step: 4
    action: assert_url
    value: "/dashboard"
    on_fail: stop
    message: "Login failed — not redirected to dashboard"
    description: "Verify redirect to dashboard"
```
Rules:
- Use `selector` from scan data when available (most reliable)
- Use `text` as fallback (OCR-based)
- Use `region: main` for all clicks (avoids nav-panel false positives)
- Mark login/auth steps as `critical: true`
- Add `assert_url` after navigation-triggering clicks
### Actions (21 types)
| Category | Actions |
|---|---|
| Navigation | `navigate`, `go_back`, `refresh` |
| Find + Mouse | `find_and_click`, `find_and_double_click`, `find_and_right_click` |
| Find + Keyboard | `find_and_type`, `find_and_clear` |
| Direct | `click_at`, `type_text` (supports `verify: true`), `press_key`, `key_combo` |
| Assert | `assert`, `assert_text`, `assert_screen_changed`, `assert_url` |
| Session | `save_session`, `load_session` |
| Utility | `wait`, `screenshot`, `scroll` |
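A hedged sketch of the Direct actions in scenario form — the exact value formats (coordinate pairs, key names such as "Enter", wait units) are assumptions about AWT's conventions and should be checked against `references/scenario-schema.md`:

```yaml
- action: type_text
  value: "hello world"
  verify: true               # re-check the field after typing
  description: "Type with verification"

- action: press_key
  value: "Enter"             # key name format is an assumption
  description: "Submit with Enter"

- action: wait
  value: 2                   # assumed to be seconds
  description: "Give the page time to settle"
```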
### Key Step Options

```yaml
- action: find_and_click
  target:
    text: "Submit"            # OCR text
    selector: "#submit-btn"   # CSS selector (highest priority)
    region: main              # top/bottom/left/right/center/main/full
  critical: true              # stop test on failure
  on_fail: stop               # same as critical
  method: auto                # auto/semantics/template/ocr/vision
  match_index: 0              # 0 = first match, -1 = last
  change_threshold: 0.05      # for critical auto-verification
  description: "Click submit"
```
### `assert_url` (login/navigation verification)

```yaml
- action: assert_url
  value: "/dashboard"
  on_fail: stop
  message: "Login failed"
  description: "Verify redirect"
```
### Session Reuse

```yaml
# Save after login
- action: save_session
  name: "my_app_login"

# Load in next run (24h expiry)
- action: load_session
  name: "my_app_login"
```
### Region Parameter

| Region | Area | Use When |
|---|---|---|
| `full` | Entire screen (default) | General use |
| `main` | Right 80% | Always use for clicks (avoids nav panel) |
| `top` | Top 30% | Header elements |
| `bottom` | Bottom 30% | Footer, floating buttons |
| `center` | Central 60%×60% | Modal dialogs |
## Flutter CanvasKit Support

AWT automatically detects Flutter CanvasKit and activates Semantics:
- After `navigate`, clicks `flt-semantics-placeholder` (3 retries, 3s each)
- Reads `flt-semantics[aria-label]` for element coordinates
- Falls back to OCR if Semantics unavailable
Matching priority on Flutter: CSS selector → Flutter Semantics → Playwright text → OCR → Vision AI
Flutter-specific rules:
- Always use `region: main` (Canvas OCR picks up nav text)
- Use `verify: true` on `type_text` (Canvas input may not render)
- Add `assert_screen_changed` after clicks
- Use `method: semantics` to force Semantics lookup
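Combining these rules, a Flutter step might look like the following sketch (the target label and step layout are illustrative, not taken from a real scan):

```yaml
- action: find_and_click
  target:
    text: "Checkout"         # illustrative label; use real scan data
    region: main             # Canvas OCR picks up nav text otherwise
  method: semantics          # force Flutter Semantics lookup
  critical: true
  description: "Click checkout via Flutter Semantics"

- action: assert_screen_changed
  description: "Confirm the click actually changed the screen"
```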
## Source Code Root Cause Analysis
When a test fails, trace to the source code:
1. Read the test output — which step, what error, what URL
2. Search the codebase — `grep` for the expected text, URL route, component
3. Read the component — find why it doesn't render/redirect/respond
4. Propose a fix to the user — show a concrete diff, ask for approval
5. After user approves — apply the fix, re-build, re-scan, re-test
Key principle: NEVER auto-apply fixes. Always show the diff and ask.
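The codebase-search step can be sketched as follows; the project layout, file name, and component contents are fabricated for illustration, mirroring the "Login" vs "Sign In" failure example from Step 3:

```shell
# Hypothetical root-cause search for a failed "Click 'Login'" step.
# Set up a fake source tree whose button renders "Sign In", not "Login".
mkdir -p /tmp/rca_demo/src/components
cat > /tmp/rca_demo/src/components/LoginButton.tsx <<'EOF'
export const LoginButton = () => <button>Sign In</button>;
EOF
# The scenario expected "Login"; grep reveals the actual rendered text:
grep -rn "Sign In" /tmp/rca_demo/src
```

Here the grep hit shows the component renders "Sign In", so the fix to propose is either renaming the button or updating the scenario's `text:` target.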
## CLI Commands

| Command | Description |
|---|---|
| `aat scan --url URL` | Scan page, collect elements to `scan_result.json` |
| `aat run --skill-mode PATH` | Execute with structured output for AI |
| `aat run --skill-mode --fast PATH` | Execute in fast DOM-only mode (Next.js, React, etc.) |
| `aat run --debug PATH` | Execute with OCR candidate debug logs |
| `aat doctor` | Check environment |
| `aat setup` | Configure AI + Vision providers |
| `aat validate PATH` | Validate YAML scenarios |
| `aat cost` | View AI API costs |
### Key Flags

```
--skill-mode   Structured output for AI assistants
--fast         DOM-only matching (skip Vision/OCR — fastest for standard web apps)
--debug        Show OCR candidates and matcher details
--strict       Treat skipped steps as failures
--learn        Record healed steps for pattern learning
--slow-mo N    Slow down actions by N ms
```
### ⛔ Banned Flags (NEVER use these)

```
-y / --auto-approve   Bypasses user approval — NEVER USE
```
## Best Practices

- Always scan before writing scenarios — never guess element names
- Use `region: main` on all `find_and_click` — prevents nav-panel false positives
- Mark auth steps `critical: true` — no point testing after login fails
- Add `assert_url` after form submits — verify navigation happened
- Use `save_session` / `load_session` — skip login on repeated runs
- Read screenshots on failure — the SCREENSHOT path is a real PNG file
- One fix per retry — change only the failing step
- Always ask user before fixing — never auto-modify code or scenarios
## AI Providers (for Vision AI Tier 3)

| Provider | Vision | Cost | Setup |
|---|---|---|---|
| Gemini Flash | Yes | Free | `aat setup` → Gemini |
| Claude | Yes | Medium | `aat setup` → Claude |
| GPT-4o | Yes | Higher | `aat setup` → OpenAI |
Vision AI is optional — Tier 1 (template) + Tier 2 (OCR) are free and work without API keys.
## Reference Files

- `references/scenario-schema.md` — Full YAML schema
- `references/cli-reference.md` — Complete CLI reference
- `references/config-reference.md` — All configuration options
- `templates/scenario-template.yaml` — Blank scenario template
- `templates/config-template.yaml` — Default config template