AWT — Eyes and Hands for Your AI Coding Tool

AWT (AI Watch Tester) gives your AI coding tool the ability to see and interact with web applications. Your AI designs the test strategy; AWT executes it with a real browser — clicking, typing, taking screenshots, and reporting back.

When to Use This Skill

User wants to test a web application (E2E, QA, regression)
User says "test it", "check if it works", "verify the login"
User needs to detect bugs after code changes
User wants automated regression testing
E2E test failed and user wants to find the root cause in source code

When NOT to Use

Unit testing or API-only testing (AWT is for UI/E2E)
Performance/load testing (use k6, Artillery)
Mobile native app testing (AWT covers web and desktop only)

⛔ CRITICAL RULES — READ BEFORE DOING ANYTHING

NEVER use -y or --auto-approve — the user MUST approve before any test runs.
NEVER use aat devqa — it runs the entire pipeline without user checkpoints. Always use the 4-step workflow below.
NEVER run a test without showing the scenario to the user first and getting explicit approval (e.g. "go ahead", "yes", "run it", or "진행해", which is Korean for "go ahead").
NEVER auto-fix code or scenarios without user permission — always report failures and ask the user what to do.
NEVER set headless: true — users MUST see the browser.
NEVER guess element names — always scan first and use real data from scan_result.json.

★ MANDATORY 4-STEP WORKFLOW ★

When the user asks to test anything, you MUST follow these 4 steps in order. Do NOT skip any step. Do NOT combine steps.

STEP 1: SCAN — Analyze the target site

aat scan --url <URL>

After scanning, read .aat/scan_result.json and present a summary to the user:

"I scanned http://localhost:3000 and found 83 interactive elements:
 - 5 input fields (email, password, search, ...)
 - 12 buttons (Login, Sign Up, Submit, ...)
 - 15 links (Home, About, Dashboard, ...)
 
 Should I create a test scenario based on these elements?"

⏸️ WAIT for user response before proceeding to Step 2.
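
The summary in Step 1 could be produced with something like the sketch below. The layout of scan_result.json is not documented here, so the "elements" list and the "type" / "text" keys are assumptions made for illustration; adjust them to the real file before relying on this.

```python
import json
from collections import Counter


def summarize_scan(scan: dict) -> str:
    """Build a short per-type summary of scanned elements."""
    # ASSUMPTION: elements live under an "elements" key, each with "type" and "text".
    elements = scan.get("elements", [])
    counts = Counter(el.get("type", "unknown") for el in elements)
    lines = [f"Found {len(elements)} interactive elements:"]
    for kind, n in counts.most_common():
        # Show up to three labels per type as examples for the user.
        labels = [el.get("text", "") for el in elements if el.get("type") == kind][:3]
        lines.append(f" - {n} {kind}(s) ({', '.join(labels)})")
    return "\n".join(lines)


# Inline sample standing in for the contents of .aat/scan_result.json.
sample = json.loads(
    '{"elements": ['
    '{"type": "input", "text": "email"},'
    '{"type": "button", "text": "Login"},'
    '{"type": "input", "text": "password"}]}'
)
summary = summarize_scan(sample)
print(summary)
```

The function only formats what the scan reported; it never invents element names, which keeps it in line with the "never guess" rule.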

STEP 2: GENERATE + PRESENT — Create scenario and show it

Write a YAML scenario using the exact data from scan_result.json. Then show the full scenario to the user:

"Here is the test scenario I created (5 steps):

 1. 🌐 Navigate to http://localhost:3000
 2. 🖱  Type 'test@example.com' into email field
 3. 🖱  Type 'password123' into password field  
 4. 🖱  Click 'Login' button (critical)
 5. ✅ Verify URL contains '/dashboard'

 Should I run this test? Or would you like me to modify anything?"

⏸️ WAIT for user approval before proceeding to Step 3.

If user says "modify X" → edit the scenario and show it again
If user says "go ahead" / "yes" / "진행해" (Korean for "go ahead") → proceed to Step 3
If user says "cancel" → stop entirely
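
The three branches above can be sketched as a small decision function. This is only an illustration of the approval gate: the function name, the return labels, and the exact phrase matching are hypothetical, and a real assistant would interpret replies more flexibly. The important property is that anything ambiguous is never treated as approval.

```python
# Approval phrases taken from the rules above; extend with care.
APPROVALS = {"go ahead", "진행해", "yes", "run it"}


def next_action(reply: str) -> str:
    """Map a user reply to the next workflow move (hypothetical labels)."""
    text = reply.strip().lower()
    if text in APPROVALS:
        return "run"                  # proceed to Step 3
    if text.startswith("cancel"):
        return "stop"                 # stop entirely
    if text.startswith("modify"):
        return "edit_and_show_again"  # edit the scenario, then present it again
    return "ask_again"                # unclear reply: keep waiting, do not run


print(next_action("Go ahead"))
print(next_action("modify step 2 to use the Sign In button"))
```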

STEP 3: EXECUTE — Run the approved scenario

aat run --skill-mode --fast scenarios/<scenario>.yaml

Monitor the output. If any step fails:

STOP immediately — do not continue to the next scenario
Read the failure details from the AWT output
Read the SCREENSHOT if provided in the === AWT SKILL DEVQA === block
Report to the user:

"Test failed at Step 3 (Click Login button):
 - Error: Element 'Login' not found in DOM
 - Possible cause: The button text might be 'Sign In' instead of 'Login'
 - Screenshot: .aat/screenshots/step_003_fail.png
 
 Should I:
 (a) Fix the scenario (change 'Login' to 'Sign In') and re-test?
 (b) Fix the source code instead?
 (c) Skip this step and continue?"

⏸️ WAIT for user instruction before fixing or re-running.

STEP 4: REPORT — Summarize results

When all steps pass, report to the user:

"✅ Test complete: 5/5 steps passed (37.8 seconds)
 
 All navigation links and login flow are working correctly."

Auto-Setup

command -v aat >/dev/null 2>&1 || (pip install aat-devqa && playwright install chromium)

Scenario YAML Reference

Writing Scenarios from scan_result.json

Read .aat/scan_result.json and write YAML using the exact element data:

id: "SC-001"
name: "Login Flow"
steps:
  - step: 1
    action: navigate
    value: "http://localhost:3000/login"
    description: "Go to login page"

  # Use EXACT coordinates/selectors from scan_result.json
  - step: 2
    action: find_and_type
    target:
      selector: "#email"           # from scan_result.json
      text: "Email"
    value: "test@example.com"
    description: "Enter email"

  - step: 3
    action: find_and_click
    target:
      text: "Login"                # from scan_result.json
    region: main                   # exclude nav panel
    critical: true                 # stop if login fails
    description: "Click login button"

  - step: 4
    action: assert_url
    value: "/dashboard"
    on_fail: stop
    message: "Login failed — not redirected to dashboard"
    description: "Verify redirect to dashboard"

Rules:

Use selector from scan data when available (most reliable)
Use text as fallback (OCR-based)
Use region: main for all clicks (avoids nav panel false positives)
Mark login/auth steps as critical: true
Add assert_url after navigation-triggering clicks
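
Before presenting a scenario, the rules above can be checked mechanically. The following is a minimal sketch, not part of AWT: lint_scenario is a hypothetical helper that works on the scenario as a plain dict (as it would look after loading the YAML). Because it cannot know which clicks trigger navigation, it only expects assert_url after clicks marked critical: true, which is an assumption of this sketch.

```python
def lint_scenario(scenario: dict) -> list[str]:
    """Return warnings for steps that break the scenario-writing rules."""
    warnings = []
    steps = scenario.get("steps", [])
    for i, step in enumerate(steps):
        label = f"step {step.get('step', i + 1)}"
        action = step.get("action", "")
        target = step.get("target", {})
        # Rule: prefer a selector; text alone means OCR-based matching.
        if action.startswith("find_and_") and "selector" not in target:
            warnings.append(f"{label}: no selector, falls back to OCR text")
        if action == "find_and_click":
            # Rule: clicks should be restricted to region: main.
            if step.get("region") != "main":
                warnings.append(f"{label}: click without region: main")
            # Rule (narrowed, see lead-in): critical clicks need an assert_url next.
            following = steps[i + 1] if i + 1 < len(steps) else {}
            if step.get("critical") and following.get("action") != "assert_url":
                warnings.append(f"{label}: critical click not followed by assert_url")
    return warnings


login_flow = {
    "id": "SC-001",
    "steps": [
        {"step": 1, "action": "navigate", "value": "http://localhost:3000/login"},
        {"step": 2, "action": "find_and_type",
         "target": {"selector": "#email", "text": "Email"}, "value": "test@example.com"},
        {"step": 3, "action": "find_and_click",
         "target": {"text": "Login"}, "region": "main", "critical": True},
        {"step": 4, "action": "assert_url", "value": "/dashboard", "on_fail": "stop"},
    ],
}
print(lint_scenario(login_flow))
```

Warnings are meant to be shown to the user alongside the scenario, not fixed silently.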

Actions (21 types)

Category	Actions
Navigation	`navigate`, `go_back`, `refresh`
Find + Mouse	`find_and_click`, `find_and_double_click`, `find_and_right_click`
Find + Keyboard	`find_and_type`, `find_and_clear`
Direct	`click_at`, `type_text` (supports `verify: true`), `press_key`, `key_combo`
Assert	`assert`, `assert_text`, `assert_screen_changed`, `assert_url`
Session	`save_session`, `load_session`
Utility	`wait`, `screenshot`, `scroll`
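
The table above can be kept as a lookup to catch misspelled or unsupported action names before a scenario is shown to the user. This is a sketch only; ACTIONS mirrors the table, and unknown_actions is a hypothetical helper, not an AWT function.

```python
# Action names copied from the table above, grouped by category.
ACTIONS = {
    "Navigation": ["navigate", "go_back", "refresh"],
    "Find + Mouse": ["find_and_click", "find_and_double_click", "find_and_right_click"],
    "Find + Keyboard": ["find_and_type", "find_and_clear"],
    "Direct": ["click_at", "type_text", "press_key", "key_combo"],
    "Assert": ["assert", "assert_text", "assert_screen_changed", "assert_url"],
    "Session": ["save_session", "load_session"],
    "Utility": ["wait", "screenshot", "scroll"],
}
KNOWN_ACTIONS = {name for group in ACTIONS.values() for name in group}


def unknown_actions(steps: list[dict]) -> list[str]:
    """List the action names in a scenario that are not in the table."""
    return [str(s.get("action")) for s in steps if s.get("action") not in KNOWN_ACTIONS]


print(len(KNOWN_ACTIONS))
print(unknown_actions([{"action": "navigate"}, {"action": "hover"}]))
```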

Key Step Options

- action: find_and_click
  target:
    text: "Submit"               # OCR text
    selector: "#submit-btn"      # CSS selector (highest priority)
  region: main                   # top/bottom/left/right/center/main/full
  critical: true                 # stop test on failure
  on_fail: stop                  # same as critical
  method: auto                   # auto/semantics/template/ocr/vision
  match_index: 0                 # 0=first match, -1=last
  change_threshold: 0.05         # for critical auto-verification
  description: "Click submit"

assert_url (login/navigation verification)

- action: assert_url
  value: "/dashboard"
  on_fail: stop
  message: "Login failed"
  description: "Verify redirect"

Session Reuse

# Save after login
- action: save_session
  name: "my_app_login"

# Load in a later run (saved sessions expire after 24h)
- action: load_session
  name: "my_app_login"
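
When deciding whether a scenario can start with load_session or needs the full login steps, the 24h expiry can be checked up front. The sketch below assumes the assistant has recorded when the session was saved; how AWT itself stores and expires sessions is not described here, so session_step and its saved_at argument are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

SESSION_MAX_AGE = timedelta(hours=24)  # from the 24h expiry noted above


def session_step(name: str, saved_at: datetime, now: datetime) -> Optional[dict]:
    """Return a load_session step while the save is fresh, else None (log in again)."""
    if now - saved_at < SESSION_MAX_AGE:
        return {"action": "load_session", "name": name}
    return None


now = datetime(2025, 1, 2, 12, 0)
print(session_step("my_app_login", datetime(2025, 1, 2, 9, 0), now))
print(session_step("my_app_login", datetime(2025, 1, 1, 9, 0), now))
```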

Region Parameter

Region	Area	Use When
`full`	Entire screen (default)	General use
`main`	Right 80%	Always use for clicks (avoids nav panel)
`top`	Top 30%	Header elements
`bottom`	Bottom 30%	Footer, floating buttons
`center`	Central 60%x60%	Modal dialogs
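
To make the percentages concrete, here is a sketch that turns the regions in the table into pixel boxes for a given viewport. It illustrates the table only: how AWT rounds or treats edges is not specified, and left / right are omitted because the table gives no sizes for them.

```python
def region_box(region: str, width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels for a named region."""
    if region == "full":
        return (0, 0, width, height)
    if region == "main":      # right 80%: skip the leftmost 20% (nav panel)
        return (width * 20 // 100, 0, width, height)
    if region == "top":       # top 30%
        return (0, 0, width, height * 30 // 100)
    if region == "bottom":    # bottom 30%
        return (0, height * 70 // 100, width, height)
    if region == "center":    # central 60% x 60%
        return (width * 20 // 100, height * 20 // 100,
                width * 80 // 100, height * 80 // 100)
    raise ValueError(f"no size given for region: {region}")


print(region_box("main", 1000, 800))
print(region_box("center", 1000, 800))
```

On a 1000x800 viewport, main starts 200 px from the left edge, which is why a nav panel narrower than that is excluded from matching.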

Flutter CanvasKit Support

AWT automatically detects Flutter CanvasKit and activates Semantics:

After navigate, clicks flt-semantics-placeholder (3 retries, 3s each)
Reads flt-semantics[aria-label] for element coordinates
Falls back to OCR if Semantics unavailable

Matching priority on Flutter: CSS selector → Flutter Semantics → Playwright text → OCR → Vision AI
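
The priority order above behaves like a fallback chain: each matcher is tried in turn and the first one that finds the target wins. The sketch below shows that pattern with stub matchers. It is not AWT's implementation; locate, the matcher names, and the returned coordinates are all hypothetical.

```python
from typing import Callable, Optional

Point = tuple[int, int]
Matcher = Callable[[str], Optional[Point]]


def locate(target: str, matchers: list[tuple[str, Matcher]]) -> Optional[tuple[str, Point]]:
    """Try matchers in priority order; return the first hit with its matcher name."""
    for name, matcher in matchers:
        point = matcher(target)
        if point is not None:
            return name, point
    return None  # every tier missed


# Stub matchers standing in for the real tiers, in the documented order.
chain: list[tuple[str, Matcher]] = [
    ("css_selector", lambda t: None),                # no selector match on a canvas
    ("flutter_semantics", lambda t: (120, 48) if t == "Login" else None),
    ("playwright_text", lambda t: None),
    ("ocr", lambda t: (118, 50)),                    # would match, but is never reached
    ("vision_ai", lambda t: None),
]
print(locate("Login", chain))
```

Setting method: semantics in a step corresponds to skipping the chain and using that one tier directly.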

Flutter-specific rules:

Always use region: main (Canvas OCR picks up nav text)
Use verify: true on type_text (Canvas input may not render)
Add assert_screen_changed after clicks
Use method: semantics to force Semantics lookup

Source Code Root Cause Analysis

When a test fails, trace to the source code:

Read test output — which step, what error, what URL
Search codebase — grep for the expected text, URL route, component
Read the component — find why it doesn't render/redirect/respond
Propose a fix to the user — show a concrete diff, ask for approval
After user approves — apply fix, re-build, re-scan, re-test

Key principle: NEVER auto-apply fixes. Always show the diff and ask.
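
The "search codebase" step can be sketched as a plain text search over source files, equivalent to a recursive grep. find_in_source is a hypothetical helper; the file extensions and skipped directories are assumptions suited to a typical JavaScript/TypeScript front end and should be adapted to the project.

```python
import os
import tempfile


def find_in_source(root: str, needle: str,
                   exts: tuple[str, ...] = (".js", ".jsx", ".ts", ".tsx", ".html")) -> list[tuple[str, int]]:
    """Return (relative path, line number) for every source line containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune dependency and VCS directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in {"node_modules", ".git"}]
        for name in sorted(filenames):
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if needle in line:
                        hits.append((os.path.relpath(path, root), lineno))
    return hits


# Demo on a throwaway project: the test expected 'Login', the code says 'Sign In'.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "Login.tsx"), "w", encoding="utf-8") as fh:
        fh.write("export const Login = () => (\n  <button>Sign In</button>\n);\n")
    hits = find_in_source(root, "Sign In")
print(hits)
```

The hits only tell you where to read next; any change still goes to the user as a diff for approval.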

CLI Commands

Command	Description
`aat scan --url URL`	Scan page, collect elements to scan_result.json
`aat run --skill-mode PATH`	Execute with structured output for AI
`aat run --skill-mode --fast PATH`	Execute in fast DOM-only mode (suited to standard web apps such as Next.js or React)
`aat run --debug PATH`	Execute with OCR candidate debug logs
`aat doctor`	Check environment
`aat setup`	Configure AI + Vision providers
`aat validate PATH`	Validate YAML scenarios
`aat cost`	View AI API costs

Key Flags

--skill-mode    Structured output for AI assistants
--fast          DOM-only matching (skip Vision/OCR — fastest for standard web apps)
--debug         Show OCR candidates and matcher details
--strict        Treat skipped steps as failures
--learn         Record healed steps for pattern learning
--slow-mo N     Slow down actions by N ms

⛔ Banned Flags (NEVER use these)

-y / --auto-approve    Bypasses user approval — NEVER USE
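
If commands are assembled programmatically, the ban can be enforced in code rather than by convention. A minimal sketch follows, using only flags listed in this document; build_run_command is a hypothetical wrapper, not part of the aat CLI.

```python
BANNED_FLAGS = {"-y", "--auto-approve"}  # bypass user approval, never allowed


def build_run_command(scenario_path: str, fast: bool = True,
                      extra: tuple[str, ...] = ()) -> list[str]:
    """Build an `aat run` argument list, refusing any banned flag."""
    banned = BANNED_FLAGS.intersection(extra)
    if banned:
        raise ValueError(f"banned flag(s): {', '.join(sorted(banned))}")
    cmd = ["aat", "run", "--skill-mode"]
    if fast:
        cmd.append("--fast")
    cmd.extend(extra)           # e.g. ("--debug",) or ("--slow-mo", "200")
    cmd.append(scenario_path)
    return cmd


print(build_run_command("scenarios/login.yaml"))
```

The returned list can be passed to subprocess.run once the user has approved the scenario.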

Best Practices

Always scan before writing scenarios — never guess element names
Use region: main on all find_and_click — prevents nav panel False Positives
Mark auth steps critical: true — no point testing after login fails
Add assert_url after form submits — verify navigation happened
Use save_session/load_session — skip login on repeated runs
Read screenshots on failure — the SCREENSHOT path is a real PNG file
One fix per retry — change only the failing step
Always ask user before fixing — never auto-modify code or scenarios

AI Providers (for Vision AI Tier 3)

Provider	Vision	Cost	Setup
Gemini Flash	Yes	Free	`aat setup` → Gemini
Claude	Yes	Medium	`aat setup` → Claude
GPT-4o	Yes	Higher	`aat setup` → OpenAI

Vision AI is optional — Tier 1 (template) + Tier 2 (OCR) are free and work without API keys.

Reference Files

references/scenario-schema.md — Full YAML schema
references/cli-reference.md — Complete CLI reference
references/config-reference.md — All configuration options
templates/scenario-template.yaml — Blank scenario template
templates/config-template.yaml — Default config template
