verify-tests-fail-without-fix
Verify Tests Fail Without Fix
Verifies that UI tests actually catch the bug they target. Supports two workflow modes:
Mode 1: Verify Failure Only (Test Creation)
Use when creating tests before writing a fix:
- Runs tests to verify they FAIL (proving they catch the bug)
- No fix files required
- Perfect for test-first development
# Auto-detect test filter from changed test files
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android
# With explicit test filter
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356"
Mode 2: Full Verification (Fix Validation)
Use when validating both tests and fix:
- Without fix - tests should FAIL (bug is present)
- With fix - tests should PASS (bug is fixed)
# Auto-detect everything (recommended)
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android -RequireFullVerification
# With explicit test filter
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356" -RequireFullVerification
Note: -RequireFullVerification makes the script exit with an error if no fix files are detected, which prevents a silent fallback to failure-only mode.
Requirements
Verify Failure Only Mode:
- Test files in the PR (or working directory)
Full Verification Mode:
- Test files in the PR
- Fix files in the PR (non-test code changes)
The script auto-detects which mode to use based on whether fix files are present.
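The mode selection described above can be sketched in a few lines. This is an illustrative Python sketch, not the script's actual PowerShell implementation; `select_mode`, `Mode`, and the error message are hypothetical names chosen for the example.

```python
from enum import Enum


class Mode(Enum):
    FAILURE_ONLY = "verify-failure-only"
    FULL = "full-verification"


def select_mode(fix_files, require_full_verification=False):
    """Pick the workflow mode from the detected fix files.

    fix_files: paths of non-test files changed in the PR (may be empty).
    require_full_verification: mirrors the -RequireFullVerification switch.
    """
    if fix_files:
        return Mode.FULL
    if require_full_verification:
        # No silent fallback: fail loudly when a fix was expected.
        raise ValueError("No fix files detected, but full verification was required")
    return Mode.FAILURE_ONLY
```

With an empty list and no switch, the sketch falls back to failure-only mode; with the switch set, the same input raises instead.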
Expected Output
Verify Failure Only Mode:
╔═══════════════════════════════════════════════════════════╗
║ VERIFICATION PASSED ✅ ║
╠═══════════════════════════════════════════════════════════╣
║ Tests FAILED as expected! ║
║ This proves the tests correctly reproduce the bug. ║
╚═══════════════════════════════════════════════════════════╝
Full Verification Mode:
╔═══════════════════════════════════════════════════════════╗
║ VERIFICATION PASSED ✅ ║
╠═══════════════════════════════════════════════════════════╣
║ - FAIL without fix (as expected) ║
║ - PASS with fix (as expected) ║
╚═══════════════════════════════════════════════════════════╝
What It Does
Verify Failure Only Mode (no fix files):
- Fetches base branch from origin (if available)
- Auto-detects test classes from changed test files
- Runs tests (should FAIL to prove they catch the bug)
- Updates PR labels based on result
- Reports result
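As a rough idea of how a test filter might be derived from changed test files, the sketch below joins file stems with `|`, the same shape as an explicit -TestFilter value. It assumes that each test class is named after its .cs file and that test files sit in a directory called TestCases.Shared.Tests; the real script may use different rules, and `build_test_filter` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def build_test_filter(changed_files, test_dir="TestCases.Shared.Tests"):
    """Derive a -TestFilter value from changed test files.

    Assumption: each test class is named after its .cs file, and test
    files live in a directory named `test_dir`.
    """
    names = []
    for path in changed_files:
        p = PurePosixPath(path)
        # Keep only C# files under the test directory, without duplicates.
        if p.suffix == ".cs" and test_dir in p.parts and p.stem not in names:
            names.append(p.stem)
    return "|".join(names)
```

Files outside the test directory are ignored, so a PR that touches both a test and a fix yields a filter for the test class only.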
Full Verification Mode (fix files detected):
- Fetches base branch from origin to ensure accurate diff
- Auto-detects fix files (non-test code) from git diff
- Auto-detects test classes from TestCases.Shared.Tests/*.cs
- Reverts fix files to base branch
- Runs tests (should FAIL without fix)
- Restores fix files
- Runs tests (should PASS with fix)
- Generates markdown reports:
  - CustomAgentLogsTmp/TestValidation/verification-report.md - Full detailed report
  - CustomAgentLogsTmp/PRState/verification-report.md - Gate section for PR agent
- Updates PR labels based on result
- Reports result
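The revert, run, restore, run sequence in the Full Verification steps can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the script's implementation: `run_tests`, `revert_fix`, and `restore_fix` are hypothetical callables supplied by the caller.

```python
def full_verification(run_tests, revert_fix, restore_fix):
    """Run the tests without and then with the fix.

    run_tests() returns True when the tests pass. revert_fix() and
    restore_fix() are hypothetical callables that check fix files out
    from the base branch and put the PR versions back.
    """
    revert_fix()
    try:
        passed_without_fix = run_tests()
    finally:
        # Always restore the fix, even if the test run raises.
        restore_fix()
    passed_with_fix = run_tests()

    return {
        "failed_without_fix": not passed_without_fix,
        "passed_with_fix": passed_with_fix,
        "verified": (not passed_without_fix) and passed_with_fix,
    }
```

Putting the restore step in `finally` is a design choice of this sketch: it keeps the working tree from being left in the reverted state if the first test run errors out.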
PR Labels
The skill automatically manages two labels on the PR to indicate verification status:
| Label | Color | When Applied |
|---|---|---|
| s/ai-reproduction-confirmed | 🟢 Green (#2E7D32) | Tests correctly FAIL without fix (AI verified tests catch the bug) |
| s/ai-reproduction-failed | 🟠 Orange (#E65100) | Tests PASS without fix (AI verified tests don't catch the bug) |
Behavior:
- When verification passes, adds s/ai-reproduction-confirmed and removes s/ai-reproduction-failed if present
- When verification fails, adds s/ai-reproduction-failed and removes s/ai-reproduction-confirmed if present
- If a PR is re-verified after fixing tests, labels are updated accordingly
- No label = AI hasn't verified tests yet
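The label behavior above amounts to swapping one label for the other while leaving everything else alone. A small Python sketch of that rule, with `update_labels` as a hypothetical helper (the actual script presumably changes labels on the PR itself rather than computing a set):

```python
CONFIRMED = "s/ai-reproduction-confirmed"
FAILED = "s/ai-reproduction-failed"


def update_labels(current_labels, verification_passed):
    """Return the PR's labels after a verification run.

    Adds the label that matches the result, removes the opposite one,
    and leaves unrelated labels untouched.
    """
    add, remove = (CONFIRMED, FAILED) if verification_passed else (FAILED, CONFIRMED)
    labels = set(current_labels)
    labels.discard(remove)  # no error if the opposite label is absent
    labels.add(add)
    return labels
```

Because the function is idempotent, re-verifying a PR with the same result leaves the label set unchanged.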
Output Files
The skill generates output files under CustomAgentLogsTmp/PRState/<PRNumber>/verify-tests-fail/:
| File | Description |
|---|---|
| verification-report.md | Comprehensive markdown report with test results and full logs |
| verification-log.txt | Text log of the verification process |
| test-without-fix.log | Full test output from run without fix |
| test-with-fix.log | Full test output from run with fix |
Plus UI test logs in CustomAgentLogsTmp/UITests/:
- android-device.log or ios-device.log - Device logs
- test-output.log - NUnit test output
Example structure:
CustomAgentLogsTmp/
├── UITests/ # Shared UI test logs
│ ├── android-device.log
│ └── test-output.log
└── PRState/
└── 27847/
└── verify-tests-fail/
├── verification-report.md # Full detailed report
├── verification-log.txt
├── test-without-fix.log
└── test-with-fix.log
PR Number Detection:
- Auto-detected from branch name (e.g., pr-27847)
- Falls back to gh pr view command
- Uses "unknown" if detection fails
- Can be manually specified with -PRNumber parameter
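The fallback chain for the PR number can be sketched like this. It is an illustrative Python version with several assumptions: the branch pattern `pr-<digits>` is inferred from the example above, an explicit value is assumed to take precedence over detection, and `lookup` is a hypothetical callable standing in for the gh pr view call.

```python
import re


def detect_pr_number(branch_name, explicit=None, lookup=None):
    """Resolve the PR number used in the output path.

    explicit: value of -PRNumber, if given (assumed to win over detection).
    lookup: optional callable standing in for `gh pr view`; returns a
            number or None, and may raise if no PR is found.
    """
    if explicit is not None:
        return str(explicit)
    match = re.search(r"pr-(\d+)", branch_name)  # assumed branch pattern
    if match:
        return match.group(1)
    if lookup is not None:
        try:
            number = lookup()
        except Exception:
            number = None
        if number is not None:
            return str(number)
    return "unknown"
```

Injecting the lookup keeps the sketch runnable without the GitHub CLI installed; a real caller would pass a function that shells out to gh.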
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| No fix files detected | Base branch detection failed or no non-test files changed | Use -FixFiles or -BaseBranch explicitly |
| Tests pass without fix | Tests don't detect the bug | Review test assertions, update test |
| Tests fail with fix | Fix doesn't work or test is wrong | Review fix implementation |
| App crashes | Duplicate issue numbers, XAML error | Check device logs |
| Element not found | Wrong AutomationId, app crashed | Verify IDs match |
Optional Parameters
# Require full verification (fail if no fix files detected) - recommended
-RequireFullVerification
# Explicit test filter
-TestFilter "Issue32030|ButtonUITests"
# Explicit fix files
-FixFiles @("src/Core/src/File.cs")
# Explicit base branch
-BaseBranch "main"