Auto Mode

--auto skips interactive confirmation of the test plan. --dry-run extracts scenarios only, without executing them.

Business Test (PRD-Forward)

Usage

$quality-business-test "3"                          # test phase 3 against PRD
$quality-business-test "3 --layer L1"               # L1 interface tests only
$quality-business-test "3 --gen-code"               # generate framework-specific test classes
$quality-business-test "3 --dry-run"                # extract scenarios only, don't execute
$quality-business-test "3 --re-run"                 # re-run only previously failed scenarios
$quality-business-test "3 --spec SPEC-auth-2026-04" # explicit spec reference
$quality-business-test "3 --auto"                   # skip plan confirmation

Flags:

<phase>: Phase number (required)
--spec SPEC-xxx: Explicit spec package reference (default: auto-detect from index.json)
--layer L1|L2|L3: Run only specific layer
--gen-code: Generate framework-specific test classes (JUnit/RestAssured, supertest/vitest, pytest/httpx)
--dry-run: Extract scenarios and fixtures only, don't execute
--re-run: Re-run only previously failed/blocked scenarios
--auto: Skip interactive confirmations
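
The flags above could be parsed roughly as follows. This is a minimal sketch, assuming the argument string arrives as a single quoted value and that Python's standard `argparse` and `shlex` are acceptable; `parse_business_test_args` is a hypothetical helper, not the command's actual parser.

```python
import argparse
import shlex


def parse_business_test_args(arguments: str) -> argparse.Namespace:
    """Parse a string such as '3 --layer L1 --auto' (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="quality-business-test")
    parser.add_argument("phase", type=int, help="phase number (required)")
    parser.add_argument("--spec", default=None, help="explicit spec package reference")
    parser.add_argument("--layer", choices=["L1", "L2", "L3"], default=None)
    parser.add_argument("--gen-code", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--re-run", action="store_true")
    parser.add_argument("--auto", action="store_true")
    # shlex.split honours shell-style quoting inside the argument string.
    return parser.parse_args(shlex.split(arguments))


args = parse_business_test_args("3 --layer L1 --auto")
print(args.phase, args.layer, args.auto, args.dry_run)
```

`argparse` maps `--dry-run`, `--re-run`, and `--gen-code` to the attributes `dry_run`, `re_run`, and `gen_code`.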

Output: {phase_dir}/.tests/business/business-test-plan.json + business-test-report.json + business-test-summary.md

Overview

Validate built features against PRD acceptance criteria through automated multi-layer business testing. Unlike quality-test (interactive UAT from code gaps) and quality-test-gen (test generation from coverage gaps), this command starts from the acceptance criteria in REQ-*.md and works forward.

Three-track testing (complementary, not replacements):

| Command | Input Source | Verification Angle |
|---|---|---|
| `quality-business-test` | REQ-*.md acceptance criteria | PRD-forward: are business rules satisfied? |
| `quality-test` | verification.json must_haves | Code-backward: does the code work? |
| `quality-test-gen` | validation.json gaps | Coverage-backward: is coverage sufficient? |

Layer definitions:

| Layer | Name | Tests | Source |
|---|---|---|---|
| L1 | Interface Contract | Single endpoint request/response, input validation, schema compliance | Architecture API endpoints + REQ AC |
| L2 | Business Rule | Multi-step logic, state transitions, business constraints, edge cases | REQ acceptance criteria + NFR |
| L3 | Business Scenario | Full user flows, multi-service chains, error propagation | Epic user stories |

Implementation

Step 1: Resolve Target & Load Spec Package

Parse $ARGUMENTS for phase number and flags
Set PHASE_DIR = .workflow/phases/{NN}-{slug}/
Load index.json -> find spec_ref -> locate .workflow/.spec/SPEC-xxx/
Full mode: Read requirements/_index.md + all REQ-*.md + NFR-*.md + architecture/_index.md + epics/EPIC-*.md
Degraded mode (no spec package): Read index.json.success_criteria + plan.json convergence criteria + .summaries/TASK-*.md
If --re-run: load previous business-test-report.json, filter to failed/blocked scenarios
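
The choice between full and degraded mode can be sketched as a small decision function. This is an illustration only: `select_load_mode` and its two boolean inputs are hypothetical names, and the error text simply mirrors code E003 from the Error Handling table.

```python
def select_load_mode(has_spec_package: bool, has_success_criteria: bool) -> str:
    """Decide how Step 1 loads its inputs (hypothetical helper)."""
    if has_spec_package:
        # Full mode: requirements, NFRs, architecture and epics are available.
        return "full"
    if has_success_criteria:
        # Degraded mode corresponds to warning W001.
        return "degraded"
    # Neither source exists: this is the E003 condition.
    raise ValueError("E003: no spec package and no success_criteria")


print(select_load_mode(has_spec_package=False, has_success_criteria=True))
```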

Step 2: Extract Business Test Scenarios from PRD

For each REQ-NNN-{slug}.md:

Parse ## Acceptance Criteria section
Map RFC 2119 keywords to priority:

| Keyword | Priority | Failure = |
|---|---|---|
| MUST / SHALL | critical | blocker |
| SHOULD / RECOMMENDED | high | major |
| MAY / OPTIONAL | medium | minor |
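
The keyword mapping above might be implemented along these lines. This is a sketch under two assumptions that the text does not state: when a criterion contains several keywords the strongest one wins, and a criterion with no keyword yields `None`. The name `classify_priority` is hypothetical.

```python
import re

# Keyword -> (priority, failure severity), copied from the table above.
KEYWORD_PRIORITY = {
    "MUST": ("critical", "blocker"),
    "SHALL": ("critical", "blocker"),
    "SHOULD": ("high", "major"),
    "RECOMMENDED": ("high", "major"),
    "MAY": ("medium", "minor"),
    "OPTIONAL": ("medium", "minor"),
}
_RANK = {"critical": 0, "high": 1, "medium": 2}
# Upper-case only, following the RFC 2119 convention.
_KEYWORD_RE = re.compile(r"\b(MUST|SHALL|SHOULD|RECOMMENDED|MAY|OPTIONAL)\b")


def classify_priority(criterion: str):
    """Return (priority, severity) for one acceptance criterion, or None."""
    found = _KEYWORD_RE.findall(criterion)
    if not found:
        return None  # assumption: no keyword means no priority is assigned
    # Assumption: the strongest keyword in the sentence decides the priority.
    return min((KEYWORD_PRIORITY[k] for k in found), key=lambda p: _RANK[p[0]])


print(classify_priority("The API MUST return 401 when the token is expired"))
```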

Classify scenario into layer:

| Source | Layer | Category |
|---|---|---|
| Architecture API endpoints + REQ AC about request/response | L1 | api_contract |
| REQ AC about business logic, validation, state changes | L2 | business_rule |
| Architecture state machine transitions | L2 | state_transition |
| Epic user stories (multi-step flows) | L3 | user_flow |
| NFR performance/security constraints | L2 | non_functional |

Generate scenario JSON with id, req_ref (REQ-NNN:AC-N), layer, priority, name, category, endpoint, input, expected, preconditions, postconditions, mock_services
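
For illustration, a scenario record with the fields listed above could look like the following. The field names come from the text; every value (the id format, the endpoint, the payloads) is made up for the example, and the `Scenario` dataclass is only one possible way to hold the data.

```python
import json
from dataclasses import asdict, dataclass, field

# Category -> layer, taken from the classification table above.
CATEGORY_LAYER = {
    "api_contract": "L1",
    "business_rule": "L2",
    "state_transition": "L2",
    "user_flow": "L3",
    "non_functional": "L2",
}


@dataclass
class Scenario:
    id: str
    req_ref: str  # REQ-NNN:AC-N
    layer: str
    priority: str
    name: str
    category: str
    endpoint: str
    input: dict
    expected: dict
    preconditions: list = field(default_factory=list)
    postconditions: list = field(default_factory=list)
    mock_services: list = field(default_factory=list)


# All values below are hypothetical sample data.
scenario = Scenario(
    id="BT-L1-001",
    req_ref="REQ-001:AC-1",
    layer=CATEGORY_LAYER["api_contract"],
    priority="critical",
    name="Login rejects expired token",
    category="api_contract",
    endpoint="POST /api/login",
    input={"token": "expired"},
    expected={"status": 401},
)
print(json.dumps(asdict(scenario), indent=2))
```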

Degraded mode: Extract scenarios from success_criteria (each becomes an L2 scenario) and from plan.json convergence criteria (each becomes an L1 or L2 scenario). All default to priority high. No L3 scenarios are generated in degraded mode.

Step 3: Generate Test Data (Fixtures)

Three tiers:

Tier 1 — Schema-derived: From REQ data models, generate valid/invalid/boundary variants per entity:

valid: satisfies all constraints
invalid: violates each constraint individually (null, empty, overflow, wrong type)
boundary: edge values (min, max, min-1, max+1)
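
As a concrete case of the three variants, consider an integer field constrained to a range. The sketch below is an assumption about how such fixtures might be produced; `integer_variants` is a hypothetical name, and overflow is represented by the `max+1` boundary value rather than by a separate entry.

```python
def integer_variants(minimum: int, maximum: int) -> dict:
    """Build valid / invalid / boundary fixture values for one integer field."""
    return {
        # A value comfortably inside the allowed range.
        "valid": [(minimum + maximum) // 2],
        # Null, empty, and wrong-type inputs, one violation each.
        "invalid": [None, "", "not-a-number"],
        # min, max, min-1, max+1 as listed above.
        "boundary": [minimum, maximum, minimum - 1, maximum + 1],
    }


print(integer_variants(1, 100))
```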

Tier 2 — Criteria-derived: From "MUST return X when Y" -> { input: Y, expected: X }. From "MUST validate Z" -> { input: invalid_Z, expected: error }.
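
The "MUST return X when Y" pattern can be turned into a fixture with a simple regular expression. This is only a sketch for criteria phrased exactly that way; real acceptance criteria will need more patterns, and `fixture_from_criterion` is a hypothetical name.

```python
import re

# Lazy match for X so that the first " when " splits the sentence.
_RETURN_WHEN = re.compile(r"MUST return (?P<expected>.+?) when (?P<input>.+)")


def fixture_from_criterion(criterion: str):
    """Map 'MUST return X when Y' to {input: Y, expected: X}, else None."""
    match = _RETURN_WHEN.search(criterion)
    if match is None:
        return None
    return {
        "input": match.group("input").strip().rstrip("."),
        "expected": match.group("expected").strip(),
    }


print(fixture_from_criterion("The service MUST return 404 when the order id is unknown."))
```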

Tier 3 — Scenario-derived (L3 only): From Epic user stories -> scenario packs with coordinated entity IDs across steps.

Microservice mocks: From architecture API contract -> request/response pairs for WireMock stubs.
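
A request/response pair from the API contract could be expressed as a WireMock JSON stub mapping, for example as below. The `request`/`response` keys follow WireMock's stub mapping format; the helper name `wiremock_stub` and the sample endpoint and body are hypothetical.

```python
import json


def wiremock_stub(method: str, url_path: str, status: int, json_body: dict) -> dict:
    """Build one WireMock stub mapping from a contract request/response pair."""
    return {
        "request": {"method": method, "urlPath": url_path},
        "response": {
            "status": status,
            "headers": {"Content-Type": "application/json"},
            "jsonBody": json_body,
        },
    }


# Sample values are made up for illustration.
stub = wiremock_stub("GET", "/inventory/items/42", 200, {"id": 42, "in_stock": True})
print(json.dumps(stub, indent=2))
```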

Step 4: Write Test Plan & Confirm

Archive previous business-test-plan.json to .history/ if exists
Write .tests/business/business-test-plan.json with scenarios, fixtures, mock_contracts, requirement_coverage_plan
Display plan summary (scenario counts per layer, fixture counts, requirement coverage)
If not --auto: wait for user confirmation (yes/edit/cancel)
If --dry-run: stop here, report plan

Step 5: Generate Test Code (if --gen-code)

Detect project tech stack from .workflow/specs/project-tech.json or codebase scan.

| Stack | L1 | L2 | L3 |
|---|---|---|---|
| Java/Spring Boot | RestAssured + MockMvc | JUnit 5 Parameterized + WireMock | TestContainers |
| TypeScript/Node | supertest + vitest | vitest + nock | playwright/cypress |
| Python | httpx + pytest | pytest + responses | pytest + selenium |

Each test method includes REQ-NNN:AC-N reference in display name. Test files placed in .tests/business/{layer}/.

If no --gen-code: scenarios stay as structured JSON for AI agent execution.

Step 6: Execute Tests (Progressive L1 → L2 → L3)

Fail-fast: L1 critical failures -> STOP (don't run L2). L2 critical failures -> STOP (don't run L3).
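
The fail-fast rule can be sketched as a loop over the layers that stops after the first layer with a critical failure. The result shape (`status` and `priority` keys) and the names `run_progressive` and `fake_runner` are assumptions made for the example.

```python
from typing import Callable

LAYER_ORDER = ["L1", "L2", "L3"]


def run_progressive(run_layer: Callable[[str], list]) -> dict:
    """Run L1 -> L2 -> L3, stopping after a layer with a critical failure."""
    executed = {}
    for layer in LAYER_ORDER:
        results = run_layer(layer)
        executed[layer] = results
        has_critical_failure = any(
            r["status"] == "failed" and r["priority"] == "critical" for r in results
        )
        if has_critical_failure:
            break  # later layers are not run
    return executed


def fake_runner(layer: str) -> list:
    """Stand-in for real execution: L2 has a critical failure."""
    status = "failed" if layer == "L2" else "passed"
    return [{"status": status, "priority": "critical"}]


print(list(run_progressive(fake_runner)))
```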

Generator-Critic loop per layer (max 3 iterations):

| Iteration | Action |
|---|---|
| 1 | Run all scenarios. Critic: classify failures as test_defect / code_defect / env_issue |
| 2 | Auto-fix test_defects, re-run ALL scenarios |
| 3 | Final confirmation. Remaining failures = confirmed code_defects |

Execution modes:

--gen-code: run via test framework (mvn test, npx vitest, etc.)
default: AI agent executes scenarios against running application

Record results in .tests/business/test-results-iter-{N}.json.

Step 7: Build Traceability Matrix

Map each result to REQ-NNN:AC-N:

FOR each REQ:
  FOR each AC:
    ac_status = "passed" if ALL scenarios passed
                "failed" if ANY failed
                "blocked" if ANY blocked (none failed)
                "untested" if no scenarios mapped
  verdict = "verified" if all MUST+SHOULD passed
            "partial" if some failed
            "unverified" if all failed/untested

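The rules above translate fairly directly into Python. The sketch makes two assumptions the pseudocode leaves open: a requirement with no MUST or SHOULD criteria is reported as "unverified", and a requirement whose required criteria are only blocked counts as "unverified" as well. Function names are hypothetical.

```python
def ac_status(statuses: list) -> str:
    """Combine the results of every scenario mapped to one acceptance criterion."""
    if not statuses:
        return "untested"
    if "failed" in statuses:
        return "failed"
    if "blocked" in statuses:
        return "blocked"  # some blocked, none failed
    return "passed"


def req_verdict(criteria: list) -> str:
    """criteria: [{"priority": ..., "status": ...}] for one requirement."""
    # MUST maps to critical and SHOULD maps to high.
    required = [c["status"] for c in criteria if c["priority"] in ("critical", "high")]
    if required and all(s == "passed" for s in required):
        return "verified"
    if "passed" in required:
        return "partial"  # some required criteria did not pass
    return "unverified"


print(ac_status(["passed", "blocked"]))
print(req_verdict([
    {"priority": "critical", "status": "passed"},
    {"priority": "high", "status": "failed"},
    {"priority": "medium", "status": "failed"},
]))
```
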
Step 8: Generate Reports

Archive previous report/summary to .history/
Write .tests/business/business-test-report.json with:
- layers: per-layer stats (total, passed, failed, blocked, pass_rate)
- requirement_coverage: per-REQ criteria results with failure details
- failures: each with req_ref, severity, expected/actual, fix_suggestion
- summary: total_requirements, fully_verified, partially_verified, unverified, coverage_pct
Write .tests/business/business-test-summary.md (human-readable tables)
Update index.json with business_test section
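
The per-layer statistics in the report could be computed as in this sketch. It assumes `pass_rate` is stored as a fraction between 0 and 1, which the text does not specify; `layer_stats` is a hypothetical name.

```python
from collections import Counter


def layer_stats(statuses: list) -> dict:
    """Summarise scenario statuses for one layer."""
    counts = Counter(statuses)
    total = len(statuses)
    return {
        "total": total,
        "passed": counts["passed"],
        "failed": counts["failed"],
        "blocked": counts["blocked"],
        # Assumption: a fraction, rounded to two decimals; 0.0 for an empty layer.
        "pass_rate": round(counts["passed"] / total, 2) if total else 0.0,
    }


print(layer_stats(["passed", "passed", "failed", "blocked"]))
```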

Step 9: Feedback Loop

Auto-create issues from failures in .workflow/issues/issues.jsonl (each with req_ref, source: "business-test")
Report results
Route next step:

| Result | Suggestion |
|---|---|
| All requirements verified | Skill({ skill: "maestro-phase-transition", args: "{phase}" }) |
| Failures found | Skill({ skill: "quality-debug", args: "--from-business-test {phase}" }) |
| `--re-run` all pass | Skill({ skill: "maestro-verify", args: "{phase}" }) |
| Low coverage (< 60%) | Skill({ skill: "quality-test-gen", args: "{phase}" }) |

Closure criteria: Requirement marked "verified" ONLY when ALL MUST+SHOULD acceptance criteria pass.
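
The feedback step can be sketched as two small functions: one that writes a JSON line per failure and one that picks the next suggestion. The issue record contains only the fields named in the text plus `severity`, and the precedence between the routing rules (failures first, then low coverage, then re-run) is an assumption, since the table does not define an order. Both function names are hypothetical.

```python
import io
import json


def write_issues(stream, failures: list) -> None:
    """Append one JSON object per failure, in JSONL form, to an open stream."""
    for failure in failures:
        record = {
            "req_ref": failure["req_ref"],
            "source": "business-test",
            "severity": failure["severity"],
        }
        stream.write(json.dumps(record) + "\n")


def suggest_next(phase: int, has_failures: bool, coverage_pct: float, is_rerun: bool) -> dict:
    """Pick the follow-up skill; the rule order here is an assumption."""
    if has_failures:
        return {"skill": "quality-debug", "args": f"--from-business-test {phase}"}
    if coverage_pct < 60:
        return {"skill": "quality-test-gen", "args": str(phase)}
    if is_rerun:
        return {"skill": "maestro-verify", "args": str(phase)}
    return {"skill": "maestro-phase-transition", "args": str(phase)}


buffer = io.StringIO()  # stands in for .workflow/issues/issues.jsonl
write_issues(buffer, [{"req_ref": "REQ-001:AC-1", "severity": "blocker"}])
print(buffer.getvalue(), end="")
print(suggest_next(3, has_failures=True, coverage_pct=90, is_rerun=False))
```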

Error Handling

| Code | Severity | Condition | Recovery |
|---|---|---|---|
| E001 | error | Phase number required | Prompt user for phase number |
| E002 | error | Phase directory not found | Verify phase exists in .workflow/phases/ |
| E003 | error | No spec package AND no success_criteria | Run maestro-spec-generate or maestro-plan first |
| E004 | error | L1 critical failures block L2/L3 | Fix blockers via quality-debug |
| W001 | warning | Degraded mode (no spec package) | Consider running maestro-spec-generate |
| W002 | warning | Some REQs have no testable AC | Note in report |
| W003 | warning | Generator-Critic loop exhausted | Accept current state |
| W004 | warning | Mock services unavailable for L3 | Skip L3 or use --gen-code |

Core Rules

PRD is source of truth -- business rules drive test scenarios, not code structure
RFC 2119 keyword priority -- MUST = critical, SHOULD = high, MAY = medium
Fail-fast across layers -- critical L1 failures block L2/L3
Generator-Critic loop max 3 iterations per layer
Traceability on every result -- every pass/fail maps to REQ-NNN:AC-N
Agent calls use run_in_background: false for synchronous execution
Auto-create issues in .workflow/issues/issues.jsonl for every failure
Degraded mode works without spec package (from success_criteria + plan.json)
Never modify source code -- this command tests, it doesn't fix
