E2E Outside-In Test Generator

Automatically generates comprehensive end-to-end Playwright tests for full-stack applications using outside-in testing methodology. Generates 40+ tests across 7 categories, finds real bugs, and validates through iterative fix loops.

LEVEL 1: Quick Start

Purpose

This skill analyzes your full-stack application and generates a complete Playwright test suite that:

Validates user journeys from browser through backend
Finds real bugs through systematic exploration
Runs deterministically with no flakiness
Integrates with CI out of the box

When to Use

Activate this skill when you:

Need comprehensive browser testing for a full-stack app
Want to validate critical user flows end-to-end
Need regression coverage before major refactoring
Are preparing for production deployment

Requirements: Your project must have both a frontend (Next.js, React, Vue, Angular) and backend API.

Quick Start

# In your project root
$ claude

> add e2e tests

The skill automatically:

Analyzes your application (routes, API endpoints, database schema)
Sets up infrastructure (Playwright config, test helpers, seed data)
Generates 40+ tests across 7 categories
Runs tests and fixes failures (up to 5 iterations)
Reports coverage and recommendations

Expected Output

e2e/
├── playwright.config.ts          # Playwright configuration
├── test-helpers/
│   ├── auth.ts                   # Authentication helpers
│   ├── navigation.ts             # Navigation helpers
│   ├── assertions.ts             # Custom assertions
│   └── data-setup.ts             # Test data management
├── fixtures/
│   ├── users.json                # Test user data
│   ├── products.json             # Test product data
│   └── seed.sql                  # Database seed script
├── happy-path/
│   ├── user-registration.spec.ts
│   ├── user-login.spec.ts
│   └── checkout-flow.spec.ts
├── edge-cases/
│   ├── invalid-inputs.spec.ts
│   └── boundary-conditions.spec.ts
├── error-handling/
│   ├── network-failures.spec.ts
│   └── validation-errors.spec.ts
├── performance/
│   ├── page-load-times.spec.ts
│   └── api-response-times.spec.ts
├── security/
│   ├── unauthorized-access.spec.ts
│   └── xss-protection.spec.ts
├── accessibility/
│   ├── keyboard-navigation.spec.ts
│   └── screen-reader.spec.ts
└── integration/
    ├── database-persistence.spec.ts
    └── api-integration.spec.ts

Total: 42 tests across 7 categories

Run Your Tests

# Run all tests
npx playwright test

# Run specific category
npx playwright test e2e/happy-path

# Run in headed mode
npx playwright test --headed

# Run in debug mode
npx playwright test --debug

Success Criteria

After generation, your test suite achieves:

✓ 40+ tests across all 7 categories
✓ 100% pass rate after fix loop
✓ <2 minute total execution time
✓ ≥1 real bug discovered during generation
✓ Zero flakiness (deterministic test data)

LEVEL 2: Full Features

The 5 Phases

graph LR
    A[Phase 1: Analysis] --> B[Phase 2: Infrastructure]
    B --> C[Phase 3: Generation]
    C --> D[Phase 4: Fix Loop]
    D --> E[Phase 5: Coverage Audit]

Phase 1: Stack Analysis

The skill performs deep application analysis:

Frontend Analysis:

Detects framework (Next.js, React, Vue, Angular)
Maps routes and pages
Identifies navigation patterns
Extracts interactive elements

Backend Analysis:

Discovers API endpoints (REST/GraphQL)
Maps data models and relationships
Identifies authentication mechanisms
Detects validation rules

Database Analysis:

Extracts schema and relationships
Identifies required fields
Determines foreign key constraints
Maps enum types

Example Analysis Output:

StackConfig(
    frontend_framework="nextjs",
    frontend_dir="app/",
    backend_framework="fastapi",
    api_base_url="http://localhost:8000/api",
    database_type="postgresql",
    auth_mechanism="jwt",
    routes=[
        Route(path="/", component="Home"),
        Route(path="/login", component="Login"),
        Route(path="/products", component="ProductList"),
        Route(path="/products/:id", component="ProductDetail"),
        Route(path="/checkout", component="Checkout")
    ],
    api_endpoints=[
        APIEndpoint(path="/auth/login", method="POST"),
        APIEndpoint(path="/auth/register", method="POST"),
        APIEndpoint(path="/products", method="GET"),
        APIEndpoint(path="/products/:id", method="GET"),
        APIEndpoint(path="/orders", method="POST")
    ],
    models=[
        Model(name="User", fields=["id", "email", "password"]),
        Model(name="Product", fields=["id", "name", "price", "stock"]),
        Model(name="Order", fields=["id", "user_id", "product_id", "quantity"])
    ]
)

Phase 2: Infrastructure Setup

Generates complete testing infrastructure:

Playwright Configuration:

// e2e/playwright.config.ts
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e",
  fullyParallel: false, // CRITICAL: workers=1 for deterministic test data
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: 1, // NEVER > 1 to prevent data races
  reporter: "html",
  use: {
    baseURL: "http://localhost:3000",
    trace: "on-first-retry",
  },
  projects: [{ name: "chromium", use: { ...devices["Desktop Chrome"] } }],
  webServer: {
    command: "npm run dev",
    url: "http://localhost:3000",
    reuseExistingServer: !process.env.CI,
  },
});

Test Helpers:

// e2e/test-helpers/auth.ts
export async function login(page: Page, email: string, password: string) {
  await page.goto("/login");
  await page.getByRole("textbox", { name: /email/i }).fill(email);
  await page.getByRole("textbox", { name: /password/i }).fill(password);
  await page.getByRole("button", { name: /sign in/i }).click();
  await page.waitForURL("/dashboard");
}

export async function logout(page: Page) {
  await page.getByRole("button", { name: /logout/i }).click();
  await page.waitForURL("/");
}

Seed Data:

// e2e/fixtures/users.json
[
  {
    "email": "test@example.com",
    "password": "Test123!", // pragma: allowlist secret
    "name": "Test User",
    "role": "customer"
  },
  {
    "email": "admin@example.com",
    "password": "Admin123!", // pragma: allowlist secret
    "name": "Admin User",
    "role": "admin"
  }
]

Phase 3: Test Generation

Generates tests across 7 mandatory categories:

1. Happy Path Tests (Critical user journeys)

// e2e/happy-path/user-registration.spec.ts
import { test, expect } from "@playwright/test";

test("user can register with valid credentials", async ({ page }) => {
  await page.goto("/register");

  await page.getByRole("textbox", { name: /email/i }).fill("newuser@example.com");
  await page.getByRole("textbox", { name: /password/i }).fill("Password123!");
  await page.getByRole("textbox", { name: /confirm password/i }).fill("Password123!");
  await page.getByRole("button", { name: /create account/i }).click();

  await expect(page).toHaveURL("/dashboard");
  await expect(page.getByText(/welcome/i)).toBeVisible();
});

2. Edge Case Tests (Boundary conditions)

// e2e/edge-cases/boundary-conditions.spec.ts
test("rejects password that is too short", async ({ page }) => {
  await page.goto("/register");

  await page.getByRole("textbox", { name: /email/i }).fill("test@example.com");
  await page.getByRole("textbox", { name: /password/i }).fill("123"); // Too short
  await page.getByRole("button", { name: /create account/i }).click();

  await expect(page.getByText(/password must be at least 8 characters/i)).toBeVisible();
});

3. Error Handling Tests (Failure scenarios)

// e2e/error-handling/network-failures.spec.ts
test("shows error when API is unavailable", async ({ page, context }) => {
  // Simulate network failure
  await context.route("**/api/**", (route) => route.abort());

  await page.goto("/products");

  await expect(page.getByText(/unable to load products/i)).toBeVisible();
  await expect(page.getByRole("button", { name: /retry/i })).toBeVisible();
});

4. Performance Tests (Speed validation)

// e2e/performance/page-load-times.spec.ts
test("homepage loads in under 2 seconds", async ({ page }) => {
  const startTime = Date.now();
  await page.goto("/");
  await page.waitForLoadState("networkidle");
  const loadTime = Date.now() - startTime;

  expect(loadTime).toBeLessThan(2000);
});

5. Security Tests (Authorization/XSS)

// e2e/security/unauthorized-access.spec.ts
test("redirects unauthenticated users from protected routes", async ({ page }) => {
  await page.goto("/dashboard");

  await expect(page).toHaveURL("/login");
  await expect(page.getByText(/please sign in/i)).toBeVisible();
});

6. Accessibility Tests (WCAG compliance)

// e2e/accessibility/keyboard-navigation.spec.ts
import { test, expect } from "@playwright/test";

test("login form is fully keyboard accessible", async ({ page }) => {
  await page.goto("/login");

  await page.keyboard.press("Tab"); // Focus email
  await page.keyboard.type("test@example.com");
  await page.keyboard.press("Tab"); // Focus password
  await page.keyboard.type("Test123!");
  await page.keyboard.press("Tab"); // Focus submit button
  await page.keyboard.press("Enter"); // Submit form

  await expect(page).toHaveURL("/dashboard");
});

7. Integration Tests (Database/API)

// e2e/integration/database-persistence.spec.ts
test("product order persists across sessions", async ({ page, context }) => {
  // Create order
  await login(page, "test@example.com", "Test123!");
  await page.goto("/products/123");
  await page.getByRole("button", { name: /add to cart/i }).click();
  await page.goto("/checkout");
  await page.getByRole("button", { name: /place order/i }).click();

  const orderNumber = await page.getByText(/order #\d+/).textContent();

  // Logout and login again
  await logout(page);
  await login(page, "test@example.com", "Test123!");

  // Verify order exists
  await page.goto("/orders");
  await expect(page.getByText(orderNumber!)).toBeVisible();
});

Phase 4: Fix Loop

Automatically fixes failing tests through iterative debugging:

Fix Loop Process:

Run tests → Collect failures
Analyze failures → Categorize issues (locator, timing, data, logic)
Apply fixes → Update tests based on issue type
Rerun tests → Verify fixes
Repeat → Max 5 iterations

Common Fix Patterns:

// Before: Flaky locator
await page.click(".submit-button");

// After: Role-based locator
await page.getByRole("button", { name: /submit/i }).click();

// Before: Race condition
await page.click("#login");
await page.fill("#email", "test@example.com");

// After: Wait for navigation
await page.click("#login");
await page.waitForLoadState("networkidle");
await page.fill("#email", "test@example.com");

// Before: Hardcoded data
await page.fill("#quantity", "5");

// After: Dynamic data from fixtures
const product = await getTestProduct();
await page.fill("#quantity", product.minQuantity.toString());

Fix Loop Results:

Iteration 1: 42 tests, 8 failures
  - Fixed 5 locator issues
  - Fixed 2 timing issues
  - Fixed 1 data issue

Iteration 2: 42 tests, 3 failures
  - Fixed 2 locator issues
  - Fixed 1 timing issue

Iteration 3: 42 tests, 0 failures ✓

Fix loop completed in 3 iterations.

Phase 5: Coverage Audit

Generates comprehensive coverage report and recommendations:

Coverage Report:

## Test Coverage Report

### Tests Generated: 42

- Happy Path: 12 tests
- Edge Cases: 8 tests
- Error Handling: 6 tests
- Performance: 4 tests
- Security: 5 tests
- Accessibility: 4 tests
- Integration: 3 tests

### Routes Covered: 12/15 (80%)

Uncovered routes:

- /admin/settings (requires admin role)
- /api/webhooks/\* (external integrations)
- /debug/\* (development-only)

### API Endpoints Covered: 18/22 (82%)

Uncovered endpoints:

- POST /api/admin/users (admin-only)
- DELETE /api/products/:id (soft delete, needs verification)
- GET /api/analytics (complex aggregations)
- POST /api/webhooks/stripe (external trigger)

### Bugs Found: 2

1. **CRITICAL**: Login form allows SQL injection via email field
   - Location: app/login/page.tsx:45
   - Test: e2e/security/sql-injection.spec.ts

2. **MEDIUM**: Checkout fails with international phone numbers
   - Location: lib/validation.ts:12
   - Test: e2e/edge-cases/international-phone.spec.ts

### Recommendations

1. Add admin-role tests (requires admin user setup)
2. Add webhook integration tests (requires test webhook server)
3. Add load testing for high-traffic endpoints
4. Increase accessibility coverage (current: 9%, target: 15%)

Customization Options

Custom Locator Strategies:

// e2e/playwright.config.ts
export default defineConfig({
  use: {
    // Prioritize test IDs
    testIdAttribute: "data-testid",
  },
});

Custom Test Categories:

# Add custom category during generation
custom_categories = [
    "happy-path",
    "edge-cases",
    "error-handling",
    "performance",
    "security",
    "accessibility",
    "integration",
    "custom-business-rules"  # Your custom category
]

Custom Seed Data:

// e2e/fixtures/custom-data.json
{
  "scenarios": [
    {
      "name": "bulk-order",
      "users": [...],
      "products": [...],
      "expected_discount": 0.15
    }
  ]
}

Configuration

Environment Variables:

# .env.test
DATABASE_URL=postgresql://test:test@localhost:5432/testdb  # pragma: allowlist secret
API_BASE_URL=http://localhost:8000/api
FRONTEND_URL=http://localhost:3000
ADMIN_EMAIL=admin@example.com
ADMIN_PASSWORD=Admin123!  # pragma: allowlist secret

Playwright Options:

// e2e/playwright.config.ts
export default defineConfig({
  // Test execution
  workers: 1, // MANDATORY: prevents data races
  fullyParallel: false, // Sequential execution
  retries: process.env.CI ? 2 : 0,

  // Timeouts
  timeout: 30000, // Per-test timeout
  expect: { timeout: 5000 }, // Assertion timeout

  // Artifacts
  use: {
    screenshot: "only-on-failure",
    video: "retain-on-failure",
    trace: "on-first-retry",
  },
});

LEVEL 3: Advanced Usage

Integration with Other Skills

With test-gap-analyzer:

> analyze test gaps and generate e2e tests to fill them

The skill:

Runs test-gap-analyzer to identify uncovered flows
Prioritizes test generation based on gap analysis
Focuses on high-risk uncovered areas

With shadow-testing:

> generate e2e tests and run shadow testing against production

The skill:

Generates full test suite
Configures shadow-testing with production mirror
Runs tests against both environments
Reports discrepancies

With qa-team methodology (formerly outside-in-testing):

The skill inherently follows outside-in testing:

Starts from user-facing UI
Tests through all layers (UI → API → DB)
Validates end-to-end contract adherence
No mocking (real integration)

Custom Template Development

Template Structure:

# e2e/templates/custom-template.py
CUSTOM_TEST_TEMPLATE = """
import {{ test, expect }} from '@playwright/test';
import {{ {helpers} }} from '../test-helpers/{helper_module}';

test.describe('{feature_name}', () => {{
  test.beforeEach(async ({{ page }}) => {{
    // Setup
    {setup_code}
  }});

  test('{test_description}', async ({{ page }}) => {{
    // Arrange
    {arrange_code}

    // Act
    {act_code}

    // Assert
    {assert_code}
  }});
}});
"""

Using Custom Templates:

from e2e_outside_in_test_generator.template_manager import TemplateManager

template_manager = TemplateManager()
template_manager.register_template("custom-flow", CUSTOM_TEST_TEMPLATE)

test_code = template_manager.render("custom-flow", {
    "feature_name": "Payment Processing",
    "helpers": "login, checkout",
    "helper_module": "payment",
    "setup_code": "await setupPaymentGateway();",
    "arrange_code": "const cart = await createCart();",
    "act_code": "await processPayment(cart);",
    "assert_code": "await expect(page.getByText(/payment successful/i)).toBeVisible();"
})

Advanced Locator Strategies

Priority Hierarchy:

Role-based (preferred): getByRole('button', { name: /submit/i })
User-visible text: getByText(/welcome/i)
Test IDs: getByTestId('submit-button')
CSS selectors (last resort): locator('.submit-button')

Custom Locator Builders:

// e2e/test-helpers/locators.ts
export function findByDataAttribute(page: Page, attr: string, value: string) {
  return page.locator(`[data-${attr}="${value}"]`);
}

export function findByAriaLabel(page: Page, label: string) {
  return page.locator(`[aria-label*="${label}" i]`);
}

// Usage in tests
await findByDataAttribute(page, "action", "submit").click();

Performance Optimization

Test Execution Time:

// Group fast tests together
test.describe("Quick smoke tests", () => {
  test("homepage renders", async ({ page }) => {
    /* <1s */
  });
  test("navigation works", async ({ page }) => {
    /* <1s */
  });
});

// Isolate slow tests
test.describe("Full checkout flow @slow", () => {
  test("complete purchase", async ({ page }) => {
    /* 10s */
  });
});

Run profiles:

// package.json
{
  "scripts": {
    "test:e2e": "playwright test",
    "test:e2e:quick": "playwright test --grep-invert @slow",
    "test:e2e:full": "playwright test",
    "test:e2e:smoke": "playwright test e2e/happy-path"
  }
}

CI/CD Integration

GitHub Actions:

# .github/workflows/e2e.yml
name: E2E Tests

on: [push, pull_request]

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: "18"

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright
        run: npx playwright install --with-deps

      - name: Run E2E tests
        run: npm run test:e2e
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

      - uses: actions/upload-artifact@v3
        if: failure()
        with:
          name: playwright-report
          path: playwright-report/

Troubleshooting

Common Issues:

Flaky tests due to parallel execution
- Cause: workers > 1 causes database race conditions
- Fix: Ensure workers: 1 in playwright.config.ts
Locator timeouts
- Cause: Element not yet rendered
- Fix: Add explicit waits: await page.waitForLoadState('networkidle')
Test data conflicts
- Cause: Shared test database without cleanup
- Fix: Use test.beforeEach() to reset database state
Authentication failures
- Cause: Session timeout or incorrect credentials
- Fix: Verify test credentials in e2e/fixtures/users.json

Debug Commands:

# Run single test in headed mode
npx playwright test e2e/happy-path/login.spec.ts --headed

# Run with debugger
npx playwright test --debug

# Generate trace for failing test
npx playwright test --trace on

# View trace
npx playwright show-trace trace.zip

Best Practices

Keep workers at 1 - Never increase for deterministic test data
Use role-based locators - Most resilient to UI changes
Small seed datasets - 10-20 records max for predictability
Explicit waits - Avoid race conditions with waitForLoadState()
Cleanup between tests - Use beforeEach to reset state
Test isolation - Each test should run independently
Meaningful assertions - Test business logic, not implementation

Success Metrics

After implementing this skill, you should achieve:

Test count: 40+ tests (all 7 categories)
Pass rate: 100% after fix loop
Execution time: <2 minutes total
Bug detection: ≥1 real bug found
Coverage: ≥80% of routes and endpoints
Flakiness: 0% (deterministic)
CI integration: Green checks on every PR

See also:

README.md - Developer documentation
examples.md - Usage examples
reference.md - API reference
patterns.md - Common patterns

e2e-outside-in-test-generator