Testing Skill

When to use this skill

Use this skill when writing tests of any kind (unit, integration, or end-to-end) or when designing a testing strategy for a new feature.

Testing Philosophy

Core beliefs

Test behavior, not implementation. Tests should describe what the code does, not how it does it internally. If a refactor that preserves behavior breaks your tests, the tests were coupled to implementation details.
Tests are documentation. A well-written test suite tells you exactly what the system does and what edge cases exist.
Fast feedback > complete coverage. A fast test suite that runs on every change beats a slow, comprehensive suite that nobody runs.
One assertion per test concept. Each test should verify one behavior. If it fails, you know exactly what broke.

Test Pyramid

         ╱  E2E  ╲         Few — slow, expensive, high confidence
        ╱─────────╲
       ╱Integration╲       Some — moderate speed, test component interactions
      ╱─────────────╲
     ╱  Unit Tests   ╲     Many — fast, cheap, test logic in isolation
    ╱─────────────────╲

Level	What it tests	Speed	Count
Unit	Single function/class in isolation	< 10ms each	Many (80%+)
Integration	Multiple components working together	< 1s each	Some (15%)
E2E	Full user workflows through the real UI	< 30s each	Few (5%)

Unit Tests

Structure: Arrange-Act-Assert (AAA)

// ARRANGE — set up the test data and dependencies
const user = createTestUser({ name: "Alice" })
const service = new UserService(mockRepo)

// ACT — perform the action being tested
const result = service.updateName(user.id, "Bob")

// ASSERT — verify the outcome
expect(result.name).toBe("Bob")
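
The snippet above leaves its dependencies undefined, so here is a minimal sketch of the same AAA shape that runs under plain Node without a test runner. `InMemoryUserRepo` and the body of `UserService` are assumptions made for illustration; the original only shows the call to `updateName`. In Jest or a similar runner, the function body would sit inside a `test(...)` callback and use `expect`.

```javascript
const { strictEqual } = require("node:assert/strict")

// Hypothetical in-memory stand-in for the repository dependency.
class InMemoryUserRepo {
  constructor(users = []) {
    this.users = new Map(users.map((u) => [u.id, { ...u }]))
  }
  findById(id) {
    return this.users.get(id)
  }
  save(user) {
    this.users.set(user.id, { ...user })
    return user
  }
}

// Hypothetical service under test; only updateName appears in the original.
class UserService {
  constructor(repo) {
    this.repo = repo
  }
  updateName(id, name) {
    const user = this.repo.findById(id)
    if (!user) throw new Error(`user ${id} not found`)
    return this.repo.save({ ...user, name })
  }
}

function updatesTheUsersName() {
  // ARRANGE: test data and dependencies
  const user = { id: "u1", name: "Alice" }
  const repo = new InMemoryUserRepo([user])
  const service = new UserService(repo)

  // ACT: the single action under test
  const result = service.updateName(user.id, "Bob")

  // ASSERT: one concept, the rename is returned and persisted
  strictEqual(result.name, "Bob")
  strictEqual(repo.findById(user.id).name, "Bob")
}

updatesTheUsersName()
```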

Naming convention

Test names should read like sentences:

✅ "returns empty list when no orders exist"
✅ "throws ValidationError when email is invalid"
✅ "sends notification email after successful signup"

❌ "test1"
❌ "testUpdateUser"
❌ "it works"

What to unit test

All business logic in the service/ layer
Data transformation functions
Validation logic
Error handling paths
Edge cases: empty input, null, boundary values, overflow
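
As an illustration of the edge cases listed above, the sketch below tests a validation function at its boundaries. `parseQuantity` and `MAX_QUANTITY` are hypothetical names invented for this example, and the cases run under plain Node rather than a test runner.

```javascript
const { strictEqual, throws } = require("node:assert/strict")

const MAX_QUANTITY = 1000 // hypothetical business limit

// Hypothetical validation function used only to illustrate edge-case tests.
function parseQuantity(input) {
  if (input === null || input === undefined || input === "") {
    throw new RangeError("quantity is required")
  }
  const n = Number(input)
  if (!Number.isInteger(n) || n < 1 || n > MAX_QUANTITY) {
    throw new RangeError(`quantity must be an integer from 1 to ${MAX_QUANTITY}`)
  }
  return n
}

// Each case pairs a sentence-style name with one behavior.
const cases = {
  "accepts the lower boundary": () => strictEqual(parseQuantity("1"), 1),
  "accepts the upper boundary": () => strictEqual(parseQuantity(MAX_QUANTITY), MAX_QUANTITY),
  "rejects zero, just below the lower boundary": () => throws(() => parseQuantity(0), RangeError),
  "rejects values above the limit": () => throws(() => parseQuantity(MAX_QUANTITY + 1), RangeError),
  "rejects empty input": () => throws(() => parseQuantity(""), RangeError),
  "rejects null": () => throws(() => parseQuantity(null), RangeError),
  "rejects non-numeric text": () => throws(() => parseQuantity("abc"), RangeError),
}

for (const [name, run] of Object.entries(cases)) {
  run()
  console.log(`ok - ${name}`)
}
```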

What NOT to unit test

Framework boilerplate (routing config, middleware registration)
Simple getters/setters with no logic
Third-party library internals
Database queries (save for integration tests)

Integration Tests

What to test

Database queries (repo layer with real/test database)
API endpoints (HTTP request → response)
Component interactions (service A calls service B)
External service integrations (with containerized dependencies)

Rules

Use a real test database (not mocks) — spin up with Docker or in-memory
Reset state between tests — each test starts with a clean slate
Seed realistic data — don't test with just one row
Test the unhappy paths — what happens when the DB is down, the API returns 500?
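
A minimal sketch of an endpoint test covering both a happy and an unhappy path, assuming Node 18.2 or later (for the global `fetch` and `closeAllConnections`). `createApp`, `withServer`, the `/orders` route, and the store shape are all hypothetical. The store objects stand in for a repository that a real suite would back with a test database; the stub is used here only to keep the example self-contained and to inject the failure.

```javascript
const http = require("node:http")
const { strictEqual, deepStrictEqual } = require("node:assert/strict")

// Hypothetical app: GET /orders returns the orders held by the injected store.
function createApp(store) {
  return http.createServer(async (req, res) => {
    try {
      if (req.method === "GET" && req.url === "/orders") {
        const body = JSON.stringify(await store.listOrders())
        res.writeHead(200, { "Content-Type": "application/json" })
        res.end(body)
        return
      }
      res.writeHead(404)
      res.end()
    } catch {
      // Storage failures surface as a 500 rather than crashing the server.
      res.writeHead(500)
      res.end()
    }
  })
}

// Runs fn against a freshly started server on an ephemeral port, then shuts it down.
async function withServer(store, fn) {
  const server = createApp(store)
  await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve))
  const { port } = server.address()
  try {
    return await fn(`http://127.0.0.1:${port}`)
  } finally {
    await new Promise((resolve) => {
      server.close(resolve)
      // Drop keep-alive sockets so close() can finish promptly.
      server.closeAllConnections?.()
    })
  }
}

async function main() {
  // Happy path: each test builds its own store, so state never leaks between tests.
  const seeded = { listOrders: async () => [{ id: 1 }, { id: 2 }, { id: 3 }] }
  await withServer(seeded, async (baseUrl) => {
    const res = await fetch(`${baseUrl}/orders`)
    strictEqual(res.status, 200)
    deepStrictEqual(await res.json(), [{ id: 1 }, { id: 2 }, { id: 3 }])
  })

  // Unhappy path: the database is down.
  const broken = { listOrders: async () => { throw new Error("connection refused") } }
  await withServer(broken, async (baseUrl) => {
    const res = await fetch(`${baseUrl}/orders`)
    strictEqual(res.status, 500)
  })
}

const integrationRun = main()
```

Starting a fresh server per test costs a few milliseconds but removes any need to reason about state left behind by an earlier test.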

End-to-End Tests

What to test

Critical user journeys — signup, login, purchase, core workflows
Cross-page flows — navigation, multi-step forms, redirects
Test only the happy path plus the most important failure for each journey

Rules

Keep E2E tests minimal — they're slow and fragile
Use stable selectors — data-testid attributes, not CSS classes
Wait for elements — never use fixed sleep/delays
Isolate test data — each test creates its own data, cleans up after
Record failures — capture screenshots/video on test failure
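
To show why waiting on a condition beats a fixed sleep, here is a minimal sketch of a polling helper. `waitFor` and its option names are hypothetical; most E2E frameworks ship their own waiting mechanism, which should be preferred where available. The helper returns as soon as the condition holds and fails with a clear message when the timeout elapses.

```javascript
// Polls `condition` until it returns a truthy value or the timeout elapses.
async function waitFor(condition, { timeoutMs = 5000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs
  while (true) {
    const value = await condition()
    if (value) return value
    if (Date.now() >= deadline) {
      throw new Error(`waitFor: condition not met within ${timeoutMs}ms`)
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
}

// Usage: simulate an element that appears after roughly 120ms.
async function demo() {
  const appearsAt = Date.now() + 120
  const findBanner = () => (Date.now() >= appearsAt ? { text: "Welcome" } : null)

  const banner = await waitFor(findBanner, { timeoutMs: 2000, intervalMs: 20 })
  console.log(banner.text)
}

const demoRun = demo()
```

A fixed two-second sleep would always cost two seconds and would still fail on a slow machine; the polling version finishes shortly after the element appears.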

E2E test structure

describe("User signup flow")
  it("creates account with valid details")
    → navigate to signup page
    → fill in form fields
    → submit
    → verify redirect to dashboard
    → verify welcome message displayed

  it("shows error for duplicate email")
    → navigate to signup page
    → fill in form with existing email
    → submit
    → verify error message displayed
    → verify user stays on signup page

Test Data

Rules

Never use production data in tests
Use factories/builders to create test data — not raw constructors
Make test data realistic — use realistic names, emails, amounts
Randomize where appropriate — catch assumptions with random data
Clean up — delete test data after each test run
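
Randomized data only helps if a failing run can be replayed. The sketch below uses mulberry32, a small seedable generator swapped in because `Math.random` cannot be seeded, and logs the seed on every run. `randomUser` and the `TEST_SEED` environment variable are hypothetical names chosen for this example.

```javascript
// mulberry32: a small seedable PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0
  return function next() {
    a = (a + 0x6d2b79f5) | 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Hypothetical generator for randomized but reproducible user data.
function randomUser(rand) {
  const id = Math.floor(rand() * 1e9)
  const nameLength = 1 + Math.floor(rand() * 40) // 1 to 40 characters
  const name = Array.from({ length: nameLength }, () =>
    String.fromCharCode(97 + Math.floor(rand() * 26))
  ).join("")
  return { id, name, email: `${name}.${id}@example.com` }
}

// Take the seed from the environment when replaying a failure, otherwise pick a new one.
const seed = Number(process.env.TEST_SEED) || Date.now() % 2147483647
console.log(`test data seed: ${seed} (rerun with TEST_SEED=${seed} to reproduce)`)

const rand = mulberry32(seed)
const users = Array.from({ length: 3 }, () => randomUser(rand))
console.log(users.map((u) => u.email))
```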

Test data pattern

const { randomUUID } = require("node:crypto")

// Factory pattern: create test data with sensible defaults
function createTestUser(overrides = {}) {
  return {
    id: randomUUID(), // unique per call, so tests never share an id
    name: "Test User",
    email: "test@example.com",
    role: "user",
    isVerified: true,
    ...overrides, // allow overriding any field
  }
}

// Usage
const admin = createTestUser({ role: "admin" })
const unverified = createTestUser({ isVerified: false })

Mocking

When to mock

External services (APIs, payment providers, email)
Time-dependent logic (use a clock mock)
Randomness (seed the random generator)
Expensive operations (file I/O, network calls)

When NOT to mock

Your own code (test the real implementation)
The database (use a test database instead)
Simple utility functions

Mocking rules

Mock at the boundary — mock the external interface, not internal details
Verify mock interactions sparingly. Prefer asserting on outputs and resulting state over asserting on how the mock was called.
Keep mocks simple — if your mock is complex, the design might need rethinking
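
A minimal sketch of mocking at the boundary: the service logic is real, and only the clock and the email provider are replaced with simple hand-written fakes. `SignupService`, `createFakeMailer`, and `fixedClock` are hypothetical names for this example, which runs under plain Node without a mocking library.

```javascript
const { strictEqual, deepStrictEqual } = require("node:assert/strict")

// Hypothetical service under test. The clock and mailer are injected because
// they are boundaries (time, external email provider); the logic itself is real.
class SignupService {
  constructor({ mailer, clock }) {
    this.mailer = mailer
    this.clock = clock
  }
  signUp(email) {
    const user = { email, createdAt: this.clock.now().toISOString() }
    this.mailer.send({ to: email, subject: "Welcome!" })
    return user
  }
}

// Simple fakes: a fixed clock and a mailer that records what it was asked to send.
const fixedClock = { now: () => new Date("2024-01-15T09:30:00.000Z") }
function createFakeMailer() {
  const sent = []
  return { sent, send: (message) => sent.push(message) }
}

function stampsCreationTimeFromTheClock() {
  const service = new SignupService({ mailer: createFakeMailer(), clock: fixedClock })
  const user = service.signUp("alice@example.com")
  // Assert on the output, not on how the clock was called.
  strictEqual(user.createdAt, "2024-01-15T09:30:00.000Z")
}

function sendsWelcomeEmailAfterSignup() {
  const mailer = createFakeMailer()
  const service = new SignupService({ mailer, clock: fixedClock })
  service.signUp("alice@example.com")
  // The email is the behavior under test, so checking the boundary is justified here.
  deepStrictEqual(mailer.sent, [{ to: "alice@example.com", subject: "Welcome!" }])
}

stampsCreationTimeFromTheClock()
sendsWelcomeEmailAfterSignup()
```

Both fakes fit in a few lines. If a fake starts needing branching logic of its own, that is usually a sign the boundary is too wide.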

Common Anti-Patterns

Anti-pattern	Problem	Fix
Testing implementation details	Tests break on refactor	Test behavior and outputs instead
Shared mutable test state	Tests affect each other	Reset state before each test
Sleep/delay in tests	Slow, flaky	Use proper async waiting/polling
Ignored/skipped tests	Dead weight that rots	Fix or delete them immediately
Asserting on error messages	Fragile to wording changes	Assert on error codes/types
One giant test file	Hard to navigate	One test file per source file
Testing private methods	Couples to implementation	Test through the public API
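
The "Asserting on error messages" row can be sketched as follows. `ValidationError` carrying a `code` field and the `validateEmail` function are assumptions for illustration; the point is that the test checks the error type and a stable code rather than the human-readable wording.

```javascript
const { throws } = require("node:assert/strict")

// Hypothetical error type with a stable machine-readable code.
class ValidationError extends Error {
  constructor(code, message) {
    super(message)
    this.name = "ValidationError"
    this.code = code
  }
}

// Hypothetical validator, deliberately simplistic.
function validateEmail(email) {
  if (typeof email !== "string" || !email.includes("@")) {
    throw new ValidationError("EMAIL_INVALID", `"${email}" is not a valid email address`)
  }
  return email
}

// Fragile: breaks whenever the wording changes.
// throws(() => validateEmail("nope"), { message: '"nope" is not a valid email address' })

// Sturdier: checks the error type and code, which callers actually depend on.
throws(
  () => validateEmail("nope"),
  (err) => err instanceof ValidationError && err.code === "EMAIL_INVALID"
)
```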

Coverage

Don't chase 100% coverage — it's expensive and gives false confidence
Target: 80%+ for service/business logic, 60%+ overall
Coverage is a floor, not a ceiling — meaningful tests matter more than the number
Uncovered code should be intentional — add a comment if you deliberately skip testing something

PR Checklist for Tests

New logic has corresponding unit tests
Tests follow AAA structure (Arrange-Act-Assert)
Test names read like sentences
Edge cases covered (empty, null, boundary, error)
No sleep/fixed delays — uses proper async waiting
No shared mutable state between tests
Test data uses factories, not raw constructors
Mocks are at the boundary only
All tests pass locally (./scripts/dev/test)