testing by scientiacapital/skills

This skill emphasizes writing tests that provide confidence without becoming maintenance burdens. Tests should be fast, reliable, and focused on behavior rather than implementation details.

<quick_start> TDD Red-Green-Refactor cycle:

RED: Write a failing test first

test('adds numbers', () => {
  expect(add(1, 2)).toBe(3);  // Fails - add() doesn't exist
});

GREEN: Write minimum code to pass

const add = (a, b) => a + b;  // Test passes

REFACTOR: Clean up while tests stay green

Test pyramid: roughly 70% unit, 25% integration, 5% E2E </quick_start>

<success_criteria> Testing is successful when:

TDD cycle followed: test written before implementation code
Test pyramid balanced: ~70% unit, ~25% integration, ~5% E2E
Tests are independent and can run in any order
No flaky tests (run 3x to verify reliability)
Coverage meets targets: 70-80% lines, 100% critical paths
Test names describe behavior (what + when + expected result)
Mocks only used for external dependencies, not own code </success_criteria>
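
The "run 3x" check can be scripted. A minimal sketch, assuming a hypothetical helper `runRepeatedly` (not part of vitest or jest) that runs a synchronous test body several times and counts failures:

```typescript
// Hypothetical helper: run a test body `times` times and count how often it throws.
function runRepeatedly(body: () => void, times: number = 3): { passed: number; failed: number } {
  let passed = 0;
  let failed = 0;
  for (let i = 0; i < times; i++) {
    try {
      body();
      passed++;
    } catch {
      failed++;
    }
  }
  return { passed, failed };
}

// A deterministic body passes on every run.
const stable = runRepeatedly(() => {
  if (1 + 2 !== 3) throw new Error("unexpected sum");
});

// A body that depends on shared mutable state fails on some runs: that is flakiness.
let sharedCounter = 0;
const flaky = runRepeatedly(() => {
  sharedCounter++;
  if (sharedCounter % 2 === 0) throw new Error("order-dependent failure");
});
```

A mix of passes and failures for the same body, as in `flaky`, usually points to shared state, timing, or ordering rather than to the code under test.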

<core_principles>

The Testing Mindset

Tests are documentation - A failing test is a specification that hasn't been implemented
Test behavior, not implementation - Tests should survive refactoring
Fast feedback loops - Unit tests run in milliseconds, not seconds
Isolation by default - Each test should be independent
Arrange-Act-Assert - Clear structure in every test </core_principles>
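
Isolation and Arrange-Act-Assert can be sketched together. The example assumes a hypothetical in-memory `TodoStore`; the point is only that each test builds its own state through a factory, so execution order cannot matter.

```typescript
// Hypothetical in-memory store used only for illustration.
interface TodoStore {
  items: string[];
}

// Factory: every test builds its own store instead of sharing a module-level one.
function createStore(): TodoStore {
  return { items: [] };
}

function addTodo(store: TodoStore, title: string): number {
  store.items.push(title);
  return store.items.length;
}

// Two "tests", written as plain functions so the sketch runs without a framework.
function testAddsFirstTodo(): boolean {
  const store = createStore();                  // Arrange
  const count = addTodo(store, "write test");   // Act
  return count === 1;                           // Assert
}

function testStartsEmpty(): boolean {
  const store = createStore();                  // Arrange
  return store.items.length === 0;              // Assert (no action needed)
}

// Both orders give the same results because nothing is shared.
const forwardOrder = [testAddsFirstTodo(), testStartsEmpty()];
const reverseOrder = [testStartsEmpty(), testAddsFirstTodo()];
```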

<tdd_workflow>

TDD: Red-Green-Refactor

┌─────────────────────────────────────────────────────────┐
│                    TDD CYCLE                             │
│                                                          │
│    ┌─────────┐                                          │
│    │   RED   │ ◄─── Write a failing test                │
│    └────┬────┘                                          │
│         │                                                │
│         ▼                                                │
│    ┌─────────┐                                          │
│    │  GREEN  │ ◄─── Write minimum code to pass          │
│    └────┬────┘                                          │
│         │                                                │
│         ▼                                                │
│    ┌─────────┐                                          │
│    │REFACTOR │ ◄─── Clean up while tests stay green     │
│    └────┬────┘                                          │
│         │                                                │
│         └──────────────► Back to RED                    │
└─────────────────────────────────────────────────────────┘

The Rules

Write a failing test first - Never write production code without a failing test
Write only enough of a test to fail - Compilation failures count as failures
Write only enough code to pass - No more, no less
Refactor only when green - Never refactor with failing tests
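
One pass through the rules, as a sketch. `slugify` and its spec are hypothetical, and the spec is a plain function so the example runs without a test framework.

```typescript
// RED: the specification comes first. Until an implementation is written,
// there is nothing to pass in and the spec cannot succeed.
function slugifySpec(slugify: (title: string) => string): boolean {
  return slugify("Hello World") === "hello-world"
    && slugify("  Trim me  ") === "trim-me";
}

// GREEN: the first version that satisfies the spec, with no attention to style.
function slugifyGreen(title: string): string {
  let t = title.trim();
  t = t.toLowerCase();
  t = t.split(" ").join("-");
  return t;
}

// REFACTOR: same behavior, tidier code. The spec is rerun unchanged.
function slugifyRefactored(title: string): string {
  return title.trim().toLowerCase().replace(/ /g, "-");
}

const greenPasses = slugifySpec(slugifyGreen);
const refactorPasses = slugifySpec(slugifyRefactored);
```

The spec never changes between GREEN and REFACTOR; that is what makes the cleanup safe.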

Common TDD Mistakes

Mistake	Why It's Wrong	Instead
Writing tests after code	Tests tend to confirm what the code already does	Red-Green-Refactor
Testing private methods	Tests implementation, not behavior	Test public interface
Big leaps in test complexity	Hard to debug failures	Baby steps
Skipping refactor step	Technical debt accumulates	Always clean up
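
The "test public interface" row, as a sketch. `PasswordPolicy` is a hypothetical class; the checks go through `isValid()` only, so the private helpers can be renamed or merged without breaking them.

```typescript
// Hypothetical class for illustration.
class PasswordPolicy {
  // Internal helpers: free to change during refactoring.
  private hasDigit(value: string): boolean {
    return /\d/.test(value);
  }

  private isLongEnough(value: string): boolean {
    return value.length >= 8;
  }

  // Public interface: the behavior callers rely on.
  isValid(value: string): boolean {
    return this.isLongEnough(value) && this.hasDigit(value);
  }
}

const policy = new PasswordPolicy();

// Behavior-level checks: one accepted input and one input per rejection reason.
const acceptsStrong = policy.isValid("correct1horse"); // long enough, has a digit
const rejectsShort = policy.isValid("abc1");           // has a digit, too short
const rejectsNoDigit = policy.isValid("correcthorse"); // long enough, no digit
```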
</tdd_workflow>

<test_pyramid>

The Test Pyramid

                    ┌───────────┐
                    │    E2E    │  Few, slow, expensive
                    │   Tests   │  (minutes)
                    └─────┬─────┘
                          │
               ┌──────────┴──────────┐
               │   Integration Tests  │  Some, medium speed
               │   (API, Database)    │  (seconds)
               └──────────┬───────────┘
                          │
        ┌─────────────────┴─────────────────┐
        │          Unit Tests                │  Many, fast, cheap
        │    (Functions, Components)         │  (milliseconds)
        └────────────────────────────────────┘

Distribution Guidelines

Type	Percentage	Speed	Scope
Unit	70-80%	<10ms each	Single function/component
Integration	15-25%	<1s each	Multiple components, DB
E2E	5-10%	<30s each	Full user flows
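
To check a suite against these ranges, the shares can be computed from test counts. A minimal sketch with hypothetical names and counts; it assumes at least one test exists, since an empty suite would divide by zero.

```typescript
// Hypothetical counts; in practice they would come from the test runner's report.
interface SuiteCounts {
  unit: number;
  integration: number;
  e2e: number;
}

// Convert raw counts to whole-number percentages of the suite.
function pyramidShares(counts: SuiteCounts): SuiteCounts {
  const total = counts.unit + counts.integration + counts.e2e;
  const pct = (n: number): number => Math.round((n / total) * 100);
  return { unit: pct(counts.unit), integration: pct(counts.integration), e2e: pct(counts.e2e) };
}

// True when each share falls inside the ranges from the table above.
function matchesGuidelines(shares: SuiteCounts): boolean {
  return shares.unit >= 70 && shares.unit <= 80
    && shares.integration >= 15 && shares.integration <= 25
    && shares.e2e >= 5 && shares.e2e <= 10;
}

// 150 unit, 40 integration, 10 E2E out of 200 tests.
const shares = pyramidShares({ unit: 150, integration: 40, e2e: 10 });
const balanced = matchesGuidelines(shares);
```

The ranges are guidelines, so a result of `false` is a prompt to look at the suite, not a failure in itself.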

What to Test Where

Unit Tests:

Pure functions
Business logic
Data transformations
Validation rules
Component rendering

Integration Tests:

API endpoints
Database operations
Service interactions
Component integration

E2E Tests:

Critical user flows (login, checkout)
Happy paths only
Smoke tests </test_pyramid>

<when_to_mock>

Mocking Strategy

The London vs Detroit Schools

London School (Mockist):

Mock all dependencies
Test in complete isolation
Tests are very focused

Detroit School (Classicist):

Only mock external services
Test natural units together
Tests are more realistic

Recommended: Pragmatic approach

Mock external services (APIs, DBs in unit tests)
Don't mock your own code unless necessary
Use real implementations in integration tests

What to Mock

Mock	Don't Mock
External APIs	Your own pure functions
File system (in unit tests)	Data transformations
Network requests	Business logic
Time/randomness	In-memory data structures
Expensive computations	Simple utilities
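
Time is the row that most often needs explanation. One option, sketched here with hypothetical names, is to pass the clock in as a parameter so a test can pin it to a fixed instant:

```typescript
// A clock is just a function that returns the current time.
type Clock = () => Date;

// Production code takes the clock as a parameter and defaults to the real one.
function isExpired(expiresAt: Date, now: Clock = () => new Date()): boolean {
  return now().getTime() >= expiresAt.getTime();
}

// In a test, a fixed clock makes the result deterministic.
const fixedClock: Clock = () => new Date("2024-01-15T12:00:00Z");

const expiredYesterday = isExpired(new Date("2024-01-14T12:00:00Z"), fixedClock);
const expiresTomorrow = isExpired(new Date("2024-01-16T12:00:00Z"), fixedClock);
```

Runner-level fake timers (for example `vi.useFakeTimers()` in vitest) are an alternative when the code under test cannot accept a clock parameter.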

Mocking Patterns

// GOOD: Mock external dependency (fetch resolves to a Response-like object)
const mockFetch = vi.fn().mockResolvedValue({
  ok: true,
  json: async () => ({ data: [] }),
});

// BAD: Mocking your own utilities
const mockFormatDate = vi.fn(); // Don't do this

// GOOD: Dependency injection for testability
function createService(httpClient = fetch) {
  return {
    getData: () => httpClient('/api/data')
  };
}

// In test:
const mockClient = vi.fn();
const service = createService(mockClient);
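
What a test can then assert with the injected client, sketched without a mocking library. `HttpClient` and `createFakeClient` are assumptions made for the example; `HttpClient` is deliberately narrower than the real `fetch` type.

```typescript
// Minimal shape of the HTTP dependency (an assumption; real fetch has a richer type).
type HttpClient = (url: string) => Promise<unknown>;

function createService(httpClient: HttpClient) {
  return {
    getData: () => httpClient("/api/data"),
  };
}

// Hand-written fake: records each requested URL and returns a canned payload.
function createFakeClient(payload: unknown): { client: HttpClient; calls: string[] } {
  const calls: string[] = [];
  const client: HttpClient = async (url) => {
    calls.push(url);
    return payload;
  };
  return { client, calls };
}

// Arrange, act, and hand back what the test wants to assert on.
async function demo(): Promise<{ calls: string[]; result: unknown }> {
  const fake = createFakeClient({ data: [] });
  const service = createService(fake.client);
  const result = await service.getData();
  return { calls: fake.calls, result };
}
```

With `vi.fn()` in place of the hand-written fake, the same checks are usually expressed as `expect(mockClient).toHaveBeenCalledWith('/api/data')`.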

</when_to_mock>

<test_structure>

Test Organization

File Naming

src/
├── components/
│   ├── Button.tsx
│   └── Button.test.tsx      # Colocated test
├── utils/
│   ├── format.ts
│   └── format.test.ts
└── __tests__/               # Or separate folder
    └── integration/
        └── api.test.ts

Test Naming

// Pattern: describe what + when + expected result
describe('UserService', () => {
  describe('createUser', () => {
    it('returns user object when given valid email', () => {});
    it('throws ValidationError when email is invalid', () => {});
    it('sends welcome email after successful creation', () => {});
  });
});

// Alternative: BDD style
describe('UserService', () => {
  describe('when creating a user with valid data', () => {
    it('should return the created user', () => {});
    it('should send a welcome email', () => {});
  });

  describe('when email is invalid', () => {
    it('should throw ValidationError', () => {});
  });
});

Arrange-Act-Assert

it('calculates total with discount', () => {
  // Arrange - set up test data
  const cart = createCart([
    { price: 100, quantity: 2 },
    { price: 50, quantity: 1 }
  ]);
  const discount = 0.1;

  // Act - perform the action
  const total = calculateTotal(cart, discount);

  // Assert - verify result
  expect(total).toBe(225); // (200 + 50) * 0.9
});
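
The test above assumes `createCart` and `calculateTotal` exist. One possible shape for them, as a sketch; the types and the meaning of `discount` (a fraction, so 0.1 is 10% off) are assumptions, not part of the original example.

```typescript
interface LineItem {
  price: number;
  quantity: number;
}

interface Cart {
  items: LineItem[];
}

function createCart(items: LineItem[]): Cart {
  return { items: [...items] }; // copy the array so the caller's list is not shared
}

// `discount` is a fraction, e.g. 0.1 for 10% off (an assumption about the API).
function calculateTotal(cart: Cart, discount: number): number {
  const subtotal = cart.items.reduce((sum, item) => sum + item.price * item.quantity, 0);
  return subtotal * (1 - discount);
}

// Same inputs as the Arrange-Act-Assert example: (200 + 50) * 0.9
const total = calculateTotal(
  createCart([{ price: 100, quantity: 2 }, { price: 50, quantity: 1 }]),
  0.1,
);
```

For prices that are not round numbers, `toBeCloseTo` is usually a safer matcher than `toBe`, because floating-point arithmetic may not produce the exact decimal value.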

</test_structure>

<what_not_to_test>

What NOT to Test

Skip These

Framework code - React's useState, Express routing
Third-party libraries - They have their own tests
Trivial getters/setters - No logic = no test needed
Implementation details - Private methods, internal state
One-line functions - Unless they have complex logic

Focus On

Business logic - Where bugs hide
Edge cases - Nulls, empty arrays, boundaries
Error paths - What happens when things fail
User-facing behavior - What users actually do
Regressions - Bugs that were fixed once and could come back
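
Edge cases and error paths, sketched on a hypothetical `parsePort` function. `throwsError` is a small stand-in for a framework matcher such as `toThrow`.

```typescript
// Hypothetical function under test.
function parsePort(raw: string): number {
  const port = Number(raw);
  if (raw.trim() === "" || !Number.isInteger(port)) {
    throw new Error(`not an integer: "${raw}"`);
  }
  if (port < 1 || port > 65535) {
    throw new RangeError(`out of range: ${port}`);
  }
  return port;
}

// Small helper so error paths can be checked without a test framework.
function throwsError(body: () => unknown): boolean {
  try {
    body();
    return false;
  } catch {
    return true;
  }
}

// Boundaries: the smallest and largest valid values, and one step outside each.
const lowestValid = parsePort("1");
const highestValid = parsePort("65535");
const rejectsZero = throwsError(() => parsePort("0"));
const rejectsTooHigh = throwsError(() => parsePort("65536"));

// Empty and malformed input.
const rejectsEmpty = throwsError(() => parsePort(""));
const rejectsText = throwsError(() => parsePort("http"));
```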

Coverage Targets

Metric	Target	Notes
Line coverage	70-80%	Higher isn't always better
Branch coverage	70-80%	More important than lines
Critical paths	100%	Auth, payments, data mutations

Warning: 100% coverage doesn't mean good tests. Bad tests can hit every line without testing anything meaningful. </what_not_to_test>
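
A small illustration of the warning, using a hypothetical `shippingCost` function. The first call executes every line yet checks nothing and skips one branch; the later calls cover both branches and produce values a test can assert on.

```typescript
// Hypothetical function with one `if` and no `else`.
function shippingCost(isMember: boolean): number {
  let cost = 10;
  if (isMember) {
    cost = 0;
  }
  return cost;
}

// Executes every line (100% line coverage) but never takes the path where
// `isMember` is false, and the result is discarded without any assertion.
shippingCost(true);

// Covering both branches and checking the results is what gives confidence.
const memberCost = shippingCost(true);
const guestCost = shippingCost(false);
```

This is also why the table rates branch coverage above line coverage: the missing `false` path is invisible in a line report.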

Topic	Reference File	When to Load
Unit testing patterns	`reference/unit-testing.md`	Writing unit tests, mocking
Integration testing	`reference/integration-testing.md`	API tests, database tests
Test organization	`reference/test-organization.md`	Structuring test suites
Coverage strategies	`reference/coverage-strategies.md`	Setting coverage goals

To load: Ask for the specific topic or check if context suggests it.

<framework_patterns>

Quick Reference by Framework

pytest (Python)

import pytest

# Fixtures (User is assumed to be defined in the code under test)
@pytest.fixture
def user():
    return User(name="test")

def test_user_greet(user):
    assert user.greet() == "Hello, test"

# Parametrize (uppercase is assumed to be defined in the code under test)
@pytest.mark.parametrize("text,expected", [
    ("hello", "HELLO"),
    ("world", "WORLD"),
])
def test_uppercase(text, expected):
    assert uppercase(text) == expected

vitest/jest (TypeScript)

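// Basic test (with Jest, test and expect are globals and jest replaces vi)
import { test, expect, vi } from 'vitest';

test('adds numbers', () => {
  expect(add(1, 2)).toBe(3);
});

// Mock a module (vi.mock calls are hoisted to the top of the file)
vi.mock('./api', () => ({
  fetchUser: vi.fn().mockResolvedValue({ name: 'test' })
}));

// Component test (toHaveTextContent requires @testing-library/jest-dom to be set up)
import { render, screen } from '@testing-library/react';

test('renders button', () => {
  render(<Button>Click</Button>);
  expect(screen.getByRole('button')).toHaveTextContent('Click');
});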
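
The table-driven style shown for pytest carries over to TypeScript. A framework-free sketch, with `uppercase` as a hypothetical function under test:

```typescript
// Hypothetical function under test.
function uppercase(text: string): string {
  return text.toUpperCase();
}

// One row per case: [input, expected], mirroring the pytest parametrize table.
const cases: Array<[string, string]> = [
  ["hello", "HELLO"],
  ["world", "WORLD"],
  ["", ""],
];

// Collect mismatches instead of stopping at the first, so every failing row is visible.
const mismatches = cases.filter(([input, expected]) => uppercase(input) !== expected);
```

In vitest or jest the same rows would typically be passed to `test.each`, which reports each row as its own test.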

Testing Library Principles

Query by role, not test ID
Test what users see, not implementation
Prefer userEvent over fireEvent
Avoid testing internal state </framework_patterns>

Before marking code complete:

Emit Outcome Sidecar

As the final step, write to ~/.claude/skill-analytics/last-outcome-testing.json:

{"ts":"[UTC ISO8601]","skill":"testing","version":"1.0.0","variant":"default",
 "status":"[success|partial|error]","runtime_ms":[estimated ms from start],
 "metrics":{"tests_written":[n],"coverage_delta_pct":[n]},
 "error":null,"session_id":"[YYYY-MM-DD]"}

Use status "partial" if some stages failed but results were produced. Use "error" only if no output was generated.
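
A sketch of how the record could be assembled before it is written. `buildOutcome` and its parameters are hypothetical; the field names and fixed values come from the template above, and treating `error` as an optional message string is an assumption.

```typescript
type OutcomeStatus = "success" | "partial" | "error";

interface Outcome {
  ts: string;
  skill: string;
  version: string;
  variant: string;
  status: OutcomeStatus;
  runtime_ms: number;
  metrics: { tests_written: number; coverage_delta_pct: number };
  error: string | null;
  session_id: string;
}

// Hypothetical helper: fills in the template fields from a few inputs.
function buildOutcome(
  status: OutcomeStatus,
  startedAt: Date,
  finishedAt: Date,
  testsWritten: number,
  coverageDeltaPct: number,
  error: string | null = null,
): Outcome {
  const ts = finishedAt.toISOString(); // UTC ISO 8601
  return {
    ts,
    skill: "testing",
    version: "1.0.0",
    variant: "default",
    status,
    runtime_ms: finishedAt.getTime() - startedAt.getTime(),
    metrics: { tests_written: testsWritten, coverage_delta_pct: coverageDeltaPct },
    error,
    session_id: ts.slice(0, 10), // YYYY-MM-DD
  };
}

const outcome = buildOutcome(
  "success",
  new Date("2024-01-15T12:00:00Z"),
  new Date("2024-01-15T12:00:42Z"),
  5,
  3.5,
);
const serialized = JSON.stringify(outcome);
```

`serialized` is the string that would be written to the sidecar file; the file write itself is left out of the sketch.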