testing-strategies
SKILL.md
Testing Strategies Skill
Test the behavior, not the implementation. Test the boundaries, not the happy path.
Testing Pyramid
| Level | Volume | Speed | Cost to Maintain | What It Catches |
|---|---|---|---|---|
| Unit | Many (70%) | < 10ms each | Low | Logic errors, edge cases, regressions |
| Integration | Some (20%) | < 1s each | Medium | Wiring bugs, API contracts, data flow |
| E2E | Few (10%) | 5-30s each | High | User journey failures, deployment issues |
Anti-pattern: Inverted pyramid (too many E2E, few unit) → slow CI, flaky tests, hard to debug. Anti-pattern: Ice cream cone (manual testing on top of everything) → doesn't scale.
Unit Test Pattern (AAA)
test('should calculate discount when order exceeds $100', () => {
// Arrange
const order = createOrder({ subtotal: 150, customerTier: 'gold' });
// Act
const discount = calculateDiscount(order);
// Assert
expect(discount).toBe(15); // 10% for gold tier
});
Naming convention: should [expected behavior] when [condition] — reads as a specification.
Test Types Beyond the Pyramid
| Type | Purpose | When to Use | Example |
|---|---|---|---|
| Snapshot | Detect unexpected output changes | UI components, serialized data | expect(render(<Button/>)).toMatchSnapshot() |
| Contract | Verify API shape between services | Microservices, public APIs | Pact, OpenAPI validation |
| Property-based | Find edge cases humans miss | Pure functions, parsers, serializers | fc.assert(fc.property(fc.string(), s => decode(encode(s)) === s)) |
| Mutation | Verify tests actually catch bugs | Critical business logic | Stryker, pitest |
| Performance | Catch regressions in speed/memory | Hot paths, API endpoints | Benchmark before/after |
| Smoke | Verify deployment didn't break basics | Post-deploy, staging | Hit health endpoint + key pages |
What to Mock (and What Not To)
| Mock This | Why | Don't Mock This | Why |
|---|---|---|---|
| External HTTP APIs | Unreliable, slow, costly | Your own business logic | You'd be testing your mocks |
| Database in unit tests | Slow, stateful | Database in integration tests | That's the whole point |
Time (Date.now) |
Non-deterministic | Pure functions | Already deterministic |
| File system | Side effects | In-memory equivalents | Faster than mocking |
| Random/UUID | Non-deterministic | Framework internals | Not your responsibility |
Coverage Philosophy
| Range | Interpretation | Action |
|---|---|---|
| < 50% | Probably missing critical paths | Increase |
| 50-70% | Reasonable for most projects | Focus on changed code |
| 70-85% | Good, diminishing returns starting | Maintain, don't chase |
| 85-100% | Often wasteful unless safety-critical | Review if the effort is worth it |
The real metric: Coverage of changed code in each PR, not overall percentage.
What coverage doesn't tell you: That your tests assert the right things. 100% coverage with no assertions is useless.
What NOT to Test
| Don't Test | Why | Instead |
|---|---|---|
| Third-party library internals | Not your code | Trust it (or pick a different library) |
| Framework behavior | Already tested upstream | Test your code that uses the framework |
| Private implementation details | Breaks on refactor | Test the public interface |
| Trivial getters/setters | No logic to break | Only if they have side effects |
| Generated code | Changes on regeneration | Test the generator, not the output |
Test Quality Signals
| Good Test | Bad Test |
|---|---|
| Fails when behavior breaks | Fails when implementation changes |
| One clear reason to fail | Multiple assertions testing different things |
| Self-contained | Depends on other test order |
| Fast (< 100ms unit) | Slow due to unnecessary setup |
| Readable as documentation | Requires reading source to understand |
| Deterministic | Flaky (passes sometimes) |
TDD Cycle
| Step | Action | Common Mistake |
|---|---|---|
| Red | Write a failing test | Writing too much test (test the next small behavior) |
| Green | Make it pass with minimal code | Over-engineering the solution |
| Refactor | Clean up while green | Skipping this step (accumulates debt) |
TDD is not always the right choice: It works best for well-understood requirements. For exploratory code, write tests after the design stabilizes.
Flaky Test Triage
| Pattern | Likely Cause | Fix |
|---|---|---|
| Fails 1 in 10 runs | Timing/race condition | Add proper waits, remove shared state |
| Fails only in CI | Environment difference | Pin versions, use containers |
| Fails after another test | Test pollution | Isolate setup/teardown |
| Fails on slow machines | Hardcoded timeouts | Use retry with backoff or event-based waits |
Synapses
See synapses.json for connections.
Weekly Installs
1
Repository
fabioc-aloha/lithiumFirst Seen
7 days ago
Security Audits
Installed on
zencoder1
amp1
cline1
openclaw1
opencode1
cursor1