flightplanner

SKILL.md

Flightplanner Skill

You are an expert at writing, maintaining, and reasoning about end-to-end (E2E) tests. You follow spec-driven testing practices where E2E_TESTS.md files are the single source of truth, and test code is generated and maintained from those specifications.

Core Principles

1. Specs Are the Source of Truth

All E2E test behavior is defined in E2E_TESTS.md specification files. Tests are generated from specs, not the other way around. When specs and tests disagree, the spec wins.

  • Root-level docs/E2E_TESTS.md or E2E_TESTS.md defines project-wide testing philosophy
  • Package-level E2E_TESTS.md files define specific test cases
  • Never modify specs to match broken tests — fix the tests

2. Complete Test Isolation

Every test must be independent. No shared state, no ordering dependencies.

  • Each test gets its own temporary directory
  • Environment variables are saved and restored
  • Git repositories are created fresh per test
  • Background processes are terminated in cleanup
  • See: reference/isolation.md

3. Resilient Cleanup

Cleanup failures must never fail tests. Use best-effort cleanup with retries.

  • Always use safeCleanup() — never raw recursive delete
  • Clean up in reverse creation order
  • Restore process state (CWD, env vars) before removing files
  • See: reference/cleanup.md

4. Mock Only at System Boundaries

Prefer real implementations. Mock only external, slow, expensive, or non-deterministic dependencies.

  • Use real file systems and git repositories
  • Mock external CLI tools via PATH injection (not framework mocking)
  • Use conditional skip for tests requiring real external services
  • See: reference/mocking.md

5. Local Tests Must Always Be Runnable

The default E2E test suite must be fully self-contained and runnable without access to any remote or live services. Tests that depend on remote services (external APIs, live backends, cloud infrastructure, real AI agents) must be skippable so that the completely local test suite can be run at all times — in CI, offline, and during development. Remote-dependent tests are opt-in, never opt-out.

  • Prefer the test framework's native filtering or tagging mechanism (e.g., tags, groups, categories) to separate local from remote-dependent tests
  • If the framework lacks native filtering, use environment variables to control skipping — and those variables must be documented in CONTRIBUTING.md or equivalent project contributor documentation
  • See: reference/mocking.md

6. Setup-Execute-Verify

Every test follows three phases:

Setup   → prepare the specific state for this test
Execute → perform the single action under test
Verify  → assert the expected outcomes

7. Autogenerated Tests

Test files include headers/footers indicating they are autogenerated. Manual modifications are overwritten on regeneration. To change tests, update the spec.

8. Execute Before Trusting

Never assume generated test code works until it has been executed. Every test generation or modification must be followed by actually running the tests. If a test passes but the underlying feature is broken, the test is wrong. When feasible, also exercise the code under test directly (run the CLI, curl the API, open the UI) to verify behavior beyond what automated tests cover.

9. Run Tests First

Before modifying any test code, run the existing test suite to establish a known baseline. This reveals pre-existing failures, confirms which tests currently pass, and prevents conflating new breakage with old. If existing tests fail, note them so they are not confused with regressions introduced by your changes.

Spec Format Summary

Each E2E_TESTS.md contains suites with this structure:

## <Suite Name>

### Preconditions
- Required setup (maps to per-test or per-suite setup hooks)

### Features

#### <Feature Name>
<!-- category: core|edge|error|side-effect|idempotency -->
- Assertion 1
- Assertion 2

### Postconditions
- Verifiable end states

Feature Categories

Category Purpose
core Happy-path, primary functionality
edge Boundary conditions, unusual-but-valid inputs
error Failure modes, error handling
side-effect External interactions, hooks, notifications
idempotency Safe repetition of operations

Metadata Comments

<!-- category: core -->           Required: test category
<!-- skip: requires-real-agent --> Optional: generates skipped test
<!-- tags: slow, docker -->        Optional: arbitrary tags

Full format specification: reference/spec-format.md

Test Organization

File Naming

<feature>.e2e.test.<ext>

E2E tests MUST live in their own dedicated files, separate from unit tests, integration tests, or manually-written tests. This prevents merge conflicts between autogenerated E2E files and hand-maintained test files, and avoids accidental overwrites when fp-update regenerates E2E test code. See reference/organization.md for details.

Directory Layout

package/
├── src/commands/__tests__/
│   ├── e2e-utils.ts          # Shared helpers
│   ├── init.e2e.test.ts      # One file per suite
│   ├── task.e2e.test.ts
│   └── fixtures/             # Test data
├── E2E_TESTS.md              # Spec file
└── vitest.e2e.config.ts      # E2E runner config

Mapping: Spec → Test

Spec Test Construct
Suite (##) Suite/group block (e.g., describe() in vitest) + test file
Preconditions Per-test setup hook (e.g., beforeEach in vitest)
Feature (####) Individual test case (e.g., it() / test() in vitest)
Bullets Assertion statements (e.g., expect() / assert in vitest)
Postconditions Final assertions + per-test teardown hook (e.g., afterEach in vitest)

Full organization guide: reference/organization.md

Mock Strategy Summary

Decision order:

  1. Can I use the real thing? → Use it
  2. Can I use a local substitute? → Use it
  3. Is the external thing being tested? → Need real/high-fidelity
  4. Is the cost too high? → Mock it

PATH-based mocking for CLI tools:

createMockTool("docker", exitCode=0, output="Docker version 24.0.0")
env.PATH = mockBinDir + ":" + originalPath

Conditional skip for optional dependencies:

SKIP_REAL_AGENT = env.E2E_REAL_AGENT != "true"
suite.skipIf(SKIP_REAL_AGENT) "real agent tests":
  ...

Full mocking guide: reference/mocking.md

Commands

Command Description Modifies Code?
fp-init Bootstrap E2E specs for a project from release history and source analysis Yes
fp-audit Analyze spec-to-test coverage gaps No
fp-review-spec Validate spec completeness and format No
fp-generate Generate tests from spec (full suite) Yes
fp-add Add feature or suite to spec + generate tests Yes
fp-update Sync tests with current spec state Yes
fp-fix Fix failing tests (never modifies specs) Yes
fp-smoke-test Exercise the application directly to verify behavior beyond automated tests No
fp-add-spec Create new E2E_TESTS.md for a package Yes
fp-update-spec Update spec from git log / new features Yes

Workflow

Starting Fresh (no specs exist)

  1. Run fp-init to bootstrap E2E_TESTS.md files across the project from release history and source analysis
  2. Run fp-review-spec to validate completeness
  3. Run fp-generate to create test files

Adding Specs to a Single Package

  1. Run fp-add-spec to create E2E_TESTS.md by analyzing the package
  2. Run fp-review-spec to validate completeness
  3. Run fp-generate to create test files

Adding New Features

  1. Run fp-add with a description of the feature
  2. It detects whether to add to an existing suite or create a new one
  3. Updates the spec and generates/updates tests

Maintaining Tests

  1. Run fp-audit to check coverage
  2. Run fp-update to sync tests with spec changes
  3. Run fp-fix to repair failing tests

After Code Changes

  1. Run fp-update-spec to reflect new functionality in specs
  2. Run fp-update to regenerate tests from updated specs

Verifying Beyond Tests

Run fp-smoke-test to exercise the application directly and verify that features work end-to-end in a real environment, not just in isolated test cases.

Key Conventions

  • All examples use pseudocode — adapt to the project's actual language and test framework
  • Specs use HTML comments for metadata — machine-parseable, invisible when rendered
  • Tests are autogenerated — never hand-edit generated test files
  • Cleanup never fails tests — best-effort with retries
  • Real over mock — prefer real file systems, real git, real processes
  • Sequential execution — E2E tests run in a single fork to avoid resource conflicts

Reference Documents

  • reference/spec-format.md — Complete guide to E2E_TESTS.md format
  • reference/isolation.md — Test isolation and state leak patterns
  • reference/cleanup.md — Resilient cleanup and retry patterns
  • reference/mocking.md — Mock decision framework and patterns
  • reference/organization.md — File naming, structure, and spec-to-test mapping
  • reference/manual-verification.md — Manual verification patterns by application type
Weekly Installs
5
GitHub Stars
3
First Seen
4 days ago
Installed on
opencode5
gemini-cli5
github-copilot5
codex5
kimi-cli5
cursor5