test-environments
Discovery Questions
- How many environments exist today? Local dev, CI, staging, preview, production? Map what you have before designing what you need.
- Is the app containerized? Docker/Docker Compose in use? Check for Dockerfile, docker-compose.yml, or compose.yaml.
- How is test data seeded? Manual SQL scripts, migration-based, factory libraries, or snapshots from production?
- How close is staging to production? Same infrastructure (K8s, managed DB, CDN)? Same data shape? Same config?
- External dependencies: How many third-party APIs does the system call? Are they stubbed in non-production environments?
- Check .agents/qa-project-context.md first. Respect existing infrastructure decisions and constraints.
Core Principles
1. Staging must mirror production. If staging uses SQLite and production uses PostgreSQL, staging tests prove nothing. Match the database engine, the queue system, the cache layer, and the auth provider.
2. Ephemeral environments beat long-lived ones. A shared staging environment becomes a bottleneck where one broken deploy blocks the entire team. Per-PR preview environments provide isolation and parallel testing.
3. Deterministic seed data, not production copies. Production snapshots contain PII, stale references, and non-reproducible state. Build seed data from factories that generate consistent, valid, minimal datasets.
4. Stub external dependencies at the boundary, not deep inside. Third-party APIs are unreliable, rate-limited, and expensive. Stub them at the HTTP boundary using WireMock, MSW, or contract-verified fakes -- never by mocking internal service classes.
5. Environment config is code. Every environment difference (URLs, feature flags, credentials, resource limits) must be version-controlled and reviewable. No manual configuration that cannot be reproduced.
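Principle 3 can be illustrated with a small factory. This is a minimal sketch, not a prescribed implementation: the `SeedUser` shape, the `buildUsers` helper, and the fixed epoch are hypothetical, and mulberry32 is just one small seedable generator that could be swapped for any other.

```typescript
// Deterministic seed-data factory: the same seed always yields the same records.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical entity shape, for illustration only.
interface SeedUser {
  id: string;
  email: string;
  role: "admin" | "member";
  createdAt: string; // derived from a fixed epoch, never from Date.now()
}

const FIXED_EPOCH_MS = Date.UTC(2024, 0, 1);
const DAY_MS = 86400000;

function buildUsers(count: number, seed: number = 42): SeedUser[] {
  const rand = mulberry32(seed);
  const users: SeedUser[] = [];
  for (let i = 0; i < count; i++) {
    const offsetDays = Math.floor(rand() * 365);
    users.push({
      id: `user-${String(i + 1).padStart(4, "0")}`,
      email: `user${i + 1}@example.test`,
      role: i === 0 ? "admin" : "member", // always exactly one admin
      createdAt: new Date(FIXED_EPOCH_MS + offsetDays * DAY_MS).toISOString(),
    });
  }
  return users;
}
```

Because nothing depends on wall-clock time or `Math.random()`, two runs with the same seed produce byte-identical datasets, which keeps snapshot assertions and failure reproduction stable.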
Environment Strategy
Environment Tiers
| Environment | Purpose | Data | External Deps | Lifecycle |
|---|---|---|---|---|
| Local dev | Fast inner loop | Seeded fixtures, minimal | Stubbed (MSW/WireMock) | Developer-managed |
| CI | Automated validation | Seeded per-run, ephemeral | Stubbed or containerized | Created/destroyed per pipeline |
| Preview | PR-level review & E2E | Seeded from factories | Stubbed or sandbox | Created on PR, destroyed on merge |
| Staging | Pre-production validation | Anonymized production-like | Real integrations (sandbox accounts) | Long-lived, regularly reset |
| Production | Live users | Real | Real | Permanent |
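One way to keep the tier table reviewable is to hold it as typed data and check it in CI. The sketch below is an assumption about how that might look: `TIERS`, `TierConfig`, and `tierViolations` are hypothetical names, and where the table allows two options (for example "stubbed or sandbox") a single value is picked for illustration.

```typescript
type Tier = "local" | "ci" | "preview" | "staging" | "production";

interface TierConfig {
  data: "seeded" | "anonymized" | "real";
  externalDeps: "stubbed" | "sandbox" | "real";
  lifecycle: "developer-managed" | "per-pipeline" | "per-pr" | "long-lived" | "permanent";
}

// Mirrors the tier table above, expressed as version-controlled data.
const TIERS: Record<Tier, TierConfig> = {
  local: { data: "seeded", externalDeps: "stubbed", lifecycle: "developer-managed" },
  ci: { data: "seeded", externalDeps: "stubbed", lifecycle: "per-pipeline" },
  preview: { data: "seeded", externalDeps: "sandbox", lifecycle: "per-pr" },
  staging: { data: "anonymized", externalDeps: "sandbox", lifecycle: "long-lived" },
  production: { data: "real", externalDeps: "real", lifecycle: "permanent" },
};

// Only production may hold real data or call real third-party accounts.
function tierViolations(tiers: Record<Tier, TierConfig>): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(tiers) as Tier[]) {
    if (name === "production") continue;
    const tier = tiers[name];
    if (tier.data === "real") problems.push(`${name}: real data outside production`);
    if (tier.externalDeps === "real") problems.push(`${name}: real external dependencies outside production`);
  }
  return problems;
}
```

A CI step could fail the build whenever `tierViolations(TIERS)` returns a non-empty list, so a change that points staging at real data shows up in review instead of in an incident.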
Local Development
Fast feedback, zero shared state. Developers must be able to run the full stack locally in under 2 minutes.
# One-command local environment
docker compose -f docker-compose.test.yml up -d
npm run db:seed
npm run dev
Local environment uses Docker Compose for infrastructure deps (database, cache, message queue) but runs the application natively for fast reload. External APIs are stubbed with MSW handlers loaded automatically in dev mode.
CI Environment
Fully containerized, created fresh for every pipeline run, destroyed after. No shared state between runs.
# .github/workflows/test.yml (excerpt: nest this block under jobs.<job_id>)
services:
  postgres:
    image: postgres:16-alpine
    env:
      POSTGRES_DB: testdb
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
    ports: ['5432:5432']
    options: >-
      --health-cmd="pg_isready -U test"
      --health-interval=5s
      --health-timeout=3s
      --health-retries=5
  redis:
    image: redis:7-alpine
    ports: ['6379:6379']
    options: >-
      --health-cmd="redis-cli ping"
      --health-interval=5s
      --health-timeout=3s
      --health-retries=5
Preview Environments (Per-PR)
Each pull request gets its own isolated environment. Reviewers can click a link and test the exact changes without interfering with other PRs.
Vercel/Netlify (frontend):
# Automatic -- just connect the repo. Each PR gets a preview URL.
# Add E2E tests against the preview URL:
- name: Run E2E against preview
  env:
    BASE_URL: ${{ steps.deploy.outputs.preview-url }}
  run: npx playwright test --project=chromium
Custom preview with Docker and unique namespace:
- name: Deploy preview
  id: deploy  # required so later steps can read steps.deploy.outputs.preview-url
  if: github.event.action != 'closed'  # do not redeploy while tearing down
  run: |
    NAMESPACE="pr-${{ github.event.number }}"
    docker compose -f docker-compose.preview.yml \
      -p "$NAMESPACE" up -d
    echo "preview-url=https://${NAMESPACE}.preview.example.com" >> "$GITHUB_OUTPUT"
- name: Teardown preview
  if: github.event.action == 'closed'
  run: |
    NAMESPACE="pr-${{ github.event.number }}"
    docker compose -p "$NAMESPACE" down -v
Staging
Long-lived environment that mirrors production infrastructure. Reset weekly or on-demand to prevent drift.
#!/bin/bash
# Weekly staging reset (scheduled CI job)
set -euo pipefail
echo "Resetting staging database..."
psql "$STAGING_DATABASE_URL" -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
echo "Running migrations..."
npm run db:migrate -- --env staging
echo "Seeding anonymized data..."
npm run db:seed -- --env staging --dataset production-anonymized
echo "Verifying staging health..."
curl -sf https://staging.example.com/health || exit 1
echo "Staging reset complete."
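A single `curl` right after seeding can fail while the application is still restarting. The sketch below shows one way to poll instead; `waitForHealthy` and its parameters are hypothetical, and the probe is injected so the retry logic does not depend on a specific HTTP client.

```typescript
// Polls a health probe until it reports ready or the attempts run out.
async function waitForHealthy(
  probe: () => Promise<boolean>,
  maxAttempts: number = 10,
  delayMs: number = 3000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // A probe that throws (connection refused, DNS failure) counts as "not ready yet".
    const ok = await probe().catch(() => false);
    if (ok) return true;
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

On Node 18 or later the probe could be `() => fetch("https://staging.example.com/health").then((r) => r.ok)`, with the reset job exiting non-zero when `waitForHealthy` resolves to `false`.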
Docker Compose for Testing
A production-quality docker-compose.test.yml that spins up the full stack for integration and E2E tests.
# docker-compose.test.yml
name: app-test
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: test  # Multi-stage: use the test stage
    ports:
      - "3000:3000"
    environment:
      NODE_ENV: test
      DATABASE_URL: postgres://test:test@postgres:5432/testdb
      REDIS_URL: redis://redis:6379
      STRIPE_API_KEY: sk_test_fake  # Test-mode key, never real
      EMAIL_PROVIDER: stub          # Internal stub, no real emails
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      seed:
        condition: service_completed_successfully
    healthcheck:
      # curl must be installed in the image; Alpine-based Node images do not include it
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 5s
      timeout: 3s
      retries: 10
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: testdb
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
    ports:
      - "5432:5432"  # Published so tests running on the host can reach localhost:5432
    volumes:
      - postgres-test-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U test -d testdb"]
      interval: 3s
      timeout: 2s
      retries: 10
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"  # Published so tests running on the host can reach localhost:6379
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 3s
      timeout: 2s
      retries: 10
  seed:
    build:
      context: .
      dockerfile: Dockerfile
      target: seed
    environment:
      DATABASE_URL: postgres://test:test@postgres:5432/testdb
    depends_on:
      postgres:
        condition: service_healthy
    command: ["npm", "run", "db:seed"]
  mailhog:
    image: mailhog/mailhog:latest
    ports:
      - "8025:8025"  # Web UI for inspecting sent emails
      - "1025:1025"  # SMTP
volumes:
  postgres-test-data:
Running Tests Against Docker Compose
#!/bin/bash
# scripts/test-integration.sh
set -euo pipefail
COMPOSE_FILE="docker-compose.test.yml"
cleanup() {
  echo "Tearing down test environment..."
  docker compose -f "$COMPOSE_FILE" down -v --remove-orphans
}
trap cleanup EXIT
echo "Starting test infrastructure..."
docker compose -f "$COMPOSE_FILE" up -d --wait --wait-timeout 60
echo "Running integration tests..."
DATABASE_URL="postgres://test:test@localhost:5432/testdb" \
  REDIS_URL="redis://localhost:6379" \
  npx vitest run --project=integration
echo "Tests complete."
Multi-Stage Dockerfile for Test Environments
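# Dockerfile
FROM node:20-alpine AS base
WORKDIR /app
COPY package*.json ./
# Install devDependencies as well: the test and seed stages need them
RUN npm ci --include=dev
FROM base AS test
# curl is used by the Compose healthcheck and is not part of the Alpine base image
RUN apk add --no-cache curl
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
FROM base AS seed
COPY prisma/ ./prisma/
COPY scripts/seed.ts ./scripts/
COPY tsconfig.json ./
CMD ["npx", "tsx", "scripts/seed.ts"]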
External Dependency Management
Stubbing Strategy by Dependency Type
| Dependency Type | Local/CI Strategy | Staging Strategy |
|---|---|---|
| Payment (Stripe) | MSW handler returning mock responses | Stripe test mode with sk_test_ keys |
| Email (SendGrid) | MailHog/Mailpit capturing SMTP | SendGrid sandbox mode |
| Auth (Auth0) | Local JWT issuer with test keys | Auth0 dev tenant |
| Storage (S3) | MinIO container (S3-compatible) | Dedicated test bucket with lifecycle policy |
| Search (Elasticsearch) | Testcontainers Elasticsearch | Dedicated test index with reset script |
| SMS (Twilio) | MSW handler | Twilio test credentials |
MSW Handlers for External APIs
// test/mocks/handlers.ts
import { http, HttpResponse } from "msw";
export const handlers = [
  // Stripe: create payment intent (Stripe sends form-encoded request bodies)
  http.post("https://api.stripe.com/v1/payment_intents", async ({ request }) => {
    const body = await request.text();
    const params = new URLSearchParams(body);
    const amount = params.get("amount");
    return HttpResponse.json({
      id: "pi_test_" + Date.now(),
      amount: Number(amount),
      currency: params.get("currency") ?? "usd",
      status: "requires_payment_method",
      client_secret: "pi_test_secret_" + Date.now(),
    });
  }),
  // SendGrid: send email (the real API answers 202 Accepted with an empty body)
  http.post("https://api.sendgrid.com/v3/mail/send", () => {
    return new HttpResponse(null, { status: 202 });
  }),
  // Geocoding API: echo the requested address with fixed coordinates
  http.get("https://maps.googleapis.com/maps/api/geocode/json", ({ request }) => {
    const url = new URL(request.url);
    const address = url.searchParams.get("address");
    return HttpResponse.json({
      results: [{
        formatted_address: address,
        geometry: { location: { lat: 40.7128, lng: -74.006 } },
      }],
      status: "OK",
    });
  }),
];
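// test/mocks/setup.ts
import { setupServer } from "msw/node";
import { handlers } from "./handlers";
export const server = setupServer(...handlers);
// In vitest.setup.ts or jest.setup.ts (adjust the import path to your layout).
// With Vitest and globals disabled, also import beforeAll, afterEach, and afterAll from "vitest".
import { server } from "./test/mocks/setup";
beforeAll(() => server.listen({ onUnhandledRequest: "error" }));
afterEach(() => server.resetHandlers());
afterAll(() => server.close());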
Setting onUnhandledRequest: "error" ensures tests fail loudly if they hit an unmocked external API -- no silent network calls leaking into test runs.
MinIO as S3 Substitute
# In docker-compose.test.yml
  minio:
    image: minio/minio:latest
    ports:
      - "9000:9000"
      - "9001:9001"  # Console
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 5s
      timeout: 3s
      retries: 5
// Configure S3 client to point at MinIO in tests
import { S3Client } from "@aws-sdk/client-s3";
const s3 = new S3Client({
  endpoint: process.env.S3_ENDPOINT ?? "http://localhost:9000",
  region: "us-east-1",
  credentials: {
    accessKeyId: process.env.S3_ACCESS_KEY ?? "minioadmin",
    secretAccessKey: process.env.S3_SECRET_KEY ?? "minioadmin",
  },
  forcePathStyle: true, // Required for MinIO
});
Contract Testing as Stub Validation
Stubs drift from reality. Pair every stub with a contract test that verifies the stub matches the real API. For details, see contract-testing.
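The core idea can be shown without a contract-testing tool. The sketch below is only an illustration, not a substitute for the tooling covered in contract-testing: it compares the structure of a stub response with a recorded sample of the real response and reports where they diverge. `shapeDiff` and the sample payloads are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function kindOf(value: Json): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "boolean", "number", "string", or "object"
}

// Lists the paths where the stub's structure differs from the recorded real response.
// Values are ignored; only field presence and types are compared.
function shapeDiff(real: Json, stub: Json, path: string = "$"): string[] {
  const realKind = kindOf(real);
  const stubKind = kindOf(stub);
  if (realKind !== stubKind) {
    return [`${path}: real is ${realKind}, stub is ${stubKind}`];
  }
  if (realKind === "array") {
    const realItems = real as Json[];
    const stubItems = stub as Json[];
    // Comparing the first element is enough to catch element-shape drift.
    if (realItems.length === 0 || stubItems.length === 0) return [];
    return shapeDiff(realItems[0], stubItems[0], `${path}[0]`);
  }
  if (realKind === "object") {
    const realObj = real as { [key: string]: Json };
    const stubObj = stub as { [key: string]: Json };
    const problems: string[] = [];
    for (const key of Object.keys(realObj)) {
      if (!(key in stubObj)) {
        problems.push(`${path}.${key}: missing from stub`);
      } else {
        problems.push(...shapeDiff(realObj[key], stubObj[key], `${path}.${key}`));
      }
    }
    return problems;
  }
  return [];
}
```

A scheduled job could record a fresh sandbox response, run `shapeDiff` against each handler's output, and fail when the list is non-empty, so drift is caught by a dedicated check and not by a production incident.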
Environment Parity Checklist
Run this checklist when setting up or auditing a non-production environment.
| Dimension | Question | Red Flag |
|---|---|---|
| Database engine | Same engine and version as production? | SQLite in test, PostgreSQL in prod |
| Database schema | Same migration pipeline applied? | Manual schema changes in staging |
| Data shape | Seed data covers all entity states? | Only "happy path" records, no edge cases |
| Infrastructure | Same container orchestration? | Docker Compose in CI, Kubernetes in prod |
| Network | Same internal service topology? | Monolith in test, microservices in prod |
| Config | Environment variables documented and version-controlled? | Undocumented env vars, manual setup |
| Auth | Same auth provider/flow? | Bypassed auth in test with hardcoded tokens |
| Feature flags | Same flag evaluation engine? | Hardcoded flags in test, LaunchDarkly in prod |
| TLS/HTTPS | Same certificate handling? | HTTP in staging, HTTPS in prod |
| Timeouts/Limits | Same rate limits, connection pools, timeouts? | Infinite timeouts in test hide perf issues |
For factory-based seed data patterns, see test-data-management.
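Parts of the checklist can be automated once each environment declares its characteristics as data. This is a rough sketch under that assumption; the `EnvDescriptor` format, the dimension names, and the sample values are hypothetical.

```typescript
// One entry per checklist dimension; values are free-form strings such as "postgres:16".
type EnvDescriptor = Record<string, string>;

// Reports every dimension where the candidate environment differs from production.
function parityGaps(production: EnvDescriptor, candidate: EnvDescriptor): string[] {
  const gaps: string[] = [];
  for (const dimension of Object.keys(production).sort()) {
    const expected = production[dimension];
    const actual = candidate[dimension];
    if (actual === undefined) {
      gaps.push(`${dimension}: not declared (production uses ${expected})`);
    } else if (actual !== expected) {
      gaps.push(`${dimension}: ${actual} (production uses ${expected})`);
    }
  }
  return gaps;
}

// Illustrative descriptors only.
const productionEnv: EnvDescriptor = { database: "postgres:16", cache: "redis:7", tls: "https" };
const ciEnv: EnvDescriptor = { database: "sqlite:3", cache: "redis:7" };
console.log(parityGaps(productionEnv, ciEnv));
```

The output is a starting list for the "parity gaps documented" item: each reported gap is either fixed or recorded with a mitigation.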
Anti-Patterns
Shared staging as the only test environment. One developer's broken deploy blocks everyone. Use ephemeral per-PR environments for isolation and keep staging for final pre-production validation only.
Production database copies for test data. PII risk, non-reproducible state, massive datasets that slow tests. Build minimal seed data from factories with deterministic values.
Environment-specific code paths. if (process.env.NODE_ENV === "test") { skipAuth(); } means you are not testing the real auth flow. Use dependency injection or configuration to swap implementations, not environment conditionals.
Manual environment setup. If setting up the test environment requires a wiki page with 15 steps, it will be wrong within a week. Script everything: docker compose up -d && npm run db:seed should be the only steps.
Stubbing internal services instead of external ones. Stub at the HTTP boundary where your system talks to the outside world. Stubbing internal modules hides integration bugs between your own services.
No health checks in Docker Compose. Without health checks, depends_on only waits for the container to start, not for the service to be ready. Tests start before the database accepts connections and fail with connection errors.
Long-lived preview environments. Preview environments that persist after the PR is merged waste resources and accumulate stale state. Automate teardown on PR close.
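The alternative to environment-specific code paths named above, swapping implementations through injection, might look like the following sketch. It uses email delivery as the example, not auth; `EmailSender`, `createSignupService`, and `selectSender` are hypothetical names, and the provider values are assumptions modeled on the EMAIL_PROVIDER setting in the Compose file.

```typescript
interface EmailSender {
  send(to: string, subject: string): void;
}

// Test implementation: records messages and delivers nothing.
function createCapturingSender() {
  const sent: { to: string; subject: string }[] = [];
  const sender: EmailSender = {
    send: (to, subject) => {
      sent.push({ to, subject });
    },
  };
  return { sender, sent };
}

// The service never inspects NODE_ENV; it only knows the interface,
// so tests exercise the same validation and call path as production.
function createSignupService(email: EmailSender) {
  return {
    register(address: string): void {
      if (!address.includes("@")) throw new Error(`invalid email: ${address}`);
      email.send(address, "Welcome");
    },
  };
}

// Composition root: the single place where configuration picks an implementation.
function selectSender(provider: string, real: EmailSender, stub: EmailSender): EmailSender {
  if (provider === "stub") return stub;
  if (provider === "sendgrid") return real;
  throw new Error(`unknown EMAIL_PROVIDER: ${provider}`);
}
```

The difference from a `NODE_ENV` conditional is that the choice is made once at startup from explicit configuration, and an unknown value fails loudly where a conditional would silently skip behavior.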
Done When
- Environment inventory documented (dev, staging, preview, production) with characteristics and access notes for each tier
- Docker Compose config for the local environment verified working with a single docker compose up command
- Seed data scripts are idempotent and checked into the repository
- Environment parity gaps documented (e.g., SQLite in CI vs PostgreSQL in prod) with mitigations in place or tracked
- Preview environments auto-created for PRs and auto-torn-down on merge or close
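"Idempotent" in the seed-script item means that running the seed twice leaves the same state as running it once. A minimal sketch of the idea, using an in-memory `Map` as a stand-in for a table (the `PlanRow` shape and `seedPlans` helper are hypothetical):

```typescript
interface PlanRow {
  id: string;
  name: string;
  priceCents: number;
}

// In-memory stand-in for a database table, keyed by primary key.
type Table = Map<string, PlanRow>;

const PLANS: PlanRow[] = [
  { id: "plan-free", name: "Free", priceCents: 0 },
  { id: "plan-pro", name: "Pro", priceCents: 2900 },
];

function seedPlans(table: Table, rows: PlanRow[]): void {
  for (const row of rows) {
    // Upsert by stable id: insert if absent, overwrite if present, never append.
    table.set(row.id, { ...row });
  }
}

const table: Table = new Map();
seedPlans(table, PLANS);
seedPlans(table, PLANS); // the second run changes nothing
```

Against PostgreSQL the same effect is usually achieved with `INSERT ... ON CONFLICT (id) DO UPDATE`, which depends on seed rows having stable, explicit primary keys.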
Related Skills
- test-data-management -- Factory patterns, synthetic data generation, database seeding strategies.
- ci-cd-integration -- Pipeline configuration, GitHub Actions services, artifact management.
- contract-testing -- Consumer-driven contracts that validate your stubs match real APIs.
- service-virtualization -- Decision framework for choosing mocks, stubs, fakes, or real services.