Test Suite Validation Skill

This skill helps you efficiently validate code changes by running the appropriate subset of the test suite. It uses scripts/run-tests to intelligently discover affected tests and run only what's necessary for validation.

When to Use This Skill

Use this skill when you:

  • Have made changes to source code files and want to validate them
  • Have fixed a bug and want to verify the fix
  • Have added a feature and need test coverage
  • Have modified test infrastructure or configuration
  • Want to verify changes don't break existing functionality

Key Principles

  1. Always use the run-tests skill when testing code changes - it's optimized for intelligent suite discovery
  2. Never run pytest directly - it bypasses the project's test infrastructure (use scripts/run-tests or riot via scripts/ddtest)
  3. Minimal venvs for iteration - run 1-2 venvs initially, expand only if needed
  4. Use --dry-run first - see what would run before executing (see the example after this list)
  5. Follow official docs - docs/contributing-testing.rst is the source of truth for testing procedures
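
For example, to apply principle 4 and preview a run (this assumes --dry-run accepts the same file arguments as --list; the path is illustrative):

scripts/run-tests --dry-run ddtrace/_trace/tracer.py
# Should list the matching suites and venvs without executing any tests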

How This Skill Works

Step 1: Identify Changed Files

First, determine which files were modified:

  • If you have pending edits, I'll identify the changed files from the current session
  • I'll look at git status to find staged, unstaged, and untracked changes
  • You can also specify files explicitly if you only want to validate particular changes
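
For reference, these are the usual git commands behind that check:

git status --short          # staged, unstaged, and untracked files at a glance
git diff --name-only HEAD   # only the paths of modified tracked files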

Step 2: Discover Available Test Suites

I'll use the scripts/run-tests script to discover what test suites match your changes:

scripts/run-tests --list <edited-files>

This outputs JSON showing:

  • Available test suites that match your changed files
  • All venvs (Python versions + package combinations) available for each suite
  • Their hashes, Python versions, and package versions
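
If you want to skim that output quickly, you can filter it with jq (assuming the JSON shape shown in the "Understanding Venv Hashes" example below):

scripts/run-tests --list ddtrace/_trace/tracer.py | jq '.suites[].name'
# Prints only the names of the suites that match the changed file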

Step 3: Intelligently Select Venvs

Rather than running ALL available venvs (which could take hours), I'll select the minimal set needed to validate your changes:

For Core/Tracing Changes (Broad Impact)

When you modify files like:

  • ddtrace/internal/core/*, ddtrace/_trace/*, ddtrace/trace/*
  • ddtrace/_monkey.py, ddtrace/settings/*
  • ddtrace/constants.py

Strategy: Run core tracer + internal tests with 1 venv each

  • Example: tracer suite with latest Python + internal suite with latest Python
  • This validates broad-reaching changes without excessive overhead
  • Skip integration suites unless the change directly affects integration code

For Integration/Contrib Changes (Targeted Impact)

When you modify files like:

  • ddtrace/contrib/flask/*, ddtrace/contrib/django/*, etc.
  • ddtrace/contrib/*/patch.py or integration-specific code

Strategy: Run ONLY the affected integration suite with 1-2 venvs

  • Example: Flask changes → run contrib::flask suite with latest Python
  • If the change involves multiple package versions (e.g., Django 3.x and 4.x), pick 1 venv per major version
  • Skip unrelated integrations

For Test-Only Changes

When you modify tests/ files (but not test infrastructure):

  • Run only the specific test files/functions modified
  • Use pytest args: -- -k test_name or direct test file paths
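
A minimal sketch of that, assuming a venv hash taken from the --list output (the hash and the -k pattern are placeholders):

scripts/run-tests --venv <hash> -- tests/contrib/flask/test_views.py -k test_view_renders
# Everything after -- is passed straight to pytest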

For Test Infrastructure Changes

When you modify:

  • tests/conftest.py, tests/suitespec.yml, scripts/run-tests, riotfile.py

Strategy: Run a quick smoke test suite

  • Example: internal suite with 1 venv as a sanity check
  • Or run small existing test suites to verify harness changes
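
A minimal smoke test might look like this (assuming the infrastructure files map to at least one suite in tests/suitespec.yml; the hash is a placeholder):

scripts/run-tests --list scripts/run-tests riotfile.py
# Pick one small venv (e.g., from the internal suite) out of the output, then:
scripts/run-tests --venv <internal-venv-hash>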

Step 4: Execute Selected Venvs

I'll run the selected venvs using:

scripts/run-tests --venv <hash1> --venv <hash2> ...

This will:

  • Start required Docker services (redis, postgres, etc.)
  • Run tests in the specified venvs sequentially
  • Stop services after completion
  • Show real-time output and status

Step 5: Handle Results

If tests pass: ✅ Your changes are validated!

If tests fail: 🔴 I'll:

  • Show you the failure details
  • Identify which venv failed
  • Ask clarifying questions to understand the issue
  • Offer to run specific failing tests with more verbosity
  • Help iterate on fixes and re-run

For re-running specific tests:

scripts/run-tests --venv <hash> -- -vv -k test_name

When Tests Fail

When you encounter test failures, follow this systematic approach:

  1. Read the full failure output carefully - Don't just skim the error message; look at the traceback and assertion details
  2. Understand the root cause - Don't blindly re-run; work out why the test is failing
  3. Make code changes - Fix the underlying issue
  4. Re-run with more verbosity if needed - Use -vv or -vvv for detailed output
  5. Iterate until tests pass - Repeat the process with each fix

Venv Selection Strategy in Detail

Understanding Venv Hashes

From scripts/run-tests --list, you'll see output like:

{
  "suites": [
    {
      "name": "tracer",
      "venvs": [
        {
          "hash": "abc123",
          "python_version": "3.8",
          "packages": "..."
        },
        {
          "hash": "def456",
          "python_version": "3.11",
          "packages": "..."
        }
      ]
    }
  ]
}

Selection Rules

  1. Latest Python version is your default choice

    • Unless your change specifically targets an older Python version
    • Example: if fixing Python 3.8 compatibility, also test 3.8
  2. One venv per suite is usually enough for iteration

    • Only run multiple venvs per suite if:
      • Change impacts multiple Python versions differently
      • Testing package compatibility variations (e.g., Django 3.x vs 4.x)
      • Initial validation passed and you want broader coverage
  3. Minimize total venvs

    • 1-2 venvs total for small targeted changes
    • 3-4 venvs maximum for broader changes
    • Never run 10+ venvs for initial validation (save that for CI)
  4. Consider test runtime

    • Each venv can take 5-30 minutes depending on suite
    • With 2 venvs you're looking at 10-60 minutes for iteration
    • With 5 venvs you're looking at 25-150 minutes
    • Scale appropriately for your patience and deadline

Using --venv Directly

When you have a specific venv hash you want to run, you can use it directly without specifying file paths:

scripts/run-tests --venv e06abee

The --venv flag automatically searches all available venvs across all suites, so it works regardless of which files you have changed locally. This is useful when:

  • You know exactly which venv you want to test
  • You have unrelated local changes that would otherwise limit suite matching
  • You want to quickly re-run a specific venv without file path arguments

Examples

Example 1: Fixing a Flask Integration Bug

Changed file: ddtrace/contrib/internal/flask/patch.py

scripts/run-tests --list ddtrace/contrib/internal/flask/patch.py
# Output shows: contrib::flask suite available

# Selected from the output (latest Python):
# Suite: contrib::flask
# Venv: hash=e06abee, Python 3.13, flask

# Run with --venv directly (searches all venvs automatically)
scripts/run-tests --venv e06abee
# Runs just Flask integration tests

Example 2: Fixing a Core Tracing Issue

Changed file: ddtrace/_trace/tracer.py

scripts/run-tests --list ddtrace/_trace/tracer.py
# Output shows: tracer suite, internal suite available

# Selection strategy:
# - tracer: latest Python (e.g., abc123)
# - internal: latest Python (e.g., def456)

# Run with --venv directly (searches all venvs automatically)
scripts/run-tests --venv abc123 --venv def456
# Validates core tracer and internal components

Example 3: Fixing a Test-Specific Bug

Changed file: tests/contrib/flask/test_views.py

scripts/run-tests --list tests/contrib/flask/test_views.py
# Output shows: contrib::flask suite

# Run just the specific test:
scripts/run-tests --venv flask_py311 -- -vv tests/contrib/flask/test_views.py

Example 4: Iterating on a Failing Test

First run shows one test failing:

scripts/run-tests --venv flask_py311 -- -vv -k test_view_called_twice
# Focused on the specific failing test with verbose output

Best Practices

DO ✅

  • Start small: Run 1 venv first, expand only if needed
  • Be specific: Use pytest -k filter when re-running failures
  • Check git: Verify you're testing the right files with git status
  • Read errors: Take time to understand test failures before re-running
  • Ask for help: When unclear what tests to run, ask me to analyze the changes

DON'T ❌

  • Run all venvs initially: That's what CI is for
  • Skip the minimal set guidance: It's designed to save you time
  • Ignore service requirements: Some suites need Docker services up
  • Run tests without changes saved: Make sure edits are saved first
  • Iterate blindly: Understand what's failing before re-running

Additional Testing Resources

For comprehensive testing guidance, refer to the contributing documentation:

  • docs/contributing-testing.rst - Detailed testing guidelines

    • What kind of tests to write (unit tests, integration tests, e2e tests)
    • When to write tests (feature development, bug fixes)
    • Where to put tests in the repository
    • Prerequisites (Docker, uv)
    • Complete scripts/run-tests usage examples
    • Riot environment management details
    • Running specific test files and functions
    • Test debugging strategies
  • docs/contributing.rst - PR and testing requirements

    • All changes need tests or documented testing strategy
    • How tests fit into the PR review process
    • Testing expectations for different types of changes
  • docs/contributing-design.rst - Test architecture context

    • How products, integrations, and core interact
    • Where different types of tests should live
    • Testing patterns for each library component

When to reference these docs:

  • First time writing tests for this project → Read contributing-testing.rst
  • Understanding test requirements for PRs → Read contributing.rst
  • Need context on test architecture → Read contributing-design.rst

Troubleshooting

Docker services won't start

# Manually check/stop services:
docker compose ps
docker compose down

Can't find matching suites

  • Verify the file path is correct
  • Check tests/suitespec.yml to understand suite patterns
  • Your file might not be covered by any suite pattern yet
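
To check whether a path is covered, you can search the suite patterns directly (the search term is illustrative):

grep -n "flask" tests/suitespec.yml
# Shows which suite definitions mention the component you changed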

Tests take too long

  • You may have selected too many venvs
  • Try running with just 1 venv
  • Use pytest -k to run a subset of tests

Technical Details

Architecture

The scripts/run-tests system:

  • Maps source files to test suites using patterns in tests/suitespec.yml
  • Uses riot to manage multiple Python/package combinations as venvs
  • Each venv is a self-contained environment
  • Docker services are managed per suite lifecycle
  • Tests can pass optional pytest arguments with --
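
For instance, standard pytest flags can be forwarded after the -- separator (the venv hash is a placeholder):

scripts/run-tests --venv <hash> -- -x --lf -vv
# -x stops at the first failure, --lf re-runs only the tests that failed last time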

Supported Suite Types

Primary suites for validation:

  • tracer: Core tracing functionality tests
  • internal: Internal component tests
  • contrib::*: Integration with specific libraries (flask, django, etc.)
  • integration_*: Cross-library integration scenarios
  • Specialized: telemetry, profiling, appsec, llmobs, etc.

Environment Variables

Some suites require environment setup:

  • DD_TRACE_AGENT_URL: For snapshot-based tests
  • Service-specific variables for Docker containers
  • These are handled automatically by the script
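
If you ever need to set one of these yourself (normally unnecessary, since the script handles them), it can be passed inline; the URL and hash are placeholders:

DD_TRACE_AGENT_URL=<agent-url> scripts/run-tests --venv <hash>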