# run-tests: Test Suite Validation Skill

This skill helps you validate code changes efficiently by running the appropriate subset of the test suite. It uses `scripts/run-tests` to discover which tests are affected by your changes and run only what's necessary for validation.
## When to Use This Skill
Use this skill when you:

- Have made changes to source code files and want to validate they work
- Have fixed a bug and want to verify the fix
- Have added a feature and need test coverage
- Have modified test infrastructure or configuration
- Want to verify changes don't break existing functionality
## Key Principles
- Always use the run-tests skill when testing code changes - it's optimized for intelligent suite discovery
- Never run pytest directly - it bypasses the project's test infrastructure (use `scripts/run-tests` or riot via `scripts/ddtest`)
- Minimal venvs for iteration - run 1-2 venvs initially, expand only if needed
- Use `--dry-run` first - see what would run before executing (see the example after this list)
- Follow official docs - `docs/contributing-testing.rst` is the source of truth for testing procedures
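For example, to preview a run before executing it (a sketch; this assumes `--dry-run` accepts the same changed-file arguments as `--list`):

```
# Show which suites and venvs would run for a change, without executing anything
# (assumes --dry-run takes changed-file paths like --list does)
scripts/run-tests --dry-run ddtrace/_trace/tracer.py
```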
## How This Skill Works

### Step 1: Identify Changed Files
First, determine which files were modified:
- If you have pending edits, I'll identify the changed files from the current session
- I'll look at git status to find staged, unstaged, and untracked changes
- You can also specify files explicitly if working on specific changes
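From the shell, a couple of standard git commands cover the common cases:

```
# Staged, unstaged, and untracked files in short form
git status --short

# Paths of modified tracked files only
git diff --name-only HEAD
```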
### Step 2: Discover Available Test Suites

I'll use the `scripts/run-tests` script to discover what test suites match your changes:

```
scripts/run-tests --list <edited-files>
```
This outputs JSON showing:
- Available test suites that match your changed files
- All venvs (Python versions + package combinations) available for each suite
- Their hashes, Python versions, and package versions
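To scan that JSON quickly from the shell, something like this works (a sketch assuming `jq` is installed, the script prints clean JSON to stdout, and the JSON shape matches "Understanding Venv Hashes" below):

```
# Print suite name, venv hash, and Python version for each matching venv
scripts/run-tests --list ddtrace/_trace/tracer.py \
  | jq -r '.suites[] | .name as $suite | .venvs[] | "\($suite)\t\(.hash)\t\(.python_version)"'
```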
### Step 3: Intelligently Select Venvs
Rather than running ALL available venvs (which could take hours), I'll select the minimal set needed to validate your changes:
#### For Core/Tracing Changes (Broad Impact)

When you modify files like:

- `ddtrace/internal/core/*`, `ddtrace/_trace/*`, `ddtrace/trace/*`
- `ddtrace/_monkey.py`, `ddtrace/settings/*`
- `ddtrace/constants.py`
Strategy: Run core tracer + internal tests with 1 venv each
- Example: `tracer` suite with latest Python + `internal` suite with latest Python
- This validates broad-reaching changes without excessive overhead
- Skip integration suites unless the change directly affects integration code
#### For Integration/Contrib Changes (Targeted Impact)

When you modify files like:

- `ddtrace/contrib/flask/*`, `ddtrace/contrib/django/*`, etc.
- `ddtrace/contrib/*/patch.py` or integration-specific code
Strategy: Run ONLY the affected integration suite with 1-2 venvs
- Example: Flask changes → run the `contrib::flask` suite with latest Python
- If the change involves multiple versions (e.g., Django 3.x and 4.x), pick 1 venv per major version
- Skip unrelated integrations
#### For Test-Only Changes

When you modify `tests/` files (but not test infrastructure):
- Run only the specific test files/functions modified
- Use pytest args: `-- -k test_name` or direct test file paths
#### For Test Infrastructure Changes

When you modify `tests/conftest.py`, `tests/suitespec.yml`, `scripts/run-tests`, or `riotfile.py`:
Strategy: Run a quick smoke test suite
- Example: `internal` suite with 1 venv as a sanity check
- Or run small existing test suites to verify harness changes
### Step 4: Execute Selected Venvs

I'll run the selected venvs using:

```
scripts/run-tests --venv <hash1> --venv <hash2> ...
```
This will:
- Start required Docker services (redis, postgres, etc.)
- Run tests in the specified venvs sequentially
- Stop services after completion
- Show real-time output and status
### Step 5: Handle Results
If tests pass: ✅ Your changes are validated!
If tests fail: 🔴 I'll:
- Show you the failure details
- Identify which venv failed
- Ask clarifying questions to understand the issue
- Offer to run specific failing tests with more verbosity
- Help iterate on fixes and re-run
For re-running specific tests:

```
scripts/run-tests --venv <hash> -- -vv -k test_name
```
## When Tests Fail
When you encounter test failures, follow this systematic approach:
1. Review the failure details carefully - don't just skim the error; understand what's actually failing
2. Understand the root cause - don't blindly re-run; analyze why it failed
3. Make code changes - fix the underlying issue
4. Re-run with more verbosity if needed - use `-vv` or `-vvv` for detailed output (see the sketch after this list)
5. Iterate until tests pass - repeat the process with each fix
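A typical loop, using a placeholder hash and a hypothetical test name:

```
# Re-run just the failing test with maximum verbosity
scripts/run-tests --venv <hash> -- -vvv -k test_that_failed

# ...edit the code, then re-run the same command until it passes
```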
## Venv Selection Strategy in Detail
### Understanding Venv Hashes

From `scripts/run-tests --list`, you'll see output like:

```
{
  "suites": [
    {
      "name": "tracer",
      "venvs": [
        {
          "hash": "abc123",
          "python_version": "3.8",
          "packages": "..."
        },
        {
          "hash": "def456",
          "python_version": "3.11",
          "packages": "..."
        }
      ]
    }
  ]
}
```
### Selection Rules

1. Latest Python version is your default choice
   - Unless your change specifically targets an older Python version
   - Example: if fixing Python 3.8 compatibility, also test 3.8
2. One venv per suite is usually enough for iteration
   - Only run multiple venvs per suite if:
     - The change impacts multiple Python versions differently
     - You're testing package compatibility variations (e.g., Django 3.x vs 4.x)
     - Initial validation passed and you want broader coverage
3. Minimize total venvs
   - 1-2 venvs total for small targeted changes
   - 3-4 venvs maximum for broader changes
   - Never run 10+ venvs for initial validation (save that for CI)
4. Consider test runtime
   - Each venv can take 5-30 minutes depending on the suite
   - With 2 venvs you're looking at 10-60 minutes per iteration
   - With 5 venvs, 25-150 minutes
   - Scale appropriately for your patience and deadline
### Using `--venv` Directly

When you have a specific venv hash you want to run, you can use it directly without specifying file paths:

```
scripts/run-tests --venv e06abee
```

The `--venv` flag automatically searches all available venvs across all suites, so it works regardless of which files you have changed locally. This is useful when:
- You know exactly which venv you want to test
- You have unrelated local changes that would otherwise limit suite matching
- You want to quickly re-run a specific venv without file path arguments
## Examples

### Example 1: Fixing a Flask Integration Bug

Changed file: `ddtrace/contrib/internal/flask/patch.py`

```
scripts/run-tests --list ddtrace/contrib/internal/flask/patch.py
# Output shows: contrib::flask suite available
# Selected venv (latest Python):
#   Suite: contrib::flask
#   Venv: hash=e06abee, Python 3.13, flask

# Run with --venv directly (searches all venvs automatically)
scripts/run-tests --venv e06abee
# Runs just the Flask integration tests
```
### Example 2: Fixing a Core Tracing Issue

Changed file: `ddtrace/_trace/tracer.py`

```
scripts/run-tests --list ddtrace/_trace/tracer.py
# Output shows: tracer suite, internal suite available
# Selection strategy:
#   - tracer: latest Python (e.g., abc123)
#   - internal: latest Python (e.g., def456)

# Run with --venv directly (searches all venvs automatically)
scripts/run-tests --venv abc123 --venv def456
# Validates core tracer and internal components
```
### Example 3: Fixing a Test-Specific Bug

Changed file: `tests/contrib/flask/test_views.py`

```
scripts/run-tests --list tests/contrib/flask/test_views.py
# Output shows: contrib::flask suite

# Run just the specific test file:
scripts/run-tests --venv flask_py311 -- -vv tests/contrib/flask/test_views.py
```
### Example 4: Iterating on a Failing Test

First run shows one test failing:

```
scripts/run-tests --venv flask_py311 -- -vv -k test_view_called_twice
# Focused on the specific failing test with verbose output
```
## Best Practices

### DO ✅

- Start small: Run 1 venv first, expand only if needed
- Be specific: Use pytest's `-k` filter when re-running failures
- Check git: Verify you're testing the right files with `git status`
- Read errors: Take time to understand test failures before re-running
- Ask for help: When unclear what tests to run, ask me to analyze the changes
### DON'T ❌
- Run all venvs initially: That's what CI is for
- Skip the minimal set guidance: It's designed to save you time
- Ignore service requirements: Some suites need Docker services up
- Run tests without changes saved: Make sure edits are saved first
- Iterate blindly: Understand what's failing before re-running
## Additional Testing Resources
For comprehensive testing guidance, refer to the contributing documentation:
- `docs/contributing-testing.rst` - Detailed testing guidelines
  - What kind of tests to write (unit tests, integration tests, e2e tests)
  - When to write tests (feature development, bug fixes)
  - Where to put tests in the repository
  - Prerequisites (Docker, uv)
  - Complete `scripts/run-tests` usage examples
  - Riot environment management details
  - Running specific test files and functions
  - Test debugging strategies
- `docs/contributing.rst` - PR and testing requirements
  - All changes need tests or a documented testing strategy
  - How tests fit into the PR review process
  - Testing expectations for different types of changes
- `docs/contributing-design.rst` - Test architecture context
  - How products, integrations, and core interact
  - Where different types of tests should live
  - Testing patterns for each library component

When to reference these docs:

- First time writing tests for this project → read `contributing-testing.rst`
- Understanding test requirements for PRs → read `contributing.rst`
- Need context on test architecture → read `contributing-design.rst`
## Troubleshooting

### Docker services won't start

```
# Manually check/stop services:
docker compose ps
docker compose down
```
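If a service starts but misbehaves, its logs usually say why (standard Docker Compose commands; service names come from the project's compose file):

```
# Inspect logs for a specific service, e.g. redis
docker compose logs redis

# Follow logs live while re-running tests
docker compose logs -f redis
```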
### Can't find matching suites

- Verify the file path is correct
- Check `tests/suitespec.yml` to understand suite patterns (see the search example below)
- Your file might not be covered by any suite pattern yet
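A quick way to check coverage (a sketch; patterns in `suitespec.yml` may be globs, so search for a path fragment rather than the full path):

```
# Look for suite patterns that might match your file
grep -n "contrib/flask" tests/suitespec.yml
```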
### Test takes too long

- You may have selected too many venvs
- Try running with just 1 venv
- Use pytest `-k` to run a subset of tests (example below)
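pytest's `-k` accepts boolean expressions, which helps narrow a long run (placeholder hash; hypothetical test names):

```
# Run only tests whose names match either pattern
scripts/run-tests --venv <hash> -- -k "test_patch or test_unpatch"
```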
## Technical Details

### Architecture

The `scripts/run-tests` system:

- Maps source files to test suites using patterns in `tests/suitespec.yml`
- Uses `riot` to manage multiple Python/package combinations as venvs
- Each venv is a self-contained environment
- Docker services are managed per suite lifecycle
- Optional pytest arguments can be passed after `--` (see the example after this list)
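For instance (placeholder hash; `-vv` and `--maxfail=1` are standard pytest flags):

```
# Everything after `--` is forwarded to pytest unchanged
scripts/run-tests --venv <hash> -- -vv --maxfail=1
```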
### Supported Suite Types

Primary suites for validation:

- `tracer`: Core tracing functionality tests
- `internal`: Internal component tests
- `contrib::*`: Integration with specific libraries (flask, django, etc.)
- `integration_*`: Cross-library integration scenarios
- Specialized: `telemetry`, `profiling`, `appsec`, `llmobs`, etc.
### Environment Variables

Some suites require environment setup:

- `DD_TRACE_AGENT_URL`: For snapshot-based tests
- Service-specific variables for Docker containers
- These are handled automatically by the script