Test-Guided Debloating
Remove unnecessary code from a repository while preserving exactly the behavior exercised by a given test suite.
Workflow
1. Understand Requirements
Inputs needed:
- Target repository path
- Test command to run the test suite
- Programming language/framework
Clarify scope:
- Which tests define required behavior (all tests, specific test files, integration tests only)?
- Confirm that test files themselves are preserved (the default is yes: never modify tests).
- Are there any files that must be kept regardless of coverage (configuration, documentation)?
2. Run Coverage Analysis
Generate test coverage data to identify exercised code. See coverage_tools.md for language-specific commands.
Python example:
pytest --cov=. --cov-report=json --cov-report=html
JavaScript example:
npm test -- --coverage --coverageReporters=json --coverageReporters=html
Java example:
mvn clean test jacoco:report
Verify:
- Coverage report generated successfully
- All tests pass
- Coverage data includes all source files
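The last check above can be partly automated. The sketch below assumes the report is the JSON file written by coverage.py (which is what `--cov-report=json` produces), where the top-level `files` key maps each measured path to its data. The helper name `missing_from_report` is hypothetical.

```python
from pathlib import Path


def missing_from_report(report: dict, source_files: list[str]) -> list[str]:
    """Return source files that do not appear in the coverage report."""
    # Normalize both sides so "./pkg/a.py" and "pkg/a.py" compare equal.
    reported = {Path(p).as_posix() for p in report.get("files", {})}
    return sorted(f for f in source_files if Path(f).as_posix() not in reported)
```

A typical call would load `coverage.json` with `json.load` and pass the repository's source paths (for example from `Path(".").rglob("*.py")`). Any path returned was never measured, so its absence of coverage data says nothing about whether it is used.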
3. Analyze Coverage Data
Use scripts/analyze_coverage.py to identify removal candidates:
python scripts/analyze_coverage.py coverage.json --output analysis.json
The script identifies:
- Uncovered files (0% coverage) - safest to remove
- Partially covered files - require manual review
- Uncovered lines - potential dead code within files
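How scripts/analyze_coverage.py performs this classification is not shown here. The following is only a minimal sketch of the idea, assuming a coverage.py JSON report in which each file entry carries `executed_lines` and `missing_lines`. The function names `classify_files` and `line_ranges` are hypothetical.

```python
def classify_files(report: dict) -> dict[str, list[str]]:
    """Split files in a coverage.py JSON report into removal categories."""
    result = {"uncovered": [], "partial": [], "full": []}
    for path, data in report.get("files", {}).items():
        executed = data.get("executed_lines", [])
        missing = data.get("missing_lines", [])
        if missing and not executed:
            result["uncovered"].append(path)  # 0% coverage
        elif missing:
            result["partial"].append(path)    # needs manual review
        else:
            result["full"].append(path)       # also catches empty files
    return result


def line_ranges(lines: list[int]) -> list[tuple[int, int]]:
    """Collapse line numbers into inclusive (start, end) ranges."""
    ranges: list[tuple[int, int]] = []
    for n in sorted(set(lines)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)  # extend the current run
        else:
            ranges.append((n, n))            # start a new run
    return ranges
```

Passing a file's `missing_lines` to `line_ranges` gives the contiguous uncovered blocks, which are easier to review than individual line numbers.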
Manual review checklist:
- Are uncovered files truly unused (check imports)?
- Do uncovered functions have side effects?
- Are there dynamic imports or reflection?
- Is this a library with public API requirements?
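For Python code, the first checklist item can be approximated with the standard `ast` module. The sketch below finds static imports only, and it does not resolve relative imports (`from . import x`), so a file it reports as unimported still needs the dynamic-import check. The names `imported_modules` and `importers_of` are hypothetical.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return dotted module names that a source string imports statically."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
            # "from pkg import mod" may refer to the submodule pkg.mod.
            modules.update(f"{node.module}.{alias.name}" for alias in node.names)
    return modules


def importers_of(module: str, sources: dict[str, str]) -> list[str]:
    """Return the files (keys of `sources`) that statically import `module`."""
    return sorted(
        path for path, text in sources.items() if module in imported_modules(text)
    )
```

Here `sources` maps file paths to their contents. An empty result for a 0%-coverage module is supporting evidence that it is unused, not proof.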
4. Incremental Removal
Remove code incrementally, validating after each step. See debloating_strategy.md for detailed strategy.
Removal order (safest to riskiest):
- Unused imports
  # Remove unused import statements
  # Run tests after removal
- Uncovered files (0% coverage, not imported)
  # Remove file
  rm path/to/unused_file.py
  # Run tests
  pytest
- Uncovered functions/methods
  # Remove function definition
  # Run tests
- Dead branches
  # Before: if condition that's always true
  if always_true_condition:
      do_something()
  else:
      never_executed()  # Remove this branch
  # After:
  do_something()
- Unused classes
  # Remove class definition if never instantiated
  # Run tests
After each removal:
# Run full test suite
<test_command>
# Verify all tests pass
# Check for import errors
# Verify build succeeds
5. Validate Preservation of Behavior
Final validation:
- All tests pass
- Build succeeds without errors
- No import/module errors
- No runtime errors during test execution
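- Every line the tests covered before debloating is still present and still covered (the overall coverage percentage should rise as uncovered code is removed)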
- No modifications to test files
Run tests multiple times to catch flaky tests or timing issues.
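Two of these checks lend themselves to a small script: confirming that test files are byte-for-byte unchanged, and repeating the test run. The sketch below assumes snapshots are taken before and after debloating. The names `snapshot`, `changed_files`, and `is_stable` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable


def snapshot(paths: list[Path]) -> dict[str, str]:
    """Map each file path to the SHA-256 digest of its contents."""
    return {p.as_posix(): hashlib.sha256(p.read_bytes()).hexdigest() for p in paths}


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths added, deleted, or modified between two snapshots."""
    return sorted(
        p for p in before.keys() | after.keys() if before.get(p) != after.get(p)
    )


def is_stable(check: Callable[[], bool], runs: int = 3) -> bool:
    """Run `check` the given number of times; every run must succeed."""
    results = [check() for _ in range(runs)]  # no short-circuit: run them all
    return all(results)
```

An empty list from `changed_files` over the test directory supports the "no modifications to test files" item. `is_stable` returning False when individual runs sometimes pass points to a flaky test rather than a bad removal.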
6. Document Changes
Generate removal report:
DEBLOATING SUMMARY
==================
Removed Elements:
- 15 unused files
- 42 unused functions
- 8 unused classes
- 156 unused imports
- 23 dead branches
Total lines removed: 3,847
Tests passing: 156/156
Preservation Justification:
All test-defined behavior is preserved because:
1. All test-covered code remains intact
2. All transitive dependencies of test-covered code remain
3. No side effects required by tests were removed
4. Build succeeds and all 156 tests pass
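The counts section of such a report could be produced from tallies collected during removal. This is a minimal sketch; `format_summary` and its parameters are hypothetical, and the numbers come from the caller rather than being computed here.

```python
def format_summary(
    removed: dict[str, int], lines_removed: int, passed: int, total: int
) -> str:
    """Render the counts section of a debloating report as plain text."""
    title = "DEBLOATING SUMMARY"
    lines = [title, "=" * len(title), "Removed Elements:"]
    lines += [f"- {count} {label}" for label, count in removed.items()]
    lines.append(f"Total lines removed: {lines_removed:,}")
    lines.append(f"Tests passing: {passed}/{total}")
    return "\n".join(lines)
```

The preservation justification is better written by hand, since it should state why each removal is safe, not only that tests pass.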
Key Principles
Test-Defined Behavior: The test suite is the single source of truth for required functionality.
Conservative Removal: When in doubt, keep the code. Only remove code you're confident is unused.
Incremental Validation: Remove code in small batches and run tests after each change.
Never Modify Tests: Test files define the required behavior and must not be changed.
Preserve Side Effects: Be cautious with code that has side effects (logging, initialization, registration).
Common Pitfalls
Avoid removing:
- Code with side effects (module-level initialization, decorators, metaclasses)
- Code used via reflection or dynamic imports
- Public API methods (if building a library)
- Configuration files
- Error handling needed in production (even if untested)
Watch for:
- Dynamic behavior (eval, exec, importlib, reflection)
- Transitive dependencies (code called by covered code)
- Module-level code execution
- Static initializers
- Decorator/annotation side effects
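For Python sources, a rough scan for the dynamic behavior listed above can flag files that deserve closer manual review. The sketch below matches call names only, so it will produce false positives (most `getattr` calls are harmless) and will miss aliased calls. The set `DYNAMIC_CALLS` and the function name are hypothetical choices.

```python
import ast

# Call names that can reference code by string at runtime (illustrative list).
DYNAMIC_CALLS = {"eval", "exec", "getattr", "__import__", "import_module"}


def find_dynamic_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for calls whose name is in DYNAMIC_CALLS."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Plain names (eval) use .id; attribute calls (importlib.import_module) use .attr.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name in DYNAMIC_CALLS:
            hits.append((node.lineno, name))
    return sorted(hits)
```

If any file in the repository produces hits, treat 0% coverage elsewhere as weaker evidence, because the flagged code may load modules that no static import mentions.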
Helper Script
The analyze_coverage.py script automates coverage analysis:
# Analyze Python coverage
python scripts/analyze_coverage.py coverage.json
# Analyze JavaScript coverage
python scripts/analyze_coverage.py coverage-final.json --format javascript
# Save detailed analysis
python scripts/analyze_coverage.py coverage.json --output analysis.json
Output includes:
- List of uncovered files (safest removal candidates; still confirm nothing imports them dynamically)
- Partially covered files (review needed)
- Uncovered line ranges
- Safety ratings for removal candidates