# Python Test Updater (python-test-updater)
Same triage discipline as `java-test-updater`. Python differences: no compile-time breaks (everything fails at runtime), heavier `mocker.patch` coupling, and snapshot libraries that make some updates a one-command affair.
## Python failure taxonomy

No compile step means everything surfaces at test runtime:
| Failure | Python-specific signal | Action |
|---|---|---|
| `AttributeError: 'X' has no attribute 'foo'` | Renamed/removed method | Update call site |
| `TypeError: f() missing 1 positional argument` | Signature changed | Add arg or use default |
| `TypeError: f() got an unexpected keyword` | Kwarg renamed/removed | Update kwarg |
| `ImportError` / `ModuleNotFoundError` | Module moved | Update import |
| `AssertionError` with value diff | Behavior changed (intentional?) or regression | Triage |
| `AssertionError: Expected 'mock' to be called` | Over-mocked internal | Loosen/delete mock |
| `AssertionError` in snapshot compare | Snapshot stale | Review diff, `--snapshot-update` if intentional |
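The taxonomy above is regular enough to script. A minimal triage sketch — the regexes and action labels are illustrative, not from any tool — that buckets the first error line of a failure:

```python
import re

# Illustrative patterns matching the taxonomy table — tune to your suite's output.
# Order matters: the mock pattern must be tried before the generic AssertionError.
PATTERNS = [
    (r"AttributeError: .* has no attribute", "attribute: update call site"),
    (r"TypeError: .*missing \d+ .*positional argument", "signature: add arg or use default"),
    (r"TypeError: .*unexpected keyword", "kwarg: update kwarg"),
    (r"(ImportError|ModuleNotFoundError)", "import: update import"),
    (r"AssertionError: Expected .* called", "mock: loosen/delete mock"),
    (r"AssertionError", "assertion: triage"),
]

def classify(error_line: str) -> str:
    """Bucket a pytest error line into a triage action."""
    for pattern, action in PATTERNS:
        if re.search(pattern, error_line):
            return action
    return "unknown: read the traceback"
```

For example, `classify("AttributeError: 'Order' has no attribute 'total'")` returns the call-site action, so a run's failures can be counted per bucket before any fixing starts.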
## Automating the mechanical fixes
Python's runtime errors point straight at the problem. For signature changes across many tests:
```python
# conftest.py — one-time shim during migration
# OLD: Order(items, region)   NEW: Order(items, region, currency) — currency now required
# Tests still call Order(items, region). Temporary compat:
import pytest

from orders.models import Order  # wherever Order now lives


@pytest.fixture(autouse=True)
def _order_compat(mocker):
    orig_init = Order.__init__

    def compat_init(self, items, region, currency="USD"):
        orig_init(self, items, region, currency)

    mocker.patch.object(Order, "__init__", compat_init)
```
This is a bridge, not a fix. All tests pass → remove the shim → fix tests one by one as you touch them. Don't leave compat shims in permanently.
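The same bridge works outside pytest-mock, e.g. in a throwaway migration script. A runnable miniature with a toy `Order` (the class and field names are illustrative) whose new `__init__` requires `currency`:

```python
class Order:
    # New signature: currency is now a required third argument.
    def __init__(self, items, region, currency):
        self.items, self.region, self.currency = items, region, currency


# Temporary compat: wrap __init__ so old two-argument call sites still work.
_orig_init = Order.__init__

def _compat_init(self, items, region, currency="USD"):
    _orig_init(self, items, region, currency)

Order.__init__ = _compat_init

old_style = Order(["widget"], "EU")  # old call site, still constructs
print(old_style.currency)            # "USD" filled in by the shim
```

Using `mocker.patch.object` as in the conftest version is preferable in tests because pytest-mock undoes the patch automatically after each test.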
## Mock coupling — the `mocker.patch` problem
```python
def test_process(mocker):
    mock_validate = mocker.patch("orders.service._validate_order")
    mock_save = mocker.patch("orders.service._save_order")

    process(order)

    mock_validate.assert_called_once_with(order)
    mock_save.assert_called_once()
```
`_validate_order` was inlined into `process`. The test fails: `Expected '_validate_order' to have been called once. Called 0 times.`

The behavior didn't change — the order is still validated — but the test was asserting structure. Delete the mock assertion. The test should assert the outcome of validation:
```python
def test_process_rejects_invalid():
    bad = Order(items=[])
    with pytest.raises(InvalidOrder, match="empty"):
        process(bad)
```
This survives refactors because it tests what validation does, not that a function named _validate_order was called.
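Runnable in miniature — toy `Order`, `InvalidOrder`, and `process` stand-ins for the hypothetical names above, with validation inlined so there is nothing left to patch:

```python
import pytest


class InvalidOrder(Exception):
    pass


class Order:
    def __init__(self, items):
        self.items = items


def process(order):
    # Validation is now inlined into process — no _validate_order to patch.
    if not order.items:
        raise InvalidOrder("order has empty items")
    return order


def test_process_rejects_invalid():
    with pytest.raises(InvalidOrder, match="empty"):
        process(Order(items=[]))
```

The test never names an internal function, so inlining, renaming, or splitting the validation step cannot break it — only an actual behavior change can.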
## Snapshot updates — review first
If using syrupy / pytest-snapshot:

```shell
pytest --snapshot-update
```
This updates all failing snapshots to current output. Dangerous if any failure is a regression. Workflow:

- Run without `--snapshot-update`. Read every diff.
- For each diff: intentional change, or regression?
- Only after confirming all are intentional: `--snapshot-update`.
- Commit the snapshot changes with a message explaining why they changed.
## Assertion triage — same as Java
```python
assert invoice.total == Decimal("27.80")  # fails: actual 27.55
```
`git log -p -- src/pricing.py` → "Fix: half-even rounding." Intentional. Update with the reason:
```python
# abc123: half-even rounding fix. Was 27.80 (half-up bug).
assert invoice.total == Decimal("27.55")
```
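The two rounding modes diverge exactly on ties, which is where per-line-item totals drift. An illustrative tie (the `2.665` value is made up, not from the invoice example):

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

cents = Decimal("0.01")
tie = Decimal("2.665")  # exactly halfway between 2.66 and 2.67

half_up = tie.quantize(cents, rounding=ROUND_HALF_UP)      # old buggy behavior
half_even = tie.quantize(cents, rounding=ROUND_HALF_EVEN)  # banker's rounding

assert half_up == Decimal("2.67")
assert half_even == Decimal("2.66")  # ties go to the even last digit
```

One cent per tied line item compounds across an invoice, which is how an expected total moves from 27.80 to 27.55.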
Versus:
```python
assert len(results) == 5  # fails: actual 4
```
The change was to a completely different module. Why are there fewer results? Regression. Don't update. Investigate.
## `pytest.approx` drift
```python
assert score == 0.8472819  # now fails: 0.8472820
```
Last digit changed — floating-point ops reordered. If precision to 7 decimals isn't spec'd, this was over-tight:
```python
assert score == pytest.approx(0.8473, rel=1e-4)
```
This isn't "loosening to make it pass." It's fixing an over-specific assertion that should never have been that tight.
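Why reordering float operations moves the last digit, in two lines (pytest assumed available; the values are the classic IEEE 754 demonstration, not from the score example):

```python
import pytest

a = 0.1 + 0.2 + 0.3  # left-to-right: 0.1 + 0.2 already carries rounding error
b = 0.3 + 0.2 + 0.1  # left-to-right: 0.3 + 0.2 is exactly 0.5, so b comes out 0.6

assert a != b                 # exact equality is order-sensitive
assert a == pytest.approx(b)  # approx (default rel tolerance 1e-6) is not
```

Any refactor that changes summation order — vectorizing a loop, reordering terms — can flip such a comparison, which is why exact float assertions are over-tight by construction.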
## Do not
- Do not blindly run `--snapshot-update`. Review diffs first. A regression in a snapshot looks identical to an intentional change.
- Do not fix `mocker.patch("module._private")` failures by updating the patch path. You're chasing implementation. Replace with behavioral assertions.
- Do not widen `pytest.approx` tolerance until the test passes. If `rel=0.5` is what it takes, the test isn't testing anything.
- Do not skip investigating assertion failures in "unrelated" tests. Unrelated is where regressions hide.
- Do not leave compat shims in `conftest.py` after the migration. They mask further API drift.
## Output format

```markdown
## Failing tests
Total: <N>  Import/Attribute: <N>  Signature: <N>  Assertion: <N>  Mock: <N>  Snapshot: <N>

## Mechanical fixes
| Test | Error | Fix |
| ---- | ----- | --- |

## Mock decoupling
| Test | Over-coupled patch | Replacement assertion |
| ---- | ------------------ | --------------------- |

## Assertion triage
| Test | Old | New | Cause commit | Classification | Action |
| ---- | --- | --- | ------------ | -------------- | ------ |

## Snapshot review
| Snapshot | Diff summary | Intentional? |
| -------- | ------------ | ------------ |

## Regressions
<tests correctly failing — file bugs, don't update>

## After
Passing: <N>  Updated: <N>  Decoupled: <N>  Deleted: <N>  Bugs filed: <N>
```