mutation-testing
SKILL.md
Mutation Testing Skill
觸發時機
- 單元測試覆蓋率達標(如 > 80%)但仍需確認測試品質時
- 核心演算法或高風險模組開發完成後
- CI/CD 流程中的品質閘門(Quality Gate)階段
- 使用者要求「驗證測試有效性」時
- PR Review 時發現測試僅做 smoke test 而缺乏深度驗證
核心任務
執行變異測試(Mutation Testing),量化測試套件的品質,找出「倖存的變異體(Surviving Mutants)」,並據此強化測試案例。
為什麼需要變異測試?
單元測試覆蓋率(Line Coverage)只能告訴你「程式碼被執行到了」,但無法告訴你「測試是否驗證了正確的行為」。
"Coverage only checks if the code is executed, Mutation Testing checks if the code is verified."
運作原理
- 變異 (Mutate):工具自動修改原始碼的一小部分(例如將
a + b改為a - b,或將return true改為return false),產生一個「變異體 (Mutant)」。 - 測試 (Test):針對這個變異體執行現有的測試套件。
- 判定 (Verdict):
- Killed (殺死):如果測試失敗(紅燈),表示測試成功偵測到了這個變異,這是好事。
- Survived (倖存):如果測試仍然通過(綠燈),表示測試無法偵測這個錯誤,這是壞事。
工具選擇
Java — PITest (PIT)
目前 Java 生態系最成熟的變異測試工具。
<!-- pom.xml configuration -->
<plugin>
<groupId>org.pitest</groupId>
<artifactId>pitest-maven</artifactId>
<version>1.15.0</version>
<dependencies>
<dependency>
<groupId>org.pitest</groupId>
<artifactId>pitest-junit5-plugin</artifactId>
<version>1.2.1</version>
</dependency>
</dependencies>
<configuration>
<targetClasses>
<param>com.yourdomain.core.*</param>
</targetClasses>
<targetTests>
<param>com.yourdomain.core.*Test</param>
</targetTests>
<mutators>
<mutator>STRONGER</mutator>
</mutators>
<outputFormats>
<outputFormat>HTML</outputFormat>
<outputFormat>XML</outputFormat>
</outputFormats>
<timestampedReports>false</timestampedReports>
</configuration>
</plugin>
執行:
mvn pitest:mutationCoverage
# 報告位置:target/pit-reports/index.html
TypeScript / JavaScript — Stryker Mutator
npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner @stryker-mutator/typescript-checker
npx stryker init # 互動式建立 stryker.config.mjs
// stryker.config.mjs
/** @type {import('@stryker-mutator/api/core').PartialStrykerOptions} */
export default {
mutate: [
'src/domain/**/*.ts',
'src/application/**/*.ts',
'!src/**/*.spec.ts',
'!src/**/*.test.ts',
],
testRunner: 'jest',
checkers: ['typescript'],
reporters: ['html', 'clear-text', 'progress'],
coverageAnalysis: 'perTest',
thresholds: {
high: 80,
low: 60,
break: 50, // mutation score 低於此值則 CI 失敗
},
};
執行:
npx stryker run
# 報告位置:reports/mutation/html/index.html
Go — Gremlins
go install github.com/go-gremlins/gremlins/cmd/gremlins@latest
# 針對特定 package 執行
gremlins unleash ./internal/domain/...
# 指定測試 timeout(避免無窮迴圈變異體)
gremlins unleash --timeout 30s ./internal/domain/...
# 輸出 JSON 報告
gremlins unleash --output json ./internal/domain/... > mutation-report.json
Rust — cargo-mutants
cargo install cargo-mutants
# 基本執行
cargo mutants
# 針對特定模組
cargo mutants --file src/domain/aggregate.rs
# 排除測試程式碼
cargo mutants --exclude '*_test.rs' --exclude 'tests/*'
# 並行執行(加速)
cargo mutants --jobs 4
Python — mutmut
pip install mutmut
# 執行變異測試
mutmut run --paths-to-mutate src/domain/
# 查看倖存者
mutmut results
# 顯示特定變異的 diff
mutmut show 42
變異算子 (Mutators)
常見的變異類型包括:
| 類別 | 原始碼 | 變異後 | 範例 |
|---|---|---|---|
| Conditionals Boundary | i < 10 |
i <= 10 |
邊界條件錯誤 |
| Negate Conditionals | a == b |
a != b |
邏輯反轉 |
| Math | a + b |
a - b |
算術錯誤 |
| Increments | i++ |
i-- |
方向錯誤 |
| Invert Negatives | -i |
i |
符號錯誤 |
| Return Values | return true |
return false |
回傳值錯誤 |
| Void Method Calls | list.add(x) |
(移除呼叫) | 副作用遺失 |
| Empty Returns | return list |
return [] |
空集合回傳 |
針對 Domain Logic 的高價值變異
以下變異特別適合用於驗證 Domain 層的正確性:
Invariant 驗證:
原始: if (quantity < 0) throw new InvalidQuantityException()
變異: if (quantity <= 0) throw new InvalidQuantityException() ← 邊界條件
變異: (移除整個 if 檢查) ← 遺失守衛
Domain Event 發布:
原始: domainEvents.add(new OrderCreatedEvent(orderId))
變異: (移除 add 呼叫) ← 事件遺失
狀態轉換:
原始: this.status = OrderStatus.CONFIRMED
變異: this.status = OrderStatus.CANCELLED ← 錯誤狀態
分析流程
1. 執行單元測試 (必須全數通過)
↓
2. 選擇目標範圍 (Domain / Application 層)
↓
3. 執行變異測試
↓
4. 產生報告
↓
5. 分析倖存變異體 (Analyze Surviving Mutants)
↓
6. 分類倖存原因 (Missing Assertion / Weak Assertion / Equivalent Mutant)
↓
7. 強化測試案例 (Add/Refine Test Cases)
↓
8. 重複直到 Mutation Score 達標
判定標準
| 指標 | 建議閾值 | 說明 |
|---|---|---|
| Line Coverage | > 85% | 基礎要求 |
| Mutation Score | > 80% | 殺死的變異體 / 總變異體 |
| Test Strength | > 80% | 只計算有被覆蓋到的程式碼的變異分數 |
常見的倖存原因與對策
1. 缺乏斷言 (Missing Assertion)
測試執行了代碼,但沒有檢查結果。
Java 範例:
// BAD: 只執行,不驗證
@Test
void createOrder() {
orderService.createOrder(input);
// 沒有任何 assert!
}
// GOOD: 驗證狀態與事件
@Test
void createOrder_shouldPublishEvent() {
var result = orderService.createOrder(input);
assertThat(result.getStatus()).isEqualTo(OrderStatus.CREATED);
assertThat(result.getDomainEvents())
.hasSize(1)
.first()
.isInstanceOf(OrderCreatedEvent.class);
}
TypeScript 範例:
// BAD
it('should create order', () => {
orderService.createOrder(input);
});
// GOOD
it('should create order and publish event', () => {
const result = orderService.createOrder(input);
expect(result.status).toBe(OrderStatus.CREATED);
expect(result.domainEvents).toHaveLength(1);
expect(result.domainEvents[0]).toBeInstanceOf(OrderCreatedEvent);
});
2. 斷言過於寬鬆 (Weak Assertion)
只檢查了部分狀態(例如只檢查 List 不為空,沒檢查內容)。
Go 範例:
// BAD: 只檢查長度
func TestGetOrders(t *testing.T) {
orders := repo.FindByCustomer(customerID)
assert.NotEmpty(t, orders)
}
// GOOD: 檢查具體屬性
func TestGetOrders(t *testing.T) {
orders := repo.FindByCustomer(customerID)
assert.Len(t, orders, 2)
assert.Equal(t, "ORD-001", orders[0].ID)
assert.Equal(t, OrderStatusActive, orders[0].Status)
}
3. 邊界條件未涵蓋 (Boundary Not Covered)
變異算子修改邊界(如 < → <=),但測試只用了中間值。
Rust 範例:
// Domain rule: quantity must be between 1 and 999
impl OrderItem {
pub fn new(quantity: u32) -> Result<Self, DomainError> {
if quantity < 1 || quantity > 999 {
return Err(DomainError::InvalidQuantity);
}
Ok(Self { quantity })
}
}
// BAD: 只測中間值
#[test]
fn test_valid_quantity() {
assert!(OrderItem::new(10).is_ok());
}
// GOOD: 測邊界值
#[test]
fn test_quantity_boundaries() {
// 下界
assert!(OrderItem::new(0).is_err()); // < 1 → 拒絕
assert!(OrderItem::new(1).is_ok()); // == 1 → 接受
// 上界
assert!(OrderItem::new(999).is_ok()); // == 999 → 接受
assert!(OrderItem::new(1000).is_err()); // > 999 → 拒絕
}
4. 等價變異 (Equivalent Mutant)
變異後的代碼在邏輯上與原代碼等價,不可能被殺死。
// 原始碼
for (int i = 0; i < list.size(); i++) { ... }
// 等價變異:改成 != 在此場景下行為相同
for (int i = 0; i != list.size(); i++) { ... }
- 對策:標記為 False Positive 或忽略。大多數工具會回報這些為
no_coverage或survived,需人工判斷。
5. Void Method 副作用未驗證 (Side Effect Not Verified)
// BAD: repository.save() 被移除也不會被偵測到
it('should update aggregate', () => {
service.updateOrder(command);
// 沒有 verify save 被呼叫
});
// GOOD: 驗證副作用
it('should persist updated aggregate', () => {
service.updateOrder(command);
expect(mockRepo.save).toHaveBeenCalledWith(
expect.objectContaining({ id: orderId, status: 'UPDATED' })
);
});
與 Problem Frames 的整合
針對 Frame Type 的變異策略
| Frame Type | 重點變異目標 | 關鍵測試補強 |
|---|---|---|
| CBF (Command) | Aggregate invariants, Use Case pre/post-conditions, Domain Events | 驗證狀態轉換完整性、事件發布 |
| IDF (Query) | Read Model mapping, Caching logic, Filter conditions | 驗證資料轉換正確性、快取失效 |
| RIF (Reactor) | Idempotency checks, Retry logic, Event handler conditions | 驗證冪等性、重試策略、DLQ 處理 |
範例:CBF Aggregate 變異分析
# 待測 Aggregate: Order
mutation_analysis:
target: src/domain/order/Order.java
frame_type: CommandedBehaviorFrame
focus_areas:
- name: "Invariant enforcement"
mutators: [CONDITIONALS_BOUNDARY, NEGATE_CONDITIONALS, REMOVE_CONDITIONALS]
expected_kill_rate: "> 95%"
- name: "State transitions"
mutators: [RETURN_VALUES, VOID_METHOD_CALLS]
expected_kill_rate: "> 90%"
- name: "Domain event publishing"
mutators: [VOID_METHOD_CALLS, EMPTY_RETURNS]
expected_kill_rate: "> 90%"
- name: "Pre-condition checks"
mutators: [NEGATE_CONDITIONALS, REMOVE_CONDITIONALS]
expected_kill_rate: "> 95%"
CI/CD 整合
GitHub Actions 範例
# .github/workflows/mutation-test.yml
name: Mutation Testing
on:
pull_request:
paths:
- 'src/domain/**'
- 'src/application/**'
jobs:
mutation-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run unit tests first
run: mvn test
- name: Run mutation testing
run: mvn pitest:mutationCoverage
- name: Check mutation score threshold
run: |
# PIT mutations.xml has per-mutation results; calculate score from totals
KILLED=$(grep -c 'status="KILLED"' target/pit-reports/mutations.xml) || KILLED=0
TOTAL=$(grep -c '<mutation ' target/pit-reports/mutations.xml) || TOTAL=0
if [ "$TOTAL" -eq 0 ]; then
echo "No mutations found"; exit 1
fi
SCORE=$(( KILLED * 100 / TOTAL ))
if [ "$SCORE" -lt 80 ]; then
echo "Mutation score $SCORE% is below threshold (80%)"
exit 1
fi
- name: Upload mutation report
if: always()
uses: actions/upload-artifact@v4
with:
name: mutation-report
path: target/pit-reports/
TypeScript (Stryker) CI
jobs:
mutation-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: npm ci
- run: npm test
- run: npx stryker run
# stryker.config.mjs 中的 thresholds.break 會控制 exit code
交付產物
執行此 Skill 後,應產出:
- Mutation Report Analysis:分析主要倖存原因,按 Frame Concern 歸類。
- Improved Test Suite:新增或修改的測試案例,能殺死之前的倖存者。
- Confidence Assessment:對該模組品質的信心評估。
- Surviving Mutant Summary:記錄無法殺死的等價變異,附上理由。
輸出格式範例
## Mutation Testing Report: Order Aggregate
### Summary
| Metric | Value |
|--------|-------|
| Total Mutants | 127 |
| Killed | 108 |
| Survived | 14 |
| Equivalent (ignored) | 5 |
| Mutation Score | 88.5% |
### Surviving Mutants (by priority)
1. **INV-001: Quantity boundary** (Conditionals Boundary)
- Location: `Order.java:45`
- Mutation: `quantity < 1` → `quantity <= 1`
- Impact: High — allows zero quantity orders
- Fix: Add boundary test for `quantity = 0` and `quantity = 1`
2. **EVT-001: Event not published** (Void Method Call)
- Location: `Order.java:72`
- Mutation: removed `addDomainEvent(new OrderConfirmedEvent(...))`
- Impact: High — silent event loss
- Fix: Add assertion verifying event list after confirm()
品質閘門 Checklist
- 單元測試全部通過
- Mutation Score >= 80%
- Domain 層 (Entities, Value Objects) Mutation Score >= 90%
- 所有 Invariant 相關的變異體都被殺死
- Domain Event 發布的變異體都被殺死
- 倖存的等價變異體已標記並說明理由
- 新增的測試遵循
coding-standards規範
注意:變異測試極其耗時。建議只針對核心 Domain Logic (Entities, Value Objects, Domain Services) 執行,避免對整個專案執行。使用
coverageAnalysis: 'perTest'(Stryker)或 PITest 的targetClasses來限縮範圍。
Weekly Installs
2
Repository
knowlet/pf-skillsGitHub Stars
2
First Seen
Today
Security Audits
Installed on
mcpjam2
claude-code2
kilo2
replit2
junie2
windsurf2