test-oracle-generator
Test Oracle Generator
Generate automated test oracles to verify correct software behavior across multiple oracle types.
Oracle Types
- Assertion-based: Compare actual vs expected output
- Property-based: Verify invariants that should always hold
- Differential: Compare new implementation against reference
- Metamorphic: Test input-output transformations
Workflow
Step 1: Analyze the Function Under Test
Understand what the function does and its expected behavior.
Checklist:
- Read function signature and docstring
- Identify input parameters and types
- Identify return type and possible values
- Note preconditions and postconditions
- Check for edge cases (empty input, null, boundary values)
Example Analysis:
# Python
def calculate_discount(price: float, discount_percent: int) -> float:
    """Calculate discounted price.

    Args:
        price: Original price (must be positive)
        discount_percent: Discount percentage (0-100)

    Returns:
        Discounted price
    """
    return price * (1 - discount_percent / 100)
Analysis:
- Inputs: price (float, must be > 0), discount_percent (int, 0-100)
- Output: float (discounted price)
- Invariants: Result should be ≤ original price, result should be ≥ 0
- Edge cases: 0% discount, 100% discount, boundary values
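Part of the checklist above can be automated. The following is a minimal sketch, assuming the function carries type hints; the helper name `summarize_signature` is hypothetical and only illustrates reading the signature with the standard library.

```python
import inspect
from typing import get_type_hints


def calculate_discount(price: float, discount_percent: int) -> float:
    """Calculate discounted price."""
    return price * (1 - discount_percent / 100)


def summarize_signature(func):
    """Hypothetical helper: collect parameter and return type names of func."""
    hints = get_type_hints(func)
    params = {
        name: hints.get(name, object).__name__
        for name in inspect.signature(func).parameters
    }
    return {"params": params, "returns": hints.get("return", object).__name__}


summary = summarize_signature(calculate_discount)
print(summary)
# {'params': {'price': 'float', 'discount_percent': 'int'}, 'returns': 'float'}
```

Preconditions such as "must be positive" live only in the docstring, so they still have to be read and recorded by hand.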
Step 2: Generate Assertion-Based Oracles
Create explicit expected value assertions for common cases.
Template:
# Python (pytest)
def test_<function>_<scenario>():
    # Arrange
    input1 = <value>
    input2 = <value>
    expected = <calculated_expected_value>
    # Act
    actual = function_under_test(input1, input2)
    # Assert
    assert actual == expected, f"Expected {expected}, got {actual}"

// Java (JUnit)
@Test
public void test<Function><Scenario>() {
    // Arrange
    Type input1 = <value>;
    Type input2 = <value>;
    Type expected = <calculated_expected_value>;
    // Act
    Type actual = functionUnderTest(input1, input2);
    // Assert
    assertEquals(expected, actual, "Expected and actual should match");
}
Example:
def test_calculate_discount_50_percent():
    # Arrange
    price = 100.0
    discount = 50
    expected = 50.0
    # Act
    actual = calculate_discount(price, discount)
    # Assert (use a tolerance rather than == for floating point)
    assert abs(actual - expected) < 0.01
Generate oracles for:
- ✓ Typical cases (middle of valid range)
- ✓ Boundary values (min/max)
- ✓ Edge cases (empty, zero, null)
- ✓ Special values (negative, infinity for numeric types)
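The categories above can be covered with one table of cases. This is a sketch only: the rows in `CASES` and the helper `check_cases` are illustrative assumptions, and `math.isclose` stands in for whatever tolerance check the test framework provides.

```python
import math


def calculate_discount(price, discount_percent):
    return price * (1 - discount_percent / 100)


# (price, discount_percent, expected); illustrative values, not a complete set.
CASES = [
    (100.0, 50, 50.0),         # typical case
    (100.0, 0, 100.0),         # boundary: minimum discount
    (100.0, 100, 0.0),         # boundary: maximum discount
    (0.01, 10, 0.009),         # edge: very small price
    (math.inf, 10, math.inf),  # special value: infinity
]


def check_cases(func, cases):
    """Return the rows where func's output is not within tolerance of expected."""
    failures = []
    for price, pct, expected in cases:
        actual = func(price, pct)
        if not math.isclose(actual, expected, rel_tol=1e-9, abs_tol=1e-9):
            failures.append((price, pct, expected, actual))
    return failures


print(check_cases(calculate_discount, CASES))  # []
```

Returning the failing rows instead of stopping at the first mismatch makes it easier to see whether one boundary or a whole category is broken.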
Step 3: Generate Property-Based Oracles
Identify invariants and properties that should always hold.
Common Properties:
- Range properties: Output within expected range
- Relationship properties: Output relates to input in specific way
- Conservation properties: Something is preserved (e.g., list length)
- Idempotence: Applying function twice gives same result as once
- Commutativity: Order of inputs doesn't matter
- Associativity: Grouping doesn't matter
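Several of these properties can be checked without a property-testing library. The sketch below assumes a sorting function as the subject and uses only the standard library's `random`; `check_sort_properties` is a hypothetical helper name.

```python
import random
from collections import Counter


def check_sort_properties(sort_fn, trials=200, seed=0):
    """Check conservation, idempotence, and ordering on random integer lists."""
    rng = random.Random(seed)  # fixed seed keeps the check deterministic
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = sort_fn(data)
        # Conservation: same length and same multiset of elements
        assert len(out) == len(data)
        assert Counter(out) == Counter(data)
        # Idempotence: sorting an already sorted list changes nothing
        assert sort_fn(out) == out
        # Relationship: every element is <= its successor
        assert all(a <= b for a, b in zip(out, out[1:]))
    return True


print(check_sort_properties(sorted))  # True
```

A sort that silently drops duplicates, such as `lambda xs: sorted(set(xs))`, violates the conservation checks even though its output is ordered.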
Template (Python with hypothesis):
from hypothesis import given, strategies as st

@given(st.floats(min_value=0.01, max_value=10000),
       st.integers(min_value=0, max_value=100))
def test_discount_properties(price, discount_percent):
    result = calculate_discount(price, discount_percent)
    # Property: Result should never exceed original price
    assert result <= price, "Discount should not increase price"
    # Property: Result should be non-negative
    assert result >= 0, "Price cannot be negative"
    # Property: 0% discount returns original price
    if discount_percent == 0:
        assert abs(result - price) < 0.01
    # Property: 100% discount returns 0
    if discount_percent == 100:
        assert abs(result) < 0.01
Template (Java with JUnit 4 Theories and junit-quickcheck's @ForAll and @InRange):
@Theory
public void discountProperties(
        @ForAll @InRange(min = "0.01", max = "10000") double price,
        @ForAll @InRange(min = "0", max = "100") int discountPercent) {
    double result = calculateDiscount(price, discountPercent);
    // JUnit 4 assertions take the message as the first argument
    // Property: Result should never exceed original price
    assertTrue("Discount should not increase price", result <= price);
    // Property: Result should be non-negative
    assertTrue("Price cannot be negative", result >= 0);
}
Identify properties by asking:
- What can I say about the output without computing it exactly?
- What relationships must hold between input and output?
- What constraints must the output satisfy?
- What should never happen?
For detailed property patterns, see references/property_patterns.md.
Step 4: Generate Differential Oracles
Compare new implementation against reference implementation.
Use Cases:
- Refactoring: New optimized version vs old version
- Migration: New library/language vs legacy system
- Bug fixes: Patched version vs unpatched version
Template:
# Python
def test_new_vs_legacy_implementation():
    # Test data
    test_cases = [
        (100.0, 10),
        (50.0, 25),
        (200.0, 0),
        (75.0, 100),
    ]
    for price, discount in test_cases:
        # Compare outputs
        legacy_result = legacy_calculate_discount(price, discount)
        new_result = calculate_discount(price, discount)
        assert abs(legacy_result - new_result) < 0.01, \
            f"Mismatch for ({price}, {discount}): " \
            f"legacy={legacy_result}, new={new_result}"

// Java
@Test
public void testNewVsLegacyImplementation() {
    Object[][] testCases = {
        {100.0, 10},
        {50.0, 25},
        {200.0, 0},
        {75.0, 100}
    };
    for (Object[] testCase : testCases) {
        double price = (double) testCase[0];
        int discount = (int) testCase[1];
        double legacyResult = LegacyClass.calculateDiscount(price, discount);
        double newResult = calculateDiscount(price, discount);
        assertEquals(legacyResult, newResult, 0.01,
            String.format("Mismatch for (%f, %d)", price, discount));
    }
}
Best Practices:
- Generate diverse test data (random, boundary, edge cases)
- Include both typical and unusual inputs
- Log differences for debugging
- Treat performance differences as acceptable; compare outputs, not timings
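These practices can be combined in one randomized comparison. The sketch below is an assumption-laden illustration: `legacy_calculate_discount` is a hypothetical stand-in for the reference implementation, and `differential_check` is a hypothetical helper that mixes fixed boundary inputs with seeded random inputs and collects every mismatch for debugging.

```python
import math
import random


def legacy_calculate_discount(price, discount_percent):
    # Hypothetical reference implementation: subtract the discount amount.
    return price - price * discount_percent / 100


def calculate_discount(price, discount_percent):
    return price * (1 - discount_percent / 100)


def differential_check(new_fn, ref_fn, trials=1000, seed=42, tol=1e-6):
    """Return (price, percent, reference, new) for every disagreeing input."""
    rng = random.Random(seed)  # seeded so failures are reproducible
    inputs = [(0.01, 0), (0.01, 100), (10000.0, 0), (10000.0, 100)]  # boundaries
    inputs += [(rng.uniform(0.01, 10000.0), rng.randint(0, 100))
               for _ in range(trials)]
    mismatches = []
    for price, pct in inputs:
        new, ref = new_fn(price, pct), ref_fn(price, pct)
        if not math.isclose(new, ref, rel_tol=tol, abs_tol=tol):
            mismatches.append((price, pct, ref, new))
    return mismatches


print(differential_check(calculate_discount, legacy_calculate_discount))  # []
```

Keeping the full mismatch list, not just a pass/fail flag, shows whether disagreements cluster around a boundary or are spread across the input range.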
Step 5: Generate Metamorphic Oracles
Create test pairs where input transformation produces predictable output transformation.
Metamorphic Relations:
- Additive: f(x) + f(y) = f(x + y)
- Multiplicative: f(k × x) = k × f(x)
- Permutation: f(permute(x)) = permute(f(x))
- Subset: f(subset(x)) ⊆ f(x)
- Inverse: f(f⁻¹(x)) = x
Example for discount function:
# Python
def test_discount_metamorphic_double_price():
    """If price doubles, discount amount doubles."""
    price = 100.0
    discount_percent = 20
    result1 = calculate_discount(price, discount_percent)
    result2 = calculate_discount(price * 2, discount_percent)
    discount_amount1 = price - result1
    discount_amount2 = (price * 2) - result2
    # Metamorphic relation: doubling price doubles discount amount
    assert abs(discount_amount2 - 2 * discount_amount1) < 0.01

def test_discount_metamorphic_additive():
    """Discounting a combined price equals the sum of the individually discounted prices."""
    price1 = 50.0
    price2 = 30.0
    discount_percent = 15
    # Method 1: Discount on combined price
    combined_result = calculate_discount(price1 + price2, discount_percent)
    # Method 2: Sum of individually discounted prices
    individual_sum = (calculate_discount(price1, discount_percent) +
                      calculate_discount(price2, discount_percent))
    # Metamorphic relation: Should be equivalent
    assert abs(combined_result - individual_sum) < 0.01
Example for sorting function:
def test_sort_metamorphic_reverse():
    """Reversing then sorting gives same result as sorting."""
    input_list = [3, 1, 4, 1, 5, 9, 2, 6]
    result1 = sort(input_list)
    result2 = sort(list(reversed(input_list)))
    assert result1 == result2

def test_sort_metamorphic_duplicate():
    """Sorting a list concatenated with itself yields each sorted element twice."""
    input_list = [3, 1, 4]
    duplicated = input_list + input_list
    result = sort(duplicated)
    # Each element of the sorted original should appear twice, in order
    expected = [x for x in sort(input_list) for _ in range(2)]
    assert result == expected
For more metamorphic relation patterns, see references/metamorphic_patterns.md.
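The permutation and subset relations listed earlier can be sketched the same way. The functions `apply_discounts` and `find_expensive` below are hypothetical examples chosen only because those relations hold for them; they are not part of the discount function above.

```python
import random


def apply_discounts(prices, discount_percent):
    """Hypothetical element-wise function: discount every price in a list."""
    return [p * (1 - discount_percent / 100) for p in prices]


def find_expensive(prices, threshold):
    """Hypothetical filter: the set of prices above a threshold."""
    return {p for p in prices if p > threshold}


def check_permutation_relation(prices, discount_percent, seed=0):
    """f(permute(x)) == permute(f(x)) for an element-wise function."""
    order = list(range(len(prices)))
    random.Random(seed).shuffle(order)
    permuted_input = [prices[i] for i in order]
    base = apply_discounts(prices, discount_percent)
    return apply_discounts(permuted_input, discount_percent) == [base[i] for i in order]


def check_subset_relation(prices, threshold):
    """f(subset(x)) is a subset of f(x) for a filter."""
    subset = prices[::2]  # every other element
    return find_expensive(subset, threshold) <= find_expensive(prices, threshold)


prices = [10.0, 250.0, 40.0, 999.0, 75.5]
print(check_permutation_relation(prices, 20))  # True
print(check_subset_relation(prices, 50.0))     # True
```

Neither check needs to know the correct discounted values, which is the main appeal of metamorphic oracles.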
Step 6: Combine Oracles for Comprehensive Testing
Use multiple oracle types together for robust verification.
Example: Complete test suite for calculate_discount:
import pytest
from hypothesis import given, strategies as st

# Assertion-based oracles
class TestDiscountAssertions:
    def test_50_percent_discount(self):
        assert calculate_discount(100.0, 50) == pytest.approx(50.0)

    def test_no_discount(self):
        assert calculate_discount(100.0, 0) == pytest.approx(100.0)

    def test_full_discount(self):
        assert calculate_discount(100.0, 100) == pytest.approx(0.0)

# Property-based oracles
class TestDiscountProperties:
    @given(st.floats(min_value=0.01, max_value=10000),
           st.integers(min_value=0, max_value=100))
    def test_result_within_bounds(self, price, discount):
        result = calculate_discount(price, discount)
        assert 0 <= result <= price

    @given(st.floats(min_value=0.01, max_value=10000),
           st.integers(min_value=0, max_value=99))
    def test_monotonic_in_discount(self, price, discount):
        """Higher discount percentage never means a higher price."""
        result1 = calculate_discount(price, discount)
        result2 = calculate_discount(price, discount + 1)
        assert result2 <= result1

# Differential oracles
class TestDiscountDifferential:
    @pytest.mark.parametrize("price,discount", [
        (100.0, 10), (50.0, 25), (200.0, 50)
    ])
    def test_vs_manual_calculation(self, price, discount):
        """Reference: subtract the discount amount, written independently of the implementation."""
        result = calculate_discount(price, discount)
        expected = price - price * discount / 100
        assert abs(result - expected) < 0.01

# Metamorphic oracles
class TestDiscountMetamorphic:
    def test_double_price_doubles_discount_amount(self):
        price = 100.0
        discount = 20
        discount_amt1 = price - calculate_discount(price, discount)
        discount_amt2 = (price * 2) - calculate_discount(price * 2, discount)
        assert abs(discount_amt2 - 2 * discount_amt1) < 0.01
Step 7: Document and Validate Oracles
Ensure oracles are correct and well-documented.
Oracle Documentation Template:
def test_function_oracle_type():
    """Brief description of what this oracle verifies.

    Oracle Type: [Assertion-based|Property-based|Differential|Metamorphic]
    Rationale: Explain why this property/assertion should hold.
    Edge Cases Covered:
    - Case 1
    - Case 2
    """
    # Test implementation
    pass
Validation Checklist:
- Oracle passes on correct implementation
- Oracle fails on intentionally broken implementation (mutation testing)
- Oracle is deterministic (same input → same result)
- Oracle is independent (doesn't rely on other test state)
- Oracle has clear failure messages
- Oracle is documented with rationale
Mutation Testing (validate oracle effectiveness):
import pytest

# Introduce a deliberate bug to verify the oracle catches it
def calculate_discount_buggy(price, discount_percent):
    # BUG: returns the discount amount instead of the discounted price
    return price * discount_percent / 100  # Should be: price * (1 - discount_percent / 100)

# Oracle should fail on the buggy version
def test_oracle_detects_bug():
    """Verify the oracle catches the bug."""
    # Use 20%, not 50%: at 50% the buggy and correct formulas both return 50.0
    with pytest.raises(AssertionError):
        assert calculate_discount_buggy(100, 20) == 80.0  # Buggy version returns 20.0
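The same idea extends to several mutants at once. The sketch below is a hand-rolled illustration, not a mutation-testing tool: the `MUTANTS` table, both oracles, and `surviving_mutants` are hypothetical names. It also shows how the choice of input matters, since an oracle that checks only the 50% case cannot tell the discounted price from the discount amount.

```python
# Hand-written mutants of calculate_discount; names are illustrative.
MUTANTS = {
    "returns_discount_amount": lambda p, d: p * d / 100,
    "plus_instead_of_minus": lambda p, d: p * (1 + d / 100),
    "ignores_discount": lambda p, d: p,
}


def oracle_20_percent(func):
    """Assertion-based oracle: 20% off 100.0 should be 80.0."""
    assert abs(func(100.0, 20) - 80.0) < 0.01


def oracle_50_percent(func):
    """Weaker oracle: 50% off 100.0 should be 50.0."""
    assert abs(func(100.0, 50) - 50.0) < 0.01


def surviving_mutants(oracle_fn, mutants):
    """Return the names of mutants the oracle fails to detect."""
    survivors = []
    for name, mutant in mutants.items():
        try:
            oracle_fn(mutant)
        except AssertionError:
            continue  # oracle failed on the mutant, so the mutant is killed
        survivors.append(name)
    return survivors


print(surviving_mutants(oracle_20_percent, MUTANTS))  # []
print(surviving_mutants(oracle_50_percent, MUTANTS))  # ['returns_discount_amount']
```

A surviving mutant points at a gap in the oracle, usually an input where the bug and the correct code happen to agree.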
Oracle Selection Guide
Choose Assertion-based when:
- Expected output is easily computable
- Testing specific known scenarios
- Regression testing with saved examples
Choose Property-based when:
- Expected output is hard to compute but properties are clear
- Want to test many random inputs
- Testing invariants that should always hold
Choose Differential when:
- Refactoring or optimizing existing code
- Migrating to new implementation
- Reference implementation exists
Choose Metamorphic when:
- Expected output is unknown or hard to compute
- No reference implementation available
- Want to test complex transformations
Tips
- Start simple: Begin with assertion-based oracles, add others as needed
- Combine oracles: Use multiple types for robust verification
- Test the tests: Use mutation testing to validate oracle effectiveness
- Document assumptions: Explain why properties should hold
- Handle floating point: Use tolerance for float comparisons
- Consider performance: Property-based tests run many iterations
- Focus on important properties: Not every function needs all oracle types
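On the floating-point tip, a short sketch using the standard library's `math.isclose` shows why exact equality is fragile and why a relative tolerance alone is not enough when the expected value is zero (as with a 100% discount). The variable names are illustrative.

```python
import math

total = 0.1 + 0.2

exact_equal = total == 0.3                       # False: binary rounding error
close = math.isclose(total, 0.3, rel_tol=1e-9)   # True: within relative tolerance

# Near zero, a relative tolerance is too strict; add an absolute tolerance.
near_zero_rel = math.isclose(1e-12, 0.0, rel_tol=1e-9)   # False
near_zero_abs = math.isclose(1e-12, 0.0, abs_tol=1e-9)   # True

print(exact_equal, close, near_zero_rel, near_zero_abs)
# False True False True
```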
Common Patterns
For detailed oracle patterns organized by domain, see:
- references/property_patterns.md - Common property-based oracle patterns
- references/metamorphic_patterns.md - Metamorphic relation catalog