skills/arabelatso/skills-4-se/counterexample-debugger

counterexample-debugger

SKILL.md

Counterexample-Guided Proof Debugger

Analyze counterexamples from Nitpick or QuickChick to explain proof failures and suggest corrections to specifications or proofs.

Workflow

1. Receive Counterexample Information

Identify what information is provided:

  • Counterexample output: From Nitpick or QuickChick
  • Failed theorem: The statement that couldn't be proven
  • Proof attempt: Any partial proof or tactics tried
  • Context: Definitions and lemmas involved

2. Choose Target System

Determine which proof assistant is being used:

  • Isabelle/HOL with Nitpick: Finite model finder
  • Coq with QuickChick: Property-based testing
  • Both: Provide analysis for both systems

3. Analyze the Counterexample

Examine the counterexample systematically:

Verify the counterexample:

  • Manually compute the result for the given values
  • Confirm it actually violates the theorem
  • Check if it's a genuine counterexample or tool limitation

Identify the violation:

  • Which part of the theorem fails?
  • What values cause the failure?
  • Is it an edge case or fundamental issue?

Determine the root cause:

  • Missing precondition?
  • Incorrect specification?
  • Wrong quantifier order?
  • Implementation bug?
  • Off-by-one error?
  • Type constraint issue?

4. Explain the Failure

Provide clear explanation:

What went wrong:

  • Describe why the counterexample violates the theorem
  • Show the computation step-by-step
  • Highlight the specific point of failure

Why it happened:

  • Explain the underlying cause
  • Identify the conceptual error
  • Note any common patterns (empty list, boundary values, etc.)

Impact assessment:

  • Is the theorem fundamentally wrong?
  • Does it need preconditions?
  • Is the specification incomplete?

5. Suggest Corrections

Provide actionable fixes based on the root cause:

For missing preconditions:

(* Before *)
lemma "hd xs ∈ set xs"

(* After *)
lemma "xs ≠ [] ⟹ hd xs ∈ set xs"

For incorrect specifications:

(* Before: uses < instead of <= *)
x < y && is_sorted (y :: ys)

(* After *)
x <= y && is_sorted (y :: ys)

For quantifier order:

(* Before *)
"∃y. ∀x. P x y"

(* After *)
"∀x. ∃y. P x y"

For incomplete specifications:

(* Before: only checks sortedness *)
is_sorted (sort l)

(* After: also checks permutation *)
is_sorted (sort l) && permutation l (sort l)

6. Recommend Next Steps

Guide the user on what to do:

Retest with fix:

  • Run Nitpick/QuickChick again
  • Verify no counterexample found
  • Check if fix is sufficient

Identify additional issues:

  • Are there other edge cases?
  • Do other lemmas need fixing?
  • Is the specification now complete?

Proceed with proof:

  • If no counterexample, attempt proof
  • Suggest proof strategy
  • Identify needed helper lemmas

Counterexample Analysis Patterns

Pattern 1: Empty Structures

Symptom: Counterexample is [], {}, or None

Common causes:

  • Missing non-empty precondition
  • Undefined behavior on empty input
  • Base case not handled

Fix: Add precondition or handle empty case explicitly

Pattern 2: Boundary Values

Symptom: Counterexample is 0, 1, or type limits

Common causes:

  • Off-by-one errors
  • Boundary condition not considered
  • Edge case in arithmetic

Fix: Adjust bounds or add special case handling

Pattern 3: Duplicate Elements

Symptom: Counterexample has repeated values like [0, 0]

Common causes:

  • Using < instead of
  • Assuming distinctness
  • Set vs. multiset confusion

Fix: Use appropriate comparison or add distinctness assumption

Pattern 4: Small Counterexamples

Symptom: Very small counterexample (1-2 elements)

Common causes:

  • Fundamental specification error
  • Wrong base case
  • Incorrect inductive step

Fix: Review base definitions and inductive structure

Pattern 5: Type-Specific Values

Symptom: Counterexample at type boundaries

Common causes:

  • Type constraints not considered
  • Overflow/underflow issues
  • Finite vs. infinite types

Fix: Add type constraints or adjust specification

Tool-Specific Guidance

Nitpick (Isabelle/HOL)

For detailed Nitpick usage and interpretation:

Key points:

  • Searches for finite models
  • Configurable cardinality bounds
  • May miss counterexamples beyond bounds
  • "No counterexample" ≠ proof

QuickChick (Coq)

For detailed QuickChick usage and interpretation:

Key points:

  • Random testing with shrinking
  • Configurable test count
  • May miss rare counterexamples
  • "Success" ≠ proof

Common Root Causes

Specification Errors

Symptoms:

  • Counterexample shows spec doesn't match intent
  • Multiple unrelated counterexamples
  • Spec too weak or too strong

Fixes:

  • Strengthen postconditions
  • Add completeness requirements
  • Review specification against intent

Missing Preconditions

Symptoms:

  • Counterexample is edge case
  • Empty structures or boundary values
  • Undefined behavior

Fixes:

  • Add non-empty constraints
  • Add type bounds
  • Add well-formedness conditions

Quantifier Issues

Symptoms:

  • Counterexample shows wrong order
  • Existential/universal confusion
  • Skolem constant issues

Fixes:

  • Swap quantifier order
  • Review logical structure
  • Check variable dependencies

Implementation Bugs

Symptoms:

  • Counterexample shows function doesn't work
  • Output doesn't match specification
  • Logic error in definition

Fixes:

  • Fix the implementation
  • Review algorithm correctness
  • Test implementation separately

Debugging Checklist

When analyzing a counterexample:

  1. Verify manually: Compute the result for the counterexample values
  2. Identify violation: Which part of the theorem fails?
  3. Find root cause: Why does it fail? (precondition, spec, impl, quantifiers)
  4. Suggest fix: What needs to change?
  5. Check completeness: Are there other issues?
  6. Retest: Run tool again after fix
  7. Proceed: If no counterexample, attempt proof

Examples

For complete debugging examples including:

  • Incorrect sortedness definition
  • Missing preconditions
  • Wrong quantifier order
  • Off-by-one errors
  • Incomplete specifications

See examples.md

Tips

  • Trust the counterexample: If tool finds one, investigate thoroughly
  • Verify manually: Always check the counterexample by hand
  • Look for patterns: Empty, boundary, duplicates are common
  • Fix root cause: Don't just patch symptoms
  • Retest after fixing: Ensure fix is complete
  • Use tools early: Run Nitpick/QuickChick before proving
  • Iterate: May need multiple rounds of debugging
  • Document assumptions: Make preconditions explicit
  • Check completeness: Ensure specification is sufficient
Weekly Installs
1
GitHub Stars
47
First Seen
13 days ago
Installed on
amp1
cline1
opencode1
cursor1
kimi-cli1
codex1