verified-pseudocode-extractor
Verified Pseudocode Extractor
Extract language-agnostic pseudocode from formally verified programs while preserving verified properties and algorithmic structure.
Core Principles
Semantic Faithfulness
- Preserve exact control flow from verified code
- Maintain all data dependencies
- Keep algorithmic logic intact
- Do not introduce new behavior
- Do not omit verified steps
Verification Preservation
- Mark verified properties explicitly
- Retain preconditions and postconditions
- Include loop invariants
- Note termination arguments
- Distinguish verified from unverified components
Abstraction Without Loss
- Remove language-specific syntax
- Remove proof-specific code
- Keep essential algorithmic structure
- Maintain readability
- Preserve correctness guarantees
Workflow
1. Identify Verified Components
Scan the input code for:
- Function definitions:
fun,Fixpoint,Definition - Theorems:
lemma,theorem,Theorem,Lemma - Specifications:
assumes/shows, preconditions/postconditions - Invariants: Loop invariants, inductive invariants
- Helper lemmas: Supporting correctness proofs
2. Extract Core Algorithm
For each verified function:
Preserve:
- Function signature (name, parameters, return type)
- Control flow (if/then/else, pattern matching, recursion)
- Data flow (variable assignments, computations)
- Termination structure (base cases, recursive calls)
Remove:
- Proof tactics (
by simp,auto,lia,Proof...Qed) - Type system details (type classes, implicit arguments)
- Language-specific syntax (
#vs::,@vs++) - Proof-only lemmas
- Module qualifications
3. Extract Verification Annotations
Identify and mark:
Preconditions:
PRECONDITION: condition [VERIFIED: lemma_name]
Postconditions:
POSTCONDITION: property [VERIFIED: theorem_name]
Invariants:
INVARIANT: property [VERIFIED]
Termination:
TERMINATION: argument [VERIFIED: termination_proof]
Unverified components:
[ASSUMED: assumption]
[UNVERIFIED: property]
4. Structure the Pseudocode
Use clear, structured format:
ALGORITHM: Name
VERIFIED IN: System (theory/module name)
FUNCTION name(params: Types) -> ReturnType
DESCRIPTION: Brief description
PRECONDITION: conditions [VERIFIED: source]
POSTCONDITION: guarantees [VERIFIED: source]
[Algorithm body in structured pseudocode]
TERMINATION: argument [VERIFIED]
INVARIANT: properties [VERIFIED]
VERIFIED PROPERTIES:
1. Property 1 [VERIFIED: theorem]
2. Property 2 [VERIFIED: lemma]
5. Verify Semantic Equivalence
Check that pseudocode:
- Has same control flow as original
- Performs same computations
- Maintains same data dependencies
- Preserves all verified properties
- Contains no new behavior
- Omits no essential steps
Extraction Patterns
Pattern Matching
From (Isabelle):
case xs of [] ⇒ base | (x # xs) ⇒ recursive
From (Coq):
match l with | [] => base | x :: xs => recursive end
To (Pseudocode):
MATCH list WITH
CASE []: base
CASE x :: xs: recursive
Recursion
From:
fun f :: "nat ⇒ nat" where
"f 0 = 0" |
"f (Suc n) = f n + 1"
To:
FUNCTION f(n: Nat) -> Nat
IF n = 0 THEN
RETURN 0
ELSE
RETURN f(n - 1) + 1
[VERIFIED: terminates (decreasing n)]
Preconditions/Postconditions
From (Isabelle):
lemma function_correct:
assumes "precondition x"
shows "postcondition (function x)"
To:
FUNCTION function(x: Type) -> Type
PRECONDITION: precondition(x) [VERIFIED: function_correct]
POSTCONDITION: postcondition(result) [VERIFIED: function_correct]
Dependent Types
From (Coq):
Definition safe_head (l : list A) (H : l <> []) : A
To:
FUNCTION safe_head(l: List<A>) -> A
PRECONDITION: l ≠ [] [VERIFIED: type system]
Verification Annotations
Verified Components
Mark with [VERIFIED] or [VERIFIED: source]:
PRECONDITION: is_sorted(list) [VERIFIED: sort_correct theorem]
POSTCONDITION: result ≤ all elements [VERIFIED: max_correct lemma]
TERMINATION: list size decreases [VERIFIED: structural recursion]
Unverified Components
Mark clearly:
[ASSUMED: input is well-formed]
[UNVERIFIED: time complexity O(n log n)]
[UNVERIFIED: space complexity]
Partial Verification
Be specific:
[VERIFIED: correctness]
[UNVERIFIED: termination]
Unreachable Code
Mark when preconditions make cases impossible:
CASE []:
UNREACHABLE [precondition ensures list is non-empty]
What to Preserve vs. Remove
Always Preserve
✓ Function names and signatures ✓ Control flow structure ✓ Data dependencies ✓ Algorithmic steps ✓ Verified properties ✓ Preconditions and postconditions ✓ Loop invariants ✓ Termination arguments
Always Remove
✗ Proof tactics and commands
✗ Proof scripts (proof...qed, Proof...Qed)
✗ Type class constraints (unless essential)
✗ Proof-only helper lemmas
✗ Language-specific syntax sugar
✗ Module system details
✗ Proof annotations
✗ Type inference hints
Context-Dependent
? Type annotations: Keep if essential for understanding ? Helper functions: Keep if used in algorithm, remove if proof-only ? Definitions: Keep if part of algorithm, remove if proof infrastructure
Output Format
Use structured, readable format:
ALGORITHM: [Name]
VERIFIED IN: [System] ([Module/Theory])
═══════════════════════════════════════════════════════════
FUNCTION name(params: Types) -> ReturnType
DESCRIPTION: [Brief description]
PRECONDITION: [conditions] [VERIFIED: source]
POSTCONDITION: [guarantees] [VERIFIED: source]
[Pseudocode body with clear structure]
TERMINATION: [argument] [VERIFIED]
INVARIANT: [properties] [VERIFIED]
═══════════════════════════════════════════════════════════
VERIFIED PROPERTIES:
1. [Property] [VERIFIED: theorem]
2. [Property] [VERIFIED: lemma]
...
Quality Checks
Before finalizing pseudocode:
- Completeness: All algorithmic steps included?
- Correctness: Control flow matches original?
- Clarity: Readable and understandable?
- Verification: All verified properties marked?
- Faithfulness: Semantically equivalent to original?
- No additions: No new behavior introduced?
- No omissions: No essential steps removed?
Examples
For complete extraction examples including:
- Insertion sort (Isabelle/HOL)
- Binary search (Coq)
- Safe array access (Coq with dependent types)
- GCD algorithm (Isabelle/HOL)
See examples.md
For detailed extraction patterns and rules: See extraction_patterns.md
Tips
- Read proofs carefully: Understand what's verified before extracting
- Preserve structure: Keep control flow exactly as in original
- Mark everything: Be explicit about verification status
- Stay faithful: Don't simplify away essential complexity
- Remove proofs: But keep what they prove
- Use clear syntax: Make pseudocode readable
- Add comments: Explain non-obvious logic
- Check equivalence: Verify semantic faithfulness
- Distinguish verified/unverified: Be honest about what's proven
- Keep it language-agnostic: Avoid source language idioms