c-cpp-to-lean4-translator
C/C++ to Lean4 Translator
Overview
Transform C or C++ programs into equivalent Lean4 code that preserves the original semantics while leveraging Lean4's functional programming paradigm, strong type system, and proof capabilities.
Translation Workflow
Step 1: Analyze Input Code
Understand the C/C++ program structure and semantics:
-
Identify program components:
- Functions and their signatures
- Data structures (structs, classes, arrays)
- Control flow patterns (loops, conditionals)
- Memory management (allocation, pointers)
- I/O operations
- Dependencies and includes
-
Understand semantics:
- What does the program compute?
- What are the inputs and outputs?
- Are there side effects?
- What are the invariants and preconditions?
-
Note translation challenges:
- Pointer arithmetic
- Mutable state
- Imperative loops
- Manual memory management
- Undefined behavior
Step 2: Design Lean4 Structure
Plan the Lean4 equivalent before writing code:
-
Choose appropriate types:
Intfor signed integersNatfor unsigned integers and array indicesFloatfor floating-point numbersArrayfor dynamic arraysListfor linked lists- Custom
structuretypes for structs/classes
-
Determine purity:
- Pure functions: return values directly
- Side effects: use
IOmonad - Mutable state: use
IO.ReforSTmonad
-
Plan control flow translation:
- Loops → Recursive functions
- Mutable variables → Function parameters
- Early returns → Conditional expressions
-
Handle memory:
- Stack allocation → Direct values
- Heap allocation → Automatic memory management
- Pointers → Direct values or references
Step 3: Translate Code
Follow these translation principles:
Functions
Pattern: Pure function
// C/C++
int add(int a, int b) {
return a + b;
}
-- Lean4
def add (a b : Int) : Int :=
a + b
Pattern: Function with side effects
// C/C++
void printSum(int a, int b) {
printf("%d\n", a + b);
}
-- Lean4
def printSum (a b : Int) : IO Unit :=
IO.println (a + b)
Control Flow
Pattern: If-else
// C/C++
int max(int a, int b) {
if (a > b) return a;
else return b;
}
-- Lean4
def max (a b : Int) : Int :=
if a > b then a else b
Pattern: For loop → Tail recursion
// C/C++
int sum(int n) {
int result = 0;
for (int i = 0; i < n; i++) {
result += i;
}
return result;
}
-- Lean4
def sum (n : Nat) : Nat :=
let rec loop (i acc : Nat) : Nat :=
if i >= n then acc
else loop (i + 1) (acc + i)
loop 0 0
Pattern: While loop → Recursion
// C/C++
int factorial(int n) {
int result = 1;
while (n > 1) {
result *= n;
n--;
}
return result;
}
-- Lean4
def factorial (n : Nat) : Nat :=
let rec loop (n acc : Nat) : Nat :=
if n <= 1 then acc
else loop (n - 1) (acc * n)
loop n 1
Data Structures
Pattern: Struct
// C/C++
struct Point {
int x;
int y;
};
-- Lean4
structure Point where
x : Int
y : Int
deriving Repr
Pattern: Array
// C/C++
int arr[5] = {1, 2, 3, 4, 5};
-- Lean4
def arr : Array Int := #[1, 2, 3, 4, 5]
Pointers and References
Key principle: Lean4 doesn't have raw pointers. Translate based on usage:
- Read-only pointers: Pass by value
- Output parameters: Return values (use tuples for multiple returns)
- Mutable references: Use
IO.Refor return new values
// C/C++ - Output parameter
void swap(int* a, int* b) {
int temp = *a;
*a = *b;
*b = temp;
}
-- Lean4 - Return tuple
def swap (a b : Int) : Int × Int :=
(b, a)
Step 4: Ensure Type Correctness
Lean4's type system is strict. Address common type issues:
-
Integer types:
- Use
Natfor non-negative values (array indices, counts) - Use
Intfor potentially negative values - Convert explicitly:
n.toNat,n.toInt
- Use
-
Array bounds:
- Lean4 requires proof of valid indices
- Use safe accessors:
arr.get?,arr[i]? - Or use
arr[i]!with runtime check
-
Division:
- Natural number division:
n / m(rounds down) - Integer division: use
Int.div - Handle division by zero explicitly
- Natural number division:
-
Type annotations:
- Add explicit types when inference fails
- Use
: Typefor clarity
Step 5: Test and Verify
Ensure the translated code works correctly:
-
Compile check:
lake build -
Create test cases:
#eval add 2 3 -- Should output 5 #eval factorial 5 -- Should output 120 -
Compare outputs:
- Run original C/C++ program
- Run translated Lean4 program
- Verify outputs match for same inputs
-
Handle edge cases:
- Empty arrays
- Zero values
- Negative numbers
- Boundary conditions
Step 6: Optimize and Refine
Improve the translated code:
-
Use Lean4 idioms:
- Replace manual recursion with
List.foldl,Array.foldl - Use pattern matching instead of nested if-else
- Leverage standard library functions
- Replace manual recursion with
-
Add documentation:
/-- Calculate the sum of first n natural numbers -/ def sum (n : Nat) : Nat := n * (n + 1) / 2 -
Consider performance:
- Use tail recursion for loops
- Prefer
ArrayoverListfor random access - Use
@[inline]for small functions
Common Translation Patterns
For detailed patterns, see translation_patterns.md.
Quick Reference
| C/C++ | Lean4 |
|---|---|
int x |
def x : Int |
unsigned int x |
def x : Nat |
float x |
def x : Float |
bool x |
def x : Bool |
char* str |
def str : String |
int arr[] |
def arr : Array Int |
struct S |
structure S where |
for (...) |
let rec loop ... |
while (...) |
let rec loop ... |
if (...) {...} |
if ... then ... else ... |
switch (...) |
match ... with |
return x |
x (last expression) |
void f() |
def f : IO Unit |
printf(...) |
IO.println ... |
Examples
Example 1: Simple Algorithm
C/C++ Input:
int gcd(int a, int b) {
while (b != 0) {
int temp = b;
b = a % b;
a = temp;
}
return a;
}
Lean4 Output:
def gcd (a b : Nat) : Nat :=
if b = 0 then a
else gcd b (a % b)
Example 2: Array Processing
C/C++ Input:
int findMax(int arr[], int size) {
int max = arr[0];
for (int i = 1; i < size; i++) {
if (arr[i] > max) {
max = arr[i];
}
}
return max;
}
Lean4 Output:
def findMax (arr : Array Int) : Option Int :=
if arr.isEmpty then
none
else
some (arr.foldl max arr[0]!)
Example 3: Struct with Methods
C/C++ Input:
struct Rectangle {
int width;
int height;
int area() {
return width * height;
}
int perimeter() {
return 2 * (width + height);
}
};
Lean4 Output:
structure Rectangle where
width : Nat
height : Nat
deriving Repr
def Rectangle.area (r : Rectangle) : Nat :=
r.width * r.height
def Rectangle.perimeter (r : Rectangle) : Nat :=
2 * (r.width + r.height)
Example 4: I/O Program
C/C++ Input:
#include <stdio.h>
int main() {
int n;
printf("Enter a number: ");
scanf("%d", &n);
printf("Factorial: %d\n", factorial(n));
return 0;
}
Lean4 Output:
def factorial (n : Nat) : Nat :=
if n <= 1 then 1
else n * factorial (n - 1)
def main : IO Unit := do
IO.print "Enter a number: "
let input ← IO.getStdIn >>= (·.getLine)
match input.trim.toNat? with
| some n =>
IO.println s!"Factorial: {factorial n}"
| none =>
IO.println "Invalid input"
Best Practices
- Start simple: Translate basic functions first, then build up complexity
- Preserve semantics: Ensure the Lean4 code computes the same results
- Use types wisely: Leverage Lean4's type system for correctness
- Embrace immutability: Prefer pure functions over mutable state
- Test thoroughly: Verify outputs match for various inputs
- Document assumptions: Note any semantic differences or limitations
- Leverage standard library: Use built-in functions when available
- Handle errors gracefully: Use
OptionorExceptfor error cases
Limitations and Considerations
- Undefined behavior: C/C++ undefined behavior must be handled explicitly in Lean4
- Performance: Functional code may have different performance characteristics
- Concurrency: C/C++ threading requires different approaches in Lean4
- Low-level operations: Bit manipulation and pointer arithmetic need careful translation
- External libraries: C/C++ library calls may not have direct Lean4 equivalents
- Macros: C preprocessor macros need manual translation
- Templates: C++ templates translate to Lean4 generics differently
Resources
- Translation patterns: See translation_patterns.md for comprehensive pattern catalog
- Lean4 documentation: https://lean-lang.org/documentation/
- Lean4 standard library: https://github.com/leanprover/lean4/tree/master/src/Init