NYC
skills/letta-ai/skills/pytorch-model-cli

pytorch-model-cli

SKILL.md

PyTorch Model to CLI Tool Conversion

This skill provides guidance for tasks that require converting PyTorch models into standalone command-line tools, typically implemented in C/C++ for portability and independence from Python runtime.

Task Recognition

This skill applies when the task involves:

  • Converting a PyTorch model to a standalone executable
  • Extracting model weights to a portable format (JSON, binary)
  • Implementing neural network inference in C/C++
  • Creating CLI tools that perform image classification or prediction
  • Building inference tools using libraries like cJSON and lodepng

Recommended Approach

Phase 1: Environment Analysis

Before writing any code, thoroughly analyze the available resources:

  1. Identify the model architecture

    • Read the model definition file (e.g., model.py) completely
    • Document all layer types, dimensions, and activation functions
    • Note any default parameters (hidden dimensions, number of classes)
  2. Examine available libraries

    • Check for image loading libraries (lodepng, stb_image)
    • Check for JSON parsing libraries (cJSON, nlohmann/json)
    • Identify compilation requirements (headers, source files)
  3. Understand input requirements

    • Determine expected image dimensions (e.g., 28x28 for MNIST)
    • Identify color format (grayscale, RGB, RGBA)
    • Document normalization requirements (divide by 255, mean/std normalization)
  4. Verify preprocessing pipeline

    • If training code is available, examine data transformations
    • Match inference preprocessing exactly to training preprocessing
    • Common transformations: resize, grayscale conversion, normalization

Phase 2: Weight Extraction

Extract model weights from PyTorch format to a portable format:

  1. Load the model checkpoint

    import torch
    import json
    
    # Load state dict
    state_dict = torch.load('model.pth', map_location='cpu')
    
  2. Convert tensors to lists

    weights = {}
    for key, tensor in state_dict.items():
        weights[key] = tensor.numpy().tolist()
    
  3. Save to JSON

    with open('weights.json', 'w') as f:
        json.dump(weights, f)
    
  4. Verify extraction

    • Check that all expected layer weights are present
    • Verify dimensions match the model architecture
    • For a model with layers fc1, fc2, fc3: expect fc1.weight, fc1.bias, etc.

Phase 3: Reference Implementation

Before implementing in C/C++, create a reference output:

  1. Run inference in PyTorch

    model.eval()
    with torch.no_grad():
        output = model(input_tensor)
        prediction = output.argmax().item()
    
  2. Save reference outputs

    • Store intermediate layer outputs for debugging
    • Record the final prediction for verification
    • This allows validating the C/C++ implementation

Phase 4: C/C++ Implementation

Implement the inference logic in C/C++:

  1. Image loading and preprocessing

    • Load image using the available library (lodepng for PNG)
    • Handle color channel conversion (RGBA to grayscale if needed)
    • Apply normalization (typically divide by 255.0)
    • Flatten to 1D array in correct order (row-major)
  2. Weight loading

    • Parse JSON file containing weights
    • Store weights in appropriate data structures
    • Verify dimensions during loading
  3. Forward pass implementation

    • Implement matrix-vector multiplication for linear layers
    • Implement activation functions (ReLU, softmax, etc.)
    • Process layers in correct order
  4. Output handling

    • Find argmax for classification tasks
    • Write prediction to output file
    • Ensure only prediction goes to stdout (not progress/debug info)

Phase 5: Compilation and Testing

  1. Compile with appropriate flags

    g++ -o cli_tool main.cpp lodepng.cpp cJSON.c -std=c++11 -lm
    
    • Double-check flag syntax (avoid concatenation errors like -std=c++11-lm)
  2. Test against reference

    • Run the CLI tool on the same input used for reference
    • Compare output to PyTorch reference
    • Debug any discrepancies by checking intermediate values

Verification Strategies

Before Implementation

  • Model architecture fully documented
  • All layer dimensions verified
  • Preprocessing requirements identified
  • Reference output generated from PyTorch

After Weight Extraction

  • All expected keys present in JSON
  • Weight dimensions match architecture
  • Bias terms included for all layers

After C/C++ Implementation

  • Compilation succeeds without warnings
  • Output matches PyTorch reference exactly
  • CLI tool handles missing files gracefully
  • Only prediction output goes to stdout

Final Validation

  • All test cases pass
  • Memory properly managed (no leaks)
  • Error messages go to stderr, not stdout

Common Pitfalls

Weight Extraction

  • Forgetting to use map_location='cpu' when loading on CPU-only systems
  • Missing bias terms - ensure both weights and biases are extracted
  • Incorrect tensor ordering - PyTorch uses different conventions than some C libraries

Preprocessing Mismatches

  • Wrong normalization - training might use mean/std normalization, not just /255
  • Color channel issues - PNG might be RGBA while model expects grayscale
  • Dimension ordering - ensure row-major vs column-major consistency

C/C++ Implementation

  • Matrix multiplication order - verify (input × weights^T) vs (weights × input)
  • Activation function placement - apply after linear layer, before next layer
  • Integer vs float division - use 255.0, not 255, for normalization

Compilation Issues

  • Flag concatenation - ensure spaces between compiler flags
  • Missing libraries - include all required source files (lodepng.cpp, cJSON.c)
  • Header dependencies - verify all headers are in include path

Output Handling

  • Verbose library output - suppress or redirect debug/progress output
  • Newline handling - ensure consistent line endings in output files
  • Buffering issues - flush stdout before program exit

Efficiency Guidelines

  • Avoid repeatedly checking package managers; identify available tools first
  • Create reference outputs early to catch implementation bugs quickly
  • Review complete code before compilation attempts
  • Minimize status-only updates; batch related operations
  • Test with multiple inputs when possible, not just the provided test case
Weekly Installs
19
Repository
letta-ai/skills
First Seen
Jan 24, 2026
Installed on
claude-code14
codex13
gemini-cli13
opencode13
antigravity12
windsurf10