# pytorch-model-cli

**PyTorch Model to CLI Tool Conversion**

This skill provides guidance for tasks that require converting PyTorch models into standalone command-line tools, typically implemented in C/C++ for portability and independence from the Python runtime.
## Task Recognition
This skill applies when the task involves:
- Converting a PyTorch model to a standalone executable
- Extracting model weights to a portable format (JSON, binary)
- Implementing neural network inference in C/C++
- Creating CLI tools that perform image classification or prediction
- Building inference tools using libraries like cJSON and lodepng
## Recommended Approach

### Phase 1: Environment Analysis
Before writing any code, thoroughly analyze the available resources:
- **Identify the model architecture**
  - Read the model definition file (e.g., `model.py`) completely
  - Document all layer types, dimensions, and activation functions
  - Note any default parameters (hidden dimensions, number of classes)

- **Examine available libraries**
  - Check for image loading libraries (lodepng, stb_image)
  - Check for JSON parsing libraries (cJSON, nlohmann/json)
  - Identify compilation requirements (headers, source files)

- **Understand input requirements**
  - Determine expected image dimensions (e.g., 28x28 for MNIST)
  - Identify color format (grayscale, RGB, RGBA)
  - Document normalization requirements (divide by 255, mean/std normalization)

- **Verify the preprocessing pipeline**
  - If training code is available, examine data transformations
  - Match inference preprocessing exactly to training preprocessing
  - Common transformations: resize, grayscale conversion, normalization
### Phase 2: Weight Extraction
Extract model weights from PyTorch format to a portable format:
- **Load the model checkpoint**

```python
import torch
import json

# Load state dict
state_dict = torch.load('model.pth', map_location='cpu')
```

- **Convert tensors to lists**

```python
weights = {}
for key, tensor in state_dict.items():
    weights[key] = tensor.numpy().tolist()
```

- **Save to JSON**

```python
with open('weights.json', 'w') as f:
    json.dump(weights, f)
```

- **Verify extraction**
  - Check that all expected layer weights are present
  - Verify dimensions match the model architecture
  - For a model with layers fc1, fc2, fc3: expect fc1.weight, fc1.bias, etc.
### Phase 3: Reference Implementation
Before implementing in C/C++, create a reference output:
- **Run inference in PyTorch**

```python
model.eval()
with torch.no_grad():
    output = model(input_tensor)
prediction = output.argmax().item()
```

- **Save reference outputs**
  - Store intermediate layer outputs for debugging
  - Record the final prediction for verification
  - This allows validating the C/C++ implementation
### Phase 4: C/C++ Implementation
Implement the inference logic in C/C++:
- **Image loading and preprocessing** (see the sketch below)
  - Load image using the available library (lodepng for PNG)
  - Handle color channel conversion (RGBA to grayscale if needed)
  - Apply normalization (typically divide by 255.0)
  - Flatten to 1D array in correct order (row-major)
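A minimal sketch of this step, assuming a 28x28 grayscale model (MNIST-style) and an RGBA PNG decoded with lodepng; the plain RGB average and the helper name are illustrative and should be adapted to match the training pipeline's grayscale conversion.

```cpp
// Minimal sketch: decode a PNG with lodepng, convert RGBA to grayscale, normalize to [0, 1].
// The simple RGB average below is an assumption; match the training preprocessing.
#include <cstdio>
#include <cstdlib>
#include <vector>

#include "lodepng.h"

// Fills `pixels` with width*height floats in [0, 1], row-major. Returns false on failure.
static bool load_grayscale_image(const char* path, unsigned expected_w, unsigned expected_h,
                                 std::vector<float>& pixels) {
    unsigned char* rgba = nullptr;
    unsigned w = 0, h = 0;
    unsigned err = lodepng_decode32_file(&rgba, &w, &h, path);  // decodes to 8-bit RGBA
    if (err) {
        std::fprintf(stderr, "PNG decode failed: %s\n", lodepng_error_text(err));
        return false;
    }
    if (w != expected_w || h != expected_h) {
        std::fprintf(stderr, "unexpected image size %ux%u\n", w, h);
        std::free(rgba);
        return false;
    }
    pixels.resize(static_cast<std::size_t>(w) * h);
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        // Average R, G, B and ignore alpha; adjust if training used a weighted formula.
        float gray = (rgba[4 * i] + rgba[4 * i + 1] + rgba[4 * i + 2]) / 3.0f;
        pixels[i] = gray / 255.0f;  // 255.0f, not 255: avoid integer division
    }
    std::free(rgba);
    return true;
}
```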
- **Weight loading** (see the sketch below)
  - Parse JSON file containing weights
  - Store weights in appropriate data structures
  - Verify dimensions during loading
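A sketch of loading one weight matrix with cJSON, assuming the weights.json layout produced in Phase 2 (each `*.weight` stored as a list of rows, each `*.bias` as a flat list); key names such as `fc1.weight` and the helper name are illustrative.

```cpp
// Minimal sketch: read a 2D weight matrix from weights.json into a flat row-major vector.
#include <cstddef>
#include <cstdio>
#include <vector>

#include "cJSON.h"

// Returns false if the key is missing, not an array, or has ragged rows.
static bool read_matrix(const cJSON* root, const char* key,
                        std::vector<float>& out, int& rows, int& cols) {
    const cJSON* mat = cJSON_GetObjectItemCaseSensitive(root, key);
    if (!cJSON_IsArray(mat)) {
        std::fprintf(stderr, "missing or non-array key: %s\n", key);
        return false;
    }
    rows = cJSON_GetArraySize(mat);
    cols = rows > 0 ? cJSON_GetArraySize(cJSON_GetArrayItem(mat, 0)) : 0;
    out.clear();
    out.reserve(static_cast<std::size_t>(rows) * cols);
    for (int r = 0; r < rows; ++r) {
        const cJSON* row = cJSON_GetArrayItem(mat, r);
        if (cJSON_GetArraySize(row) != cols) return false;  // dimension mismatch
        for (int c = 0; c < cols; ++c)
            out.push_back(static_cast<float>(cJSON_GetArrayItem(row, c)->valuedouble));
    }
    return true;
}

// Usage outline (error handling trimmed): read the whole file into a string `text`, then
//   cJSON* root = cJSON_Parse(text.c_str());
//   std::vector<float> w1; int rows = 0, cols = 0;
//   read_matrix(root, "fc1.weight", w1, rows, cols);  // expect rows x cols == out x in
//   cJSON_Delete(root);
```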
- **Forward pass implementation** (see the sketch below)
  - Implement matrix-vector multiplication for linear layers
  - Implement activation functions (ReLU, softmax, etc.)
  - Process layers in correct order
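A sketch of the core math, assuming each weight matrix was flattened row-major from PyTorch's `nn.Linear` layout `[out_features, in_features]`, so a layer computes y = W·x + b; the function names are illustrative.

```cpp
// Minimal sketch of a linear layer, ReLU, and softmax. Assumes W is flat row-major
// with shape [out_dim][in_dim] (PyTorch nn.Linear layout), so y = W * x + b, not x * W.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

static std::vector<float> linear(const std::vector<float>& W, const std::vector<float>& b,
                                 const std::vector<float>& x, int out_dim, int in_dim) {
    std::vector<float> y(out_dim);
    for (int o = 0; o < out_dim; ++o) {
        float acc = b[o];
        for (int i = 0; i < in_dim; ++i)
            acc += W[static_cast<std::size_t>(o) * in_dim + i] * x[i];
        y[o] = acc;
    }
    return y;
}

static void relu_inplace(std::vector<float>& v) {
    for (float& x : v) x = std::max(0.0f, x);
}

static void softmax_inplace(std::vector<float>& v) {
    float m = *std::max_element(v.begin(), v.end());  // subtract max for numerical stability
    float sum = 0.0f;
    for (float& x : v) { x = std::exp(x - m); sum += x; }
    for (float& x : v) x /= sum;
}

// Hypothetical 3-layer MLP: h1 = ReLU(fc1(x)), h2 = ReLU(fc2(h1)), logits = fc3(h2).
// Softmax is optional when only the argmax is needed.
```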
- **Output handling** (see the sketch below)
  - Find argmax for classification tasks
  - Write prediction to output file
  - Ensure only prediction goes to stdout (not progress/debug info)
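A small sketch of the output contract, keeping diagnostics on stderr and only the prediction on stdout; the variable names and file path are hypothetical.

```cpp
// Minimal sketch: only the predicted class index is written to stdout;
// diagnostics go to stderr so they do not pollute the tool's output.
#include <algorithm>
#include <cstdio>
#include <vector>

static int argmax(const std::vector<float>& logits) {
    return static_cast<int>(std::max_element(logits.begin(), logits.end()) - logits.begin());
}

// In main(), after the forward pass (hypothetical names):
//   std::fprintf(stderr, "forward pass done\n");   // progress/debug -> stderr
//   std::printf("%d\n", argmax(logits));           // prediction only -> stdout
//   std::fflush(stdout);                           // avoid losing buffered output on exit
//   FILE* out = std::fopen("prediction.txt", "w"); // hypothetical path, if a file is required
//   if (out) { std::fprintf(out, "%d\n", argmax(logits)); std::fclose(out); }
```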
### Phase 5: Compilation and Testing
- **Compile with appropriate flags**

```sh
g++ -o cli_tool main.cpp lodepng.cpp cJSON.c -std=c++11 -lm
```

- Double-check flag syntax (avoid concatenation errors like `-std=c++11-lm`)
- **Test against reference**
  - Run the CLI tool on the same input used for reference
  - Compare output to PyTorch reference
  - Debug any discrepancies by checking intermediate values (see the sketch below)
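One way to debug such discrepancies is a small helper (illustrative) that prints intermediate activations to stderr for comparison against the values saved from the PyTorch reference run.

```cpp
// Minimal sketch: print the first few values of an intermediate activation to stderr
// so they can be diffed against the reference outputs saved in Phase 3.
#include <cstddef>
#include <cstdio>
#include <vector>

static void dump_vector(const char* name, const std::vector<float>& v, std::size_t max_n = 8) {
    std::fprintf(stderr, "%s:", name);
    for (std::size_t i = 0; i < v.size() && i < max_n; ++i)
        std::fprintf(stderr, " %.6f", v[i]);
    std::fprintf(stderr, "\n");
}

// Example: dump_vector("fc1 output", h1); then compare against the reference values.
```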
## Verification Strategies

### Before Implementation
- Model architecture fully documented
- All layer dimensions verified
- Preprocessing requirements identified
- Reference output generated from PyTorch
### After Weight Extraction
- All expected keys present in JSON
- Weight dimensions match architecture
- Bias terms included for all layers
### After C/C++ Implementation
- Compilation succeeds without warnings
- Output matches PyTorch reference exactly
- CLI tool handles missing files gracefully
- Only prediction output goes to stdout
### Final Validation
- All test cases pass
- Memory properly managed (no leaks)
- Error messages go to stderr, not stdout
## Common Pitfalls

### Weight Extraction
- Forgetting to use `map_location='cpu'` when loading on CPU-only systems
- Missing bias terms - ensure both weights and biases are extracted
- Incorrect tensor ordering - PyTorch stores `nn.Linear` weights as `[out_features, in_features]`, which may differ from the convention a C library expects
### Preprocessing Mismatches
- Wrong normalization - training might use mean/std normalization, not just /255
- Color channel issues - PNG might be RGBA while model expects grayscale
- Dimension ordering - ensure row-major vs column-major consistency
### C/C++ Implementation
- Matrix multiplication order - verify (input × weights^T) vs (weights × input)
- Activation function placement - apply after linear layer, before next layer
- Integer vs float division - use 255.0, not 255, for normalization
### Compilation Issues
- Flag concatenation - ensure spaces between compiler flags
- Missing libraries - include all required source files (lodepng.cpp, cJSON.c)
- Header dependencies - verify all headers are in include path
### Output Handling
- Verbose library output - suppress or redirect debug/progress output
- Newline handling - ensure consistent line endings in output files
- Buffering issues - flush stdout before program exit
## Efficiency Guidelines
- Avoid repeatedly checking package managers; identify available tools first
- Create reference outputs early to catch implementation bugs quickly
- Review complete code before compilation attempts
- Minimize status-only updates; batch related operations
- Test with multiple inputs when possible, not just the provided test case