SmolVLM - Local Image Analysis
Analyze images locally with SmolVLM-2B, a compact vision-language model that runs on Apple Silicon through mlx-vlm.
Quick Usage
Describe an Image
python ~/.claude/skills/smolvlm/scripts/view_image.py /path/to/image.png
Ask a Question About an Image
python ~/.claude/skills/smolvlm/scripts/view_image.py /path/to/image.png "What text is visible?"
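The two commands above can also be driven from Python. The following is a minimal sketch, not part of the skill itself: `build_command` and `ask` are hypothetical helper names, and the sketch assumes that the script prints its answer to stdout and that the current interpreter (`sys.executable`) is the one with mlx-vlm installed.

```python
import subprocess
import sys
from pathlib import Path

# Script location taken from the commands above.
SCRIPT = Path.home() / ".claude/skills/smolvlm/scripts/view_image.py"


def build_command(image, question=None):
    """Return the argv list for one view_image.py call (hypothetical helper)."""
    cmd = [sys.executable, str(SCRIPT), str(image)]
    if question:
        # The question is one positional argument; passing a list to
        # subprocess avoids any shell quoting issues.
        cmd.append(question)
    return cmd


def ask(image, question=None):
    """Run the script and return its stdout (assumed to hold the answer)."""
    result = subprocess.run(
        build_command(image, question),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Calling `ask("/path/to/image.png")` would correspond to the plain description command, and `ask("/path/to/image.png", "What text is visible?")` to the question form.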
Specific Tasks
# Extract text (OCR)
python ~/.claude/skills/smolvlm/scripts/view_image.py screenshot.png "Extract all text"
# UI analysis
python ~/.claude/skills/smolvlm/scripts/view_image.py ui.png "Describe the UI elements"
# Detailed description
python ~/.claude/skills/smolvlm/scripts/view_image.py photo.jpg --detailed
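If the same tasks are run repeatedly, they could be collected into named presets. This is a sketch under stated assumptions: the preset names (`ocr`, `ui`, `detailed`) and the `task_command` helper are invented here for illustration, while the prompt strings and the `--detailed` flag are the ones shown in the commands above.

```python
from pathlib import Path

# Script location taken from the commands above.
SCRIPT = Path.home() / ".claude/skills/smolvlm/scripts/view_image.py"

# Hypothetical preset names mapped to the extra arguments shown above.
TASKS = {
    "ocr": ["Extract all text"],
    "ui": ["Describe the UI elements"],
    "detailed": ["--detailed"],
}


def task_command(task, image):
    """Return the argv list for a named task (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASKS)}")
    return ["python", str(SCRIPT), str(image), *TASKS[task]]
```

The returned list can be passed directly to `subprocess.run`, for example `subprocess.run(task_command("ocr", "screenshot.png"))`.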
Effective Prompts
General Description
- "Describe this image" - Basic description
- "Describe this image in detail, including colors, composition, and any text" - Comprehensive
Text Extraction (OCR)
- "Extract all visible text from this image"
- "What text appears in this screenshot?"
- "Read the text in this document"
UI/Screenshot Analysis
- "Describe the user interface elements"
- "What buttons and controls are visible?"
- "Identify the application and its current state"
Visual Question Answering
- "How many [objects] are in this image?"
- "What color is the [object]?"
- "Is there a [object] in this image?"
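The bracketed placeholders in the Visual Question Answering prompts can be filled in programmatically. A minimal sketch follows; the template keys (`count`, `color`, `presence`) and the `vqa_prompt` function are hypothetical names chosen for this example, and the wording is copied from the prompts above.

```python
# Templates copied from the prompts above, with the bracketed
# placeholders turned into str.format fields.
VQA_TEMPLATES = {
    "count": "How many {target} are in this image?",
    "color": "What color is the {target}?",
    "presence": "Is there a {target} in this image?",
}


def vqa_prompt(kind, target):
    """Fill one template with the object of interest (hypothetical helper)."""
    target = target.strip()
    if not target:
        raise ValueError("target must not be empty")
    return VQA_TEMPLATES[kind].format(target=target)
```

For example, `vqa_prompt("count", "cars")` yields "How many cars are in this image?", which can then be passed to the script as the question argument.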
Code/Technical
- "What programming language is shown?"
- "Describe what this code does"
- "Identify any errors in this code screenshot"
Model Details
| Spec | Value |
|---|---|
| Model | SmolVLM-2B-Instruct |
| Size | ~4GB |
| Peak Memory | 5.8GB |
| Speed | ~94 tok/s (M-series) |
| Supported Formats | PNG, JPG, JPEG, GIF, WebP |
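Before sending a file to the model, it may help to check it against the supported formats listed in the table. The sketch below is illustrative only: `is_supported` is a hypothetical helper, and it checks the file extension rather than the actual file contents.

```python
from pathlib import Path

# Extensions derived from the "Supported Formats" row of the table above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def is_supported(path):
    """Return True if the file extension is in the supported list.

    This looks at the name only; a mislabeled file would still pass.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The comparison is case-insensitive, so `photo.JPG` passes while `scan.tiff` does not.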
Requirements
- macOS with Apple Silicon (M1 or later)
- Python 3.10+
- mlx-vlm package:
uv pip install mlx-vlm --system
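The requirements above can be verified with a short preflight check. This is a sketch, not something shipped with the skill: `check_requirements` is a hypothetical function, and it assumes that a native Apple Silicon Python reports `arm64` from `platform.machine()` (a Python running under Rosetta would report `x86_64` instead). The import name `mlx_vlm` is the module installed by the mlx-vlm package.

```python
import importlib.util
import platform
import sys


def check_requirements(system=None, machine=None, version=None, has_mlx_vlm=None):
    """Return a dict mapping each requirement to True or False.

    Arguments default to values detected on the current machine; they can
    be overridden, which makes the function easy to test.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    if has_mlx_vlm is None:
        # find_spec returns None when the top-level module is not installed.
        has_mlx_vlm = importlib.util.find_spec("mlx_vlm") is not None
    return {
        "macOS": system == "Darwin",
        "Apple Silicon": machine == "arm64",
        "Python 3.10+": version >= (3, 10),
        "mlx-vlm installed": bool(has_mlx_vlm),
    }
```

Printing the items of `check_requirements()` gives a quick view of which requirement, if any, is not met.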
Troubleshooting
"Model not found": First run downloads the model (~4GB). Wait for completion.
Out of memory: Close other applications. Model needs ~6GB free RAM.
Slow first inference: Model loading takes 10-15s on first use; subsequent calls are faster.
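For the out-of-memory case, a rough check of installed RAM can indicate whether the machine has headroom at all. The sketch below makes several assumptions: it reads total physical memory through `os.sysconf`, not currently free memory, so it cannot tell whether other applications are using it; the function names are hypothetical; and the "tight" threshold of twice the requirement is an arbitrary choice for illustration.

```python
import os

REQUIRED_GB = 6.0  # "~6GB free RAM" from the note above


def total_ram_gb():
    """Total physical RAM in GB, or None if sysconf does not expose it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        return None
    return pages * page_size / 1024**3


def memory_verdict(total_gb, required_gb=REQUIRED_GB):
    """Classify installed RAM against the stated requirement."""
    if total_gb is None:
        return "unknown"
    if total_gb < required_gb:
        return "insufficient"
    if total_gb < 2 * required_gb:
        # Arbitrary margin: the model fits, but other apps may need closing.
        return "tight"
    return "ok"
```

A result of "tight" from `memory_verdict(total_ram_gb())` would suggest following the advice above and closing other applications before running the model.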
Repository
tdimino/claude-…e-minoan
First Seen
Feb 21, 2026