image-analysis
Image Analysis
Analyze local images using the project's configured LLM (via litellm). Works with any vision-capable model (GPT-4o, Claude 3, Gemini, etc.).
Quick start
# Analyze an image with a question
python3 scripts/analyze_image.py --image "/path/to/image.png" --prompt "Describe what you see in the image"
# Use a specific model (overrides project config)
python3 scripts/analyze_image.py --image "/path/to/photo.jpg" --prompt "What text is visible?" --model "openai/gpt-4o"
# Basic mode: extract image metadata without LLM (no API key needed)
python3 scripts/analyze_image.py --image "/path/to/image.png" --basic
# Increase output length and timeout
python3 scripts/analyze_image.py --image "/path/to/diagram.png" --prompt "Explain this diagram" --max-tokens 4096 --timeout 120
Options
| Flag | Description | Default |
|---|---|---|
--image |
Path to local image file (required) | — |
--prompt |
Question or instruction for the image | "Describe this image in detail" |
--model |
Override model id (e.g. openai/gpt-4o) |
project config |
--max-tokens |
Max output tokens | 2048 |
--timeout |
HTTP timeout in seconds | 60 |
--basic |
Extract image metadata only (no LLM needed) | off |
Model configuration
The script reads model/API configuration from the project's config (middleware/config). Ensure your configured model supports vision (multimodal) input.
Override with --model to use a specific model for this call.
Basic mode
When --basic is used (or when no LLM is configured), the script uses Pillow to extract:
- Image format, size, color mode
- EXIF metadata (camera, date, GPS if available)
- Color statistics (dominant colors, histogram)
Supported formats
PNG, JPEG, GIF, WebP, BMP, TIFF, and other common image formats.
More from memento-teams/memento-skills
filesystem
Direct filesystem operations (read, write, edit, list, search files). Use for any file manipulation tasks.
12docx
Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of \"Word doc\", \"word document\", \".docx\", or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a \"report\", \"memo\", \"letter\", \"template\", or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation.
9web-search
Web search and content fetching. Use when the user needs to search the web for information or fetch content from URLs.
9skill-creator
Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
9pptx
Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in an email or summary); editing, modifying, or updating existing presentations; combining or splitting slide files; working with templates, layouts, speaker notes, or comments. Trigger whenever the user mentions \"deck,\" \"slides,\" \"presentation,\" or references a .pptx filename, regardless of what they plan to do with the content afterward. If a .pptx file needs to be opened, created, or touched, use this skill.
8pdf
Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.
8