fal-vision
Analyze and understand images using fal.ai vision models — segmentation, detection, OCR, captioning, and visual QA.
Scripts
| Script | Purpose |
|---|---|
| `analyze.sh` | Analyze an image (segment, detect, OCR, describe, QA) |
Usage
Segment Objects
./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation segment --query "the red car"
Detect Objects
./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation detect
Extract Text (OCR)
./scripts/analyze.sh --image-url "https://example.com/document.jpg" --operation ocr
Describe Image
./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation describe
Visual QA
./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation qa --query "How many people are in this image?"
Arguments
| Argument | Description | Required |
|---|---|---|
| `--image-url` | URL of image to analyze | Yes |
| `--operation` | `segment`, `detect`, `ocr`, `describe`, `qa` | Yes |
| `--query` / `-q` | Text prompt for segment/qa operations | For segment/qa |
| `--model` / `-m` | Override model endpoint | No |
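
The argument rules above can be checked before calling the script. The following is a minimal sketch, not part of fal-vision: `needs_query` and `build_analyze_cmd` are hypothetical helper names, and the sketch only assumes what the table states (which operations exist and which ones require `--query`). It prints the command it would run, one argument per line, rather than executing it.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of fal-vision): returns 0 if the operation
# requires --query according to the Arguments table, 1 if it does not,
# and 2 if the operation is not one of the five listed.
needs_query() {
  case "$1" in
    segment|qa) return 0 ;;
    detect|ocr|describe) return 1 ;;
    *) echo "unknown operation: $1" >&2; return 2 ;;
  esac
}

# Hypothetical wrapper: validates the arguments and prints the analyze.sh
# command line, one argument per line. It does not execute anything.
build_analyze_cmd() {
  local image_url="$1" operation="$2" query="${3:-}"
  local rc=0
  needs_query "$operation" || rc=$?
  if [ "$rc" -eq 2 ]; then
    return 2
  fi
  if [ "$rc" -eq 0 ] && [ -z "$query" ]; then
    echo "--query is required for $operation" >&2
    return 1
  fi
  local cmd=(./scripts/analyze.sh --image-url "$image_url" --operation "$operation")
  # Only pass --query to the operations documented as using it.
  if [ "$rc" -eq 0 ]; then
    cmd+=(--query "$query")
  fi
  printf '%s\n' "${cmd[@]}"
}
```

For example, `build_analyze_cmd "https://example.com/photo.jpg" qa "How many people are in this image?"` prints the arguments of the Visual QA command shown under Usage, while the same call without a query fails with an error on stderr.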
Finding Models
To discover the best and latest vision/analysis models, use the search API:
# Search for segmentation models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "segmentation"
# Search for object detection models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "object detection"
# Search for OCR models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "ocr"
# Search for image captioning / visual QA models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "caption"
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "visual question"
Or use the search_models MCP tool with keywords like "segmentation", "detection", "ocr", "caption", "vision".
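
The searches above can also be run in one pass. This is a small sketch under stated assumptions: `search_vision_models`, `SEARCH_SCRIPT`, and `DRY_RUN` are hypothetical names introduced here, and the script path is the one used in the commands above. Only the `--query` flag shown in those commands is used.

```shell
#!/usr/bin/env bash

# Path taken from the commands above; override SEARCH_SCRIPT if the
# fal-generate skill is installed somewhere else.
SEARCH_SCRIPT="${SEARCH_SCRIPT:-/mnt/skills/user/fal-generate/scripts/search-models.sh}"

# Hypothetical helper: runs one model search per keyword argument.
# With DRY_RUN=1 it prints each command instead of executing it.
search_vision_models() {
  local keyword
  for keyword in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'bash %s --query "%s"\n' "$SEARCH_SCRIPT" "$keyword"
    else
      bash "$SEARCH_SCRIPT" --query "$keyword"
    fi
  done
}
```

A call such as `search_vision_models "segmentation" "object detection" "ocr" "caption" "visual question"` covers the same keywords as the individual commands. Setting `DRY_RUN=1` first lets you inspect the commands before running them.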