qwencloud-vision
Agent setup: If your agent doesn't auto-load skills (e.g. Claude Code), see agent-compatibility.md once per session.
Qwen Vision (Image & Video Understanding)
Analyze images and videos using Qwen VL and QVQ models. This skill is part of qwencloud/qwencloud-ai.
Skill directory
Use the files in this skill's directory to run tasks and to look up details. Load the reference files only on demand, when the default path fails or you need more detail.
| Location | Purpose |
|---|---|
| scripts/analyze.py | Image/video understanding, multi-image, thinking mode |
| scripts/reason.py | Visual reasoning (QVQ, chain-of-thought, streaming) |
| scripts/ocr.py | OCR text extraction |
| scripts/vision_lib.py | Shared helpers (base64, upload, streaming) |
| references/execution-guide.md | Fallback: curl, code generation |
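
The shared helpers are described as covering base64 handling. As a rough illustration of what that usually involves, here is a minimal sketch of encoding a local image as a base64 data URL. The function name `to_data_url` is hypothetical and is not taken from scripts/vision_lib.py, whose actual interface may differ.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Hypothetical helper: encode a local file as a base64 data URL.

    This is an assumption for illustration, not the API of vision_lib.py.
    """
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "application/octet-stream"
    # Read the raw bytes and base64-encode them as ASCII text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

A data URL like this lets a local image be embedded directly in a request body without being uploaded separately. Whether the skill's scripts take this route or upload the file instead is not stated here.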