add-image-vision
Image Vision Skill
Adds the ability for NanoClaw agents to see and understand images sent via WhatsApp. Images are downloaded, resized with sharp, saved to the group workspace, and passed to the agent as base64-encoded multimodal content blocks.
Phase 1: Pre-flight
- Check if
src/image.tsexists — skip to Phase 3 if already applied - Confirm
sharpis installable (native bindings require build tools)
Prerequisite: WhatsApp must be installed first (skill/whatsapp merged). This skill modifies WhatsApp channel files.
Phase 2: Apply Code Changes
Ensure WhatsApp fork remote
git remote -v
If whatsapp is missing, add it:
git remote add whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git
Merge the skill branch
git fetch whatsapp skill/image-vision
git merge whatsapp/skill/image-vision || {
git checkout --theirs package-lock.json
git add package-lock.json
git merge --continue
}
This merges in:
src/image.ts(image download, resize via sharp, base64 encoding)src/image.test.ts(8 unit tests)- Image attachment handling in
src/channels/whatsapp.ts - Image passing to agent in
src/index.tsandsrc/container-runner.ts - Image content block support in
container/agent-runner/src/index.ts sharpnpm dependency inpackage.json
If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.
Validate code changes
npm install
npm run build
npx vitest run src/image.test.ts
All tests must pass and build must be clean before proceeding.
Phase 3: Configure
-
Rebuild the container (agent-runner changes need a rebuild):
./container/build.sh -
Sync agent-runner source to group caches:
for dir in data/sessions/*/agent-runner-src/; do cp container/agent-runner/src/*.ts "$dir" done -
Restart the service:
launchctl kickstart -k gui/$(id -u)/com.nanoclaw
Phase 4: Verify
- Send an image in a registered WhatsApp group
- Check the agent responds with understanding of the image content
- Check logs for "Processed image attachment":
tail -50 groups/*/logs/container-*.log
Troubleshooting
- "Image - download failed": Check WhatsApp connection stability. The download may timeout on slow connections.
- "Image - processing failed": Sharp may not be installed correctly. Run
npm ls sharpto verify. - Agent doesn't mention image content: Check container logs for "Loaded image" messages. If missing, ensure agent-runner source was synced to group caches.
More from qwibitai/nanoclaw
debug
Debug container agent issues. Use when things aren't working, container fails, authentication problems, or to understand how the container system works. Covers logs, environment variables, mounts, and common issues.
20add-whatsapp
Add WhatsApp channel via native Baileys adapter. Direct connection — no Chat SDK bridge. Uses QR code or pairing code for authentication.
12add-telegram
Add Telegram channel integration via Chat SDK.
12update-nanoclaw
Efficiently bring upstream NanoClaw updates into a customized install, with preview, selective cherry-pick, and low token usage.
11customize
Add new capabilities or modify NanoClaw behavior. Use when user wants to add channels (Telegram, Slack, email input), change triggers, add integrations, modify the router, or make any other customizations. This is an interactive skill that asks questions to understand what the user wants.
10qodo-pr-resolver
Review and resolve PR issues with Qodo - get AI-powered code review issues and fix them interactively (GitHub, GitLab, Bitbucket, Azure DevOps)
10