beaver-image-gen
Image Generation (AI SDK)
Official API-based image generation. Supports the OpenAI, Google, DashScope (Alibaba Tongyi Wanxiang), and Replicate providers.
Script Directory
Agent Execution:
- SKILL_DIR = this SKILL.md file's directory
- Script path = ${SKILL_DIR}/scripts/main.ts
Step 0: Load Preferences ⛔ BLOCKING
CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → user):
test -f .beaver-skill/beaver-image-gen/EXTEND.md && echo "project"
test -f "$HOME/.beaver-skill/beaver-image-gen/EXTEND.md" && echo "user"
| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (below) → Save EXTEND.md → Then continue |
CRITICAL: If not found, complete the full setup using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
| Path | Location |
|---|---|
| .beaver-skill/beaver-image-gen/EXTEND.md | Project directory |
| $HOME/.beaver-skill/beaver-image-gen/EXTEND.md | User home |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models
Schema: references/config/preferences-schema.md
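The project → user lookup order can be sketched as below. This is only an illustration: `locateExtendMd` and its injected `exists` callback are hypothetical names, not part of scripts/main.ts.

```typescript
// Hypothetical helper illustrating the lookup order; not part of scripts/main.ts.
type ExtendLocation = { scope: "project" | "user"; path: string };

// `exists` is injected so the lookup order can be exercised without touching disk.
function locateExtendMd(
  cwd: string,
  home: string,
  exists: (path: string) => boolean,
): ExtendLocation | null {
  const relative = ".beaver-skill/beaver-image-gen/EXTEND.md";
  const candidates: ExtendLocation[] = [
    { scope: "project", path: `${cwd}/${relative}` },
    { scope: "user", path: `${home}/${relative}` },
  ];
  // Project-level preferences take priority over user-level ones.
  for (const candidate of candidates) {
    if (exists(candidate.path)) return candidate;
  }
  // Not found: first-time setup has to run before any generation.
  return null;
}
```

In a real run, `exists` would typically be `fs.existsSync` from `node:fs`, with `process.cwd()` and `os.homedir()` as the two roots.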
First-Time Setup (when no EXTEND.md found)
Use AskUserQuestion with ALL questions in ONE call:
- Default Provider: Google (Recommended) | OpenAI | DashScope | Replicate
- Default Model (per selected provider):
  - Google: gemini-3-pro-image-preview (Recommended) | gemini-3.1-flash-image-preview | gemini-3-flash-preview
  - OpenAI: gpt-image-1.5 (Recommended) | gpt-image-1
  - DashScope: z-image-turbo (Recommended) | z-image-ultra
  - Replicate: google/nano-banana-pro (Recommended) | google/nano-banana
- Default Quality: 2k (Recommended) | normal
- Save Location: Project .beaver-skill/ (Recommended) | User ~/.beaver-skill/
EXTEND.md template:
---
version: 1
default_provider: [selected or null]
default_quality: [selected or null]
default_aspect_ratio: null
default_image_size: null
default_model:
  google: [selected or null]
  openai: null
  dashscope: null
  replicate: null
---
If EXTEND.md exists but default_model.[current_provider] is null → ask ONLY the model question for that provider, then update EXTEND.md.
See references/config/first-time-setup.md for detailed flow diagrams.
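As a rough sketch of the Step 0 decision, the three outcomes (full setup, model question only, ready) could be modeled as follows. The `Preferences` shape mirrors the EXTEND.md template fields; `nextSetupStep` and the step names are hypothetical and chosen for illustration.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

// Mirrors the fields of the EXTEND.md template.
interface Preferences {
  version: number;
  default_provider: Provider | null;
  default_quality: "normal" | "2k" | null;
  default_aspect_ratio: string | null;
  default_image_size: string | null;
  default_model: Record<Provider, string | null>;
}

// Step names are illustrative, not identifiers used by the skill.
type SetupStep = "full-setup" | "ask-model-only" | "ready";

// `prefs` is null when no EXTEND.md was found in either location.
function nextSetupStep(prefs: Preferences | null, provider: Provider): SetupStep {
  if (prefs === null) return "full-setup"; // generation stays blocked until EXTEND.md exists
  if (prefs.default_model[provider] === null) return "ask-model-only";
  return "ready";
}
```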
Usage
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (Alibaba Tongyi Wanxiang); the prompt means "a cute cat"
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# Replicate (google/nano-banana-pro)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
Options
| Option | Description |
|---|---|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read prompt from files (concatenated) |
| --image <path> | Output image path (required) |
| --provider google\|openai\|dashscope\|replicate | Force provider (default: google) |
| --model <id>, -m | Model ID (Google: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; OpenAI: gpt-image-1.5) |
| --ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size <WxH> | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (default: 2k) |
| --imageSize 1K\|2K\|4K | Image size for Google (default: from quality) |
| --ref <files...> | Reference images. Supported by Google multimodal (gemini-3-pro-image-preview, gemini-3-flash-preview, gemini-3.1-flash-image-preview) and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI |
| --n <count> | Number of images. Supported: OpenAI (except dall-e-3), Imagen, Replicate. Gemini multimodal and chat completions mode: warns and generates 1 |
| --json | JSON output |
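When the script is launched programmatically, the options above can be assembled into an argument list for `npx`. The sketch below is illustrative only: `GenerateOptions` and `buildArgs` are hypothetical names, and the check that a prompt source is present is an assumption rather than documented behavior of the script.

```typescript
// Hypothetical option bag; field names follow the CLI flags in the table.
interface GenerateOptions {
  prompt?: string;
  promptFiles?: string[];
  image: string;
  provider?: "google" | "openai" | "dashscope" | "replicate";
  model?: string;
  ar?: string;
  size?: string;
  quality?: "normal" | "2k";
  imageSize?: "1K" | "2K" | "4K";
  ref?: string[];
  n?: number;
  json?: boolean;
}

// Returns the arguments that follow `npx` in the usage examples.
function buildArgs(scriptPath: string, o: GenerateOptions): string[] {
  const hasPromptFiles = o.promptFiles !== undefined && o.promptFiles.length > 0;
  if (!o.prompt && !hasPromptFiles) {
    // Assumption: some prompt source is needed for a useful run.
    throw new Error("Provide --prompt or --promptfiles");
  }
  const args: string[] = ["-y", "bun", scriptPath];
  if (o.prompt) args.push("--prompt", o.prompt);
  if (o.promptFiles && hasPromptFiles) args.push("--promptfiles", ...o.promptFiles);
  args.push("--image", o.image);
  if (o.provider) args.push("--provider", o.provider);
  if (o.model) args.push("--model", o.model);
  if (o.ar) args.push("--ar", o.ar);
  if (o.size) args.push("--size", o.size);
  if (o.quality) args.push("--quality", o.quality);
  if (o.imageSize) args.push("--imageSize", o.imageSize);
  if (o.ref && o.ref.length > 0) args.push("--ref", ...o.ref);
  if (o.n !== undefined) args.push("--n", String(o.n));
  if (o.json) args.push("--json");
  return args;
}
```

Passing the result as an argument array (for example to `spawnSync("npx", args)` from `node:child_process`) avoids shell quoting problems with prompts that contain spaces or quotes.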
Environment Variables
| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| GOOGLE_API_KEY | Google API key |
| DASHSCOPE_API_KEY | DashScope API key (Alibaba Cloud) |
| REPLICATE_API_TOKEN | Replicate API token |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| GOOGLE_IMAGE_MODEL | Google model override |
| DASHSCOPE_IMAGE_MODEL | DashScope model override (default: z-image-turbo) |
| REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| GOOGLE_BASE_URL | Custom Google endpoint |
| DASHSCOPE_BASE_URL | Custom DashScope endpoint |
| REPLICATE_BASE_URL | Custom Replicate endpoint |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.beaver-skill/.env > ~/.beaver-skill/.env
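The three environment layers at the low end of this chain (process env vars, then the project .env, then the home .env) can be merged with a first-defined-wins rule. The following is a minimal sketch under that reading; `parseDotEnv` and `mergeEnvLayers` are hypothetical helpers, and the parser handles only plain KEY=VALUE lines.

```typescript
type Env = Record<string, string | undefined>;

// Minimal KEY=VALUE parser: skips blank lines and # comments, strips matching quotes.
// Real .env parsers handle more cases (export prefixes, multiline values, escapes).
function parseDotEnv(text: string): Env {
  const out: Env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=": ignore the line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    out[key] = value;
  }
  return out;
}

// Layers are passed highest priority first; the first defined value for a key wins.
function mergeEnvLayers(...layers: Env[]): Env {
  const merged: Env = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      const value = layer[key];
      if (value !== undefined && merged[key] === undefined) merged[key] = value;
    }
  }
  return merged;
}
```

Calling `mergeEnvLayers(process.env, projectDotEnv, homeDotEnv)` reproduces the env vars > <cwd>/.beaver-skill/.env > ~/.beaver-skill/.env part of the order; CLI args and EXTEND.md would still be applied on top.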
Model Resolution
Model priority (highest → lowest), applies to all providers:
- CLI flag: --model <id>
- EXTEND.md: default_model.[provider]
- Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)
- Built-in default
EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
Agent MUST display model info before each generation:
- Show: Using [provider] / [model]
- Show switch hint: Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL
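The priority order and the pre-generation banner can be sketched together. `resolveModel` and `modelBanner` are hypothetical names; the built-in default is passed in by the caller because this document states it only for DashScope and Replicate.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

interface ModelSources {
  cliModel?: string;           // --model <id>
  extendModel?: string | null; // EXTEND.md default_model.[provider]
  envModel?: string;           // <PROVIDER>_IMAGE_MODEL
  builtInDefault: string;      // supplied by the caller
}

// Highest priority first: CLI flag, EXTEND.md, env var, built-in default.
function resolveModel(sources: ModelSources): string {
  return sources.cliModel || sources.extendModel || sources.envModel || sources.builtInDefault;
}

// Builds the two lines the agent displays before each generation.
function modelBanner(provider: Provider, model: string): string[] {
  const envName = `${provider.toUpperCase()}_IMAGE_MODEL`;
  return [
    `Using ${provider} / ${model}`,
    `Switch model: --model <id> | EXTEND.md default_model.${provider} | env ${envName}`,
  ];
}
```

With EXTEND.md set to gemini-3-pro-image-preview and GOOGLE_IMAGE_MODEL set to gemini-3.1-flash-image-preview, `resolveModel` returns the EXTEND.md value, matching the rule that EXTEND.md overrides env vars.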
Replicate Models
Supported model formats:
- owner/name (recommended for official models), e.g. google/nano-banana-pro
- owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>
Examples:
# Use Replicate default model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
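The two identifier formats can be told apart with a small parser. This is a sketch only: `parseReplicateModel` is a hypothetical helper, and the character rules in the pattern are an assumption rather than Replicate's official grammar.

```typescript
interface ReplicateModelRef {
  owner: string;
  name: string;
  version?: string; // present only for the owner/name:version form
}

// Accepts "owner/name" or "owner/name:version"; rejects anything else.
function parseReplicateModel(id: string): ReplicateModelRef {
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(id);
  if (!match) throw new Error(`Invalid Replicate model id: ${id}`);
  const owner = match[1];
  const name = match[2];
  const version = match[3];
  return version ? { owner, name, version } : { owner, name };
}
```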
Provider Selection
- --ref provided + no --provider → auto-select Google first, then OpenAI, then Replicate
- --provider specified → use it (if --ref, must be google, openai, or replicate)
- Only one API key available → use that provider
- Multiple available → default to Google
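One way to read these rules as code is shown below. `selectProvider` is a hypothetical function. Two details are assumptions not stated above: the auto-selection for --ref only considers providers whose API key is available, and when several keys exist without a Google key the fallback follows the fixed order in `DEFAULT_ORDER`.

```typescript
type Provider = "google" | "openai" | "dashscope" | "replicate";

// Providers that accept --ref, in auto-select order.
const REF_CAPABLE: Provider[] = ["google", "openai", "replicate"];
// Assumption: the order beyond "Google first" is not specified by the rules above.
const DEFAULT_ORDER: Provider[] = ["google", "openai", "dashscope", "replicate"];

interface SelectionInput {
  provider?: Provider;   // value of --provider, if given
  hasRef: boolean;       // whether --ref was passed
  available: Provider[]; // providers with an API key configured
}

function selectProvider({ provider, hasRef, available }: SelectionInput): Provider {
  if (provider) {
    if (hasRef && REF_CAPABLE.indexOf(provider) === -1) {
      throw new Error(`--ref is not supported by provider "${provider}"`);
    }
    return provider;
  }
  if (available.length === 0) throw new Error("No API key found; see Environment Variables");
  const order = hasRef ? REF_CAPABLE : DEFAULT_ORDER;
  const pick = order.find((candidate) => available.indexOf(candidate) !== -1);
  if (!pick) throw new Error("--ref needs a Google, OpenAI, or Replicate API key");
  return pick;
}
```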
Quality Presets
| Preset | Google imageSize | OpenAI Size | Use Case |
|---|---|---|---|
| normal | 1K | 1024px | Quick previews |
| 2k (default) | 2K | 2048px | Covers, illustrations, infographics |
Google imageSize: Can be overridden with --imageSize 1K|2K|4K
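The preset table and the --imageSize override amount to a small lookup. The sketch below uses hypothetical names (`PRESETS`, `resolveGoogleImageSize`) and takes its values directly from the table.

```typescript
type Quality = "normal" | "2k";
type GoogleImageSize = "1K" | "2K" | "4K";

// Values copied from the Quality Presets table.
const PRESETS: Record<Quality, { googleImageSize: GoogleImageSize; openaiPixels: number }> = {
  normal: { googleImageSize: "1K", openaiPixels: 1024 },
  "2k": { googleImageSize: "2K", openaiPixels: 2048 },
};

// An explicit --imageSize wins; otherwise the size follows the quality preset (default 2k).
function resolveGoogleImageSize(quality: Quality = "2k", override?: GoogleImageSize): GoogleImageSize {
  return override ?? PRESETS[quality].googleImageSize;
}
```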
Aspect Ratios
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
- Google multimodal: uses imageConfig.aspectRatio
- Google Imagen: uses aspectRatio parameter
- OpenAI: maps to closest supported size
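"Maps to closest supported size" can be implemented in several ways; the sketch below picks the candidate whose width/height ratio is nearest on a log scale. `parseAspectRatio` and `closestSize` are hypothetical helpers, and the candidate sizes in the usage note are illustrative assumptions, not the list the script actually uses.

```typescript
// Parses "W:H" (decimals allowed, e.g. "2.35:1") into a width/height ratio.
function parseAspectRatio(ratio: string): number {
  const match = /^(\d+(?:\.\d+)?):(\d+(?:\.\d+)?)$/.exec(ratio.trim());
  if (!match) throw new Error(`Invalid aspect ratio: ${ratio}`);
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width <= 0 || height <= 0) throw new Error(`Invalid aspect ratio: ${ratio}`);
  return width / height;
}

// Returns the "WxH" candidate whose ratio is closest to the requested one.
// The log scale treats "twice as wide" and "twice as tall" as equally far away.
function closestSize(ratio: string, sizes: string[]): string {
  if (sizes.length === 0) throw new Error("No candidate sizes given");
  const target = parseAspectRatio(ratio);
  let best = sizes[0];
  let bestDistance = Infinity;
  for (const size of sizes) {
    const [w, h] = size.split("x").map(Number);
    const distance = Math.abs(Math.log(target / (w / h)));
    if (distance < bestDistance) {
      best = size;
      bestDistance = distance;
    }
  }
  return best;
}
```

For example, with the assumed candidates 1024x1024, 1536x1024, and 1024x1536, the ratios 16:9 and 2.35:1 both resolve to 1536x1024, while 9:16 resolves to 1024x1536.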
Generation Mode
Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.
Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.
| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
Parallel Settings (when requested):
| Setting | Value |
|---|---|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
Agent Implementation (parallel mode only):
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
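The skill itself runs parallel work through Task subagents. For readers who want the same concurrency cap in ordinary code, the sketch below is an in-process analogue, not the Task mechanism: `runWithConcurrency` is a hypothetical helper that defaults to the recommended limit of 4 and never exceeds the maximum of 8.

```typescript
// Runs tasks with at most `concurrency` in flight; results keep the input order.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  concurrency = 4, // recommended concurrency from the table above
): Promise<T[]> {
  // Clamp to the documented maximum of 8 and never start more workers than tasks.
  const limit = Math.max(1, Math.min(concurrency, 8, tasks.length));
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  await Promise.all(Array.from({ length: limit }, () => worker()));
  return results;
}
```

If any task rejects, the returned promise rejects as well, so callers that want partial results should catch errors inside each task.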
Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry once
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint (switch to Google multimodal: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; or OpenAI GPT Image edits)
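The "auto-retry once" rule can be expressed as a small wrapper. This is a sketch under the assumption that any thrown error counts as a generation failure; `withSingleRetry` is a hypothetical name and does not describe how scripts/main.ts is actually structured.

```typescript
// Runs `attempt`; on failure, tries exactly one more time.
async function withSingleRetry<T>(attempt: () => Promise<T>): Promise<T> {
  try {
    return await attempt();
  } catch {
    // One automatic retry; a second failure propagates to the caller.
    return await attempt();
  }
}
```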
Extension Support
Custom configurations via EXTEND.md. See Step 0: Load Preferences for paths and supported options.