baoyu-image-gen
Image Generation (AI SDK)
Official API-based image generation. Supports the OpenAI, Google, and DashScope (Alibaba Tongyi Wanxiang) providers.
Script Directory
Agent Execution:
- SKILL_DIR = this SKILL.md file's directory
- Script path = ${SKILL_DIR}/scripts/main.ts
Preferences (EXTEND.md)
Use Bash to check EXTEND.md existence (priority order):
# Check project-level first
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
| Path | Location |
|---|---|
| .baoyu-skills/baoyu-image-gen/EXTEND.md | Project directory |
| $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md | User home |

| Result | Action |
|---|---|
| Found | Read, parse, apply settings |
| Not found | Use defaults |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models
Schema: references/config/preferences-schema.md
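
The lookup order above (project-level first, then user-level) can be sketched in TypeScript. This is a minimal illustration, not the skill's actual implementation: `findExtendPath` and the injected `exists` callback are hypothetical names, and only the two paths come from the table above.

```typescript
// Hypothetical helper: returns the first EXTEND.md candidate that exists.
// `exists` is injected so the sketch does not depend on a specific runtime.
type ExtendLocation = { scope: "project" | "user"; path: string };

function findExtendPath(
  home: string,
  exists: (path: string) => boolean,
): ExtendLocation | null {
  const candidates: ExtendLocation[] = [
    // The project-level file is checked first.
    { scope: "project", path: ".baoyu-skills/baoyu-image-gen/EXTEND.md" },
    // The user-level file is the fallback.
    { scope: "user", path: `${home}/.baoyu-skills/baoyu-image-gen/EXTEND.md` },
  ];
  // null means "not found": the caller would use defaults.
  return candidates.find((c) => exists(c.path)) ?? null;
}
```

In a real script, `exists` could be a thin wrapper around the runtime's file-existence check.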
Usage
# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (Alibaba Tongyi Wanxiang)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
Options
| Option | Description |
|---|---|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required) |
| `--provider google\|openai\|dashscope` | Force provider (default: google) |
| `--model <id>`, `-m` | Model ID (`--ref` with OpenAI requires GPT Image model, e.g. `gpt-image-1.5`) |
| `--ar <ratio>` | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| `--size <WxH>` | Size (e.g., 1024x1024) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI |
| `--n <count>` | Number of images |
| `--json` | JSON output |
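
When an agent assembles the command programmatically, building an argument array avoids shell-quoting problems with prompts that contain spaces or quotes. The following is a sketch under assumptions: `ImageGenOptions` and `buildArgs` are hypothetical names, and only the flag names are taken from the options above.

```typescript
// Hypothetical options object; field names mirror the documented flags.
interface ImageGenOptions {
  prompt?: string;
  promptFiles?: string[];
  image: string; // required output path
  provider?: "google" | "openai" | "dashscope";
  model?: string;
  ar?: string;
  quality?: "normal" | "2k";
  ref?: string[];
  n?: number;
  json?: boolean;
}

// Turns the options object into an argv-style list for scripts/main.ts.
function buildArgs(o: ImageGenOptions): string[] {
  const args: string[] = [];
  if (o.prompt !== undefined) args.push("--prompt", o.prompt);
  if (o.promptFiles?.length) args.push("--promptfiles", ...o.promptFiles);
  args.push("--image", o.image);
  if (o.provider) args.push("--provider", o.provider);
  if (o.model) args.push("--model", o.model);
  if (o.ar) args.push("--ar", o.ar);
  if (o.quality) args.push("--quality", o.quality);
  if (o.ref?.length) args.push("--ref", ...o.ref);
  if (o.n !== undefined) args.push("--n", String(o.n));
  if (o.json) args.push("--json");
  return args;
}
```

The resulting array would be appended to `npx -y bun ${SKILL_DIR}/scripts/main.ts` by whatever process-spawning API the caller uses.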
Environment Variables
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key (Alibaba Cloud) |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: z-image-turbo) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env
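
The load priority can be read as a layered merge in which a higher-priority source overrides a lower one key by key. The sketch below assumes every source has already been normalized to the same key names (for example `provider`, `quality`), which the source does not specify; `resolveSettings` is a hypothetical helper.

```typescript
type Settings = Record<string, string | undefined>;

// Sources are passed from highest to lowest priority, matching the order
// "CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env".
function resolveSettings(...sourcesHighToLow: Settings[]): Record<string, string> {
  const result: Record<string, string> = {};
  // Walk from lowest to highest priority so that later writes win.
  for (const source of [...sourcesHighToLow].reverse()) {
    for (const key of Object.keys(source)) {
      const value = source[key];
      if (value !== undefined) result[key] = value;
    }
  }
  return result;
}
```

For instance, a `provider` given on the command line would replace the one in EXTEND.md, while a `quality` set only in EXTEND.md would still apply.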
Provider Selection
- `--ref` provided + no `--provider` → auto-select Google first, then OpenAI
- `--provider` specified → use it (if `--ref`, must be `google` or `openai`)
- Only one API key available → use that provider
- Multiple available → default to Google
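
The selection rules above can be expressed as one small function. This is a sketch under assumptions, not the script's actual code: `selectProvider` and `SelectionInput` are hypothetical, `available` stands for the providers whose API key is configured, and the case of no configured key is assumed to be reported elsewhere as a missing-key error.

```typescript
type Provider = "google" | "openai" | "dashscope";

interface SelectionInput {
  provider?: Provider; // value of --provider, if given
  hasRef: boolean; // true when --ref was passed
  available: Provider[]; // providers with a configured API key
}

function selectProvider({ provider, hasRef, available }: SelectionInput): Provider {
  if (provider) {
    // An explicit provider is used as-is, but --ref restricts the choice.
    if (hasRef && provider === "dashscope") {
      throw new Error("--ref requires --provider google or openai");
    }
    return provider;
  }
  if (hasRef) {
    // Reference images: try Google first, then OpenAI.
    const candidates: Provider[] = ["google", "openai"];
    const match = candidates.find((p) => available.includes(p));
    if (!match) throw new Error("--ref needs a Google or OpenAI API key");
    return match;
  }
  // Exactly one key configured: use that provider.
  if (available.length === 1) return available[0];
  // Several keys configured: default to Google.
  return "google";
}
```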
Quality Presets
| Preset | Google imageSize | OpenAI Size | Use Case |
|---|---|---|---|
| `normal` | 1K | 1024px | Quick previews |
| `2k` (default) | 2K | 2048px | Covers, illustrations, infographics |
Google imageSize: Can be overridden with --imageSize 1K|2K|4K
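
As an illustration of how the preset and the `--imageSize` override interact, the sketch below encodes the table as a lookup. `PRESETS` and `resolveGoogleImageSize` are hypothetical names; the values are copied from the table above.

```typescript
type Quality = "normal" | "2k";
type GoogleImageSize = "1K" | "2K" | "4K";

// Values copied from the Quality Presets table.
const PRESETS: Record<Quality, { googleImageSize: GoogleImageSize; openaiPx: number }> = {
  normal: { googleImageSize: "1K", openaiPx: 1024 },
  "2k": { googleImageSize: "2K", openaiPx: 2048 },
};

function resolveGoogleImageSize(
  quality: Quality = "2k", // documented default preset
  imageSizeOverride?: GoogleImageSize,
): GoogleImageSize {
  // An explicit --imageSize wins over the value implied by the preset.
  return imageSizeOverride ?? PRESETS[quality].googleImageSize;
}
```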
Aspect Ratios
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
- Google multimodal: uses `imageConfig.aspectRatio`
- Google Imagen: uses `aspectRatio` parameter
- OpenAI: maps to closest supported size
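
One plausible way to "map to the closest supported size" is to compare aspect ratios in log space and pick the nearest candidate. The source does not say how the script does this, so the following is only a sketch: `parseRatio` and `closestSize` are hypothetical, and `CANDIDATE_SIZES` is an illustrative list that should be checked against the provider's documentation.

```typescript
// Parse "W:H" (e.g. "16:9", "2.35:1") into a width/height ratio.
function parseRatio(ar: string): number {
  const [w, h] = ar.split(":").map(Number);
  if (!(w > 0) || !(h > 0)) throw new Error(`Invalid aspect ratio: ${ar}`);
  return w / h;
}

// ASSUMPTION: illustrative candidates, not a confirmed list of supported sizes.
const CANDIDATE_SIZES = ["1024x1024", "1536x1024", "1024x1536"];

function closestSize(ar: string, sizes: string[] = CANDIDATE_SIZES): string {
  const target = Math.log(parseRatio(ar));
  let best = sizes[0];
  let bestDist = Infinity;
  for (const size of sizes) {
    const [w, h] = size.split("x").map(Number);
    // Log space makes 2:1 and 1:2 equally distant from 1:1.
    const dist = Math.abs(Math.log(w / h) - target);
    if (dist < bestDist) {
      best = size;
      bestDist = dist;
    }
  }
  return best;
}
```

With the candidate list above, `closestSize("16:9")` returns `"1536x1024"` and `closestSize("9:16")` returns `"1024x1536"`.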
Generation Mode
Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.
Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.
| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
Parallel Settings (when requested):
| Setting | Value |
|---|---|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
Agent Implementation (parallel mode only):
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
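
The mode decision and concurrency limits above can be summarized in a small planning helper. This is a sketch only: `planGeneration` is a hypothetical name, and it covers just the choice of mode and concurrency, not the launching of subagents.

```typescript
interface GenerationPlan {
  mode: "sequential" | "parallel";
  concurrency: number;
}

const RECOMMENDED_CONCURRENCY = 4; // from the Parallel Settings table
const MAX_CONCURRENCY = 8; // from the Parallel Settings table

function planGeneration(
  userRequestedParallel: boolean,
  requestedConcurrency: number = RECOMMENDED_CONCURRENCY,
): GenerationPlan {
  // Sequential unless the user explicitly asked for parallel generation.
  if (!userRequestedParallel) return { mode: "sequential", concurrency: 1 };
  // Clamp the requested value into the range 1..MAX_CONCURRENCY.
  const concurrency = Math.min(
    Math.max(Math.floor(requestedConcurrency), 1),
    MAX_CONCURRENCY,
  );
  return { mode: "parallel", concurrency };
}
```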
Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry once
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint (switch to Google multimodal or OpenAI GPT Image edits)
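
The "warning, proceed with default" behavior for invalid aspect ratios might look like the sketch below. It assumes that "invalid" means "not in the supported list", and it takes the fallback as a parameter because the source does not state the default ratio; `checkAspectRatio` is a hypothetical helper.

```typescript
// Ratios listed in the Aspect Ratios section.
const SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

interface RatioResult {
  ratio: string;
  warning?: string; // set only when the input was rejected
}

// `fallback` is supplied by the caller; the default ratio is not documented.
function checkAspectRatio(input: string, fallback: string): RatioResult {
  if (SUPPORTED_RATIOS.includes(input)) return { ratio: input };
  return {
    ratio: fallback,
    warning: `Unsupported aspect ratio "${input}"; using ${fallback}`,
  };
}
```

Returning the warning instead of printing it lets the caller decide how to surface it, for example inside `--json` output.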
Extension Support
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.