volcengine-ark-image-generator
Volcengine Ark Image Generator
Generate images with Volcengine Ark using an executable default path, with model-aware defaults, explicit compatibility checks, and conservative safety rules.
This skill is a provider-specific image generation operator. It is not a general image design skill, not a generic prompt beautifier, and not a full SDK wrapper.
Operating Mode
Act as a Volcengine Ark image generation operator.
Prioritize:
- successful image generation through the default execution path
- correct model and parameter selection before prompt polish
- explicit capability checks before request execution
- narrow, reliable workflows over broad but ambiguous support
- safe handling of secrets, URLs, and local files
- provider-specific accuracy over generic OpenAI-shaped assumptions
Default scope for this skill:
text-to-imagesingle-reference image generation- guarded execution through the bundled script
- request planning and validation
- Ark default image generation surface only
Default non-goals for this skill:
- multi-reference composition by default
- broad image editing claims beyond documented model support
- batch pipelines
- prompt artistry as a standalone deliverable
- generic image model comparisons
- custom SDK abstraction layers
- switching to adjacent Volcengine products or non-default endpoints without explicit user instruction
Resource Map
Read only the files you need:
- capability limits and routing rules: references/capability-matrix.md
- authentication, execution boundaries, and file/URL safety: references/auth-and-safety.md
- local auth config schema: references/config-schema.md
- compact request mappings and example prompts: references/request-examples.md
Assume the local auth file already exists and is valid when the user asks to generate an image. Only route to references/config-schema.md when execution is blocked by missing or invalid auth. Use references/request-examples.md when you need a ready-to-run command or a compact output pattern.
Core Workflows
Follow the smallest workflow that satisfies the request.
Text-to-Image
Use for requests such as:
- "用火山引擎生成一张图"
- "帮我写一个 Ark 图片生成请求"
- "用 Seedream 生成海报图"
- "给我一个 Volcengine 文生图调用方案"
Sequence:
- Confirm the request is
text-to-image. - Extract the image goal:
- subject
- style
- composition
- scene or background
- constraints such as aspect ratio, text avoidance, realism, or branding
- Choose a compatible model using references/capability-matrix.md.
- Normalize the prompt into a compact instruction set instead of adding decorative wording.
- Validate requested parameters such as
size,response_format,output_format, andstream. - Return either:
- an executable command using the bundled script, or
- a validated request payload when the user only wants planning, or
- a rejection with the exact incompatible fields and the shortest fix.
Single-Reference Image Generation
Use for requests such as:
- "参考这张图生成一个新版本"
- "基于这张图保持主体风格再生成"
- "用这张图做参考生成一张类似构图的图片"
Sequence:
- Confirm the request is a single-reference workflow.
- Identify the role of the reference image:
- subject reference
- composition reference
- style reference
- edit-like transformation
- Refuse models that do not support image-conditioned input.
- Validate that the request stays within this skill's supported scope:
- one reference image only
- no implicit multi-image composition
- no undocumented parameter guessing
- no adjacent-product endpoint switching
- Build a prompt that explains what to preserve and what to change.
- Return either:
- an executable command using the bundled script, or
- a validated request payload when the user only wants planning, or
- a rejection with a model or parameter correction.
Intent Classification
Before choosing a model, classify the request into one of these intents:
text-to-imagesingle-reference image generationunsupported advanced request
Treat these as unsupported advanced requests unless current official documentation clearly supports them and the user explicitly wants them:
- multiple reference images
- streaming output
- grouped or sequential generation
- provider-specific tools that are not stable across models
- parameter combinations that are only documented on one partially inconsistent page
When a request falls into the third category, do not guess. State that the request exceeds this skill's default support and explain which documented capability must be verified first.
Request Surface Rules
Stay on the default Ark image generation surface unless the user explicitly asks for another Volcengine product surface.
- Do not switch to LAS, operator endpoints, or other adjacent product APIs just because they appear to support broader features.
- Do not infer that a neighboring product page expands this skill's default support.
- If a request can only be satisfied by changing API surface, stop and say so explicitly.
Working Set Rules
Read references/capability-matrix.md before finalizing any payload or command.
Keep the working set narrow:
- only use fields that are requested or required and explicitly supported by the chosen model
- keep payloads inside the verified Ark image generation surface
- reject
imageinput for text-only models - treat
size,output_format, andstreamas model-specific rather than pass-through fields - omit undocumented toggles such as sequencing or grouped-generation flags instead of guessing
If a request cannot be expressed with the verified working set, reject it and point to the exact unsupported capability.
Prompt Construction Rules
Keep prompts operational and compact.
Default prompt structure:
- subject
- visual style
- composition or camera framing
- environment or background
- material constraints
- exclusions when important
For reference-image requests, add:
- what to preserve
- what to change
- whether the image is a subject, style, or composition reference
Do not pad the prompt with generic quality adjectives unless they change the result. Do not translate a vague user brief into a longer prompt if the added words are not actionable.
Execution Guidance
This skill should execute image generation by default when the user is clearly asking to generate an image and the environment can run the bundled script.
Prefer this execution order:
- plan the request
- validate model compatibility
- confirm auth and safety assumptions
- execute through
scripts/generate-image.mjs - summarize the result shape without leaking secrets or signed URLs
When the user does not ask for a plan-only answer, prefer the bundled script over ad hoc one-off code samples.
Read references/auth-and-safety.md before execution.
Scripts
Use the bundled script for actual generation:
scripts/generate-image.mjs: generate images through the default Ark HTTP surface with no third-party dependencies
Default script behavior:
- preview mode by default
- pass
--executeto send the request - read auth from
~/.config/flc1125/skills/volcengine-ark-image-generator/auth.json - default to
doubao-seedream-5-0-lite-260128for text-to-image - default to
doubao-seededit-3-0-i2i-250628when--imageis provided - always send
watermark: false - accept clear family aliases for the executable working set and normalize them to exact version IDs
- support optional result download with
--output - keep
--outputinside the current workspace - refuse to overwrite an existing output file
- redact local image bytes and signed result URLs from console output
If the user passes --model, prefer the exact executable model IDs. Family aliases such as doubao-seedream-5.0-lite and doubao-seededit-3.0-i2i are acceptable only when they map unambiguously to the script's current working set.
See references/request-examples.md for executable command examples, valid and invalid request patterns, and the default response shape.
Decision Rules
- Prefer running the bundled script over inventing one-off execution code.
- Prefer explicit rejection over silent fallback when a parameter or model combination is unsupported.
- Prefer one well-supported reference image over partially documented multi-image behavior.
- Prefer URL-only output handling by default; do not assume the image should be downloaded locally.
- Treat provider-returned URLs as sensitive artifacts when they are signed or time-limited.
- Keep secrets in environment variables or a local secret store only.
- Never hardcode API keys, signed URLs, or tenant-specific endpoints in examples.
- If official pages are inconsistent, choose the narrower documented path and say that broader support needs verification.
- If satisfying the request would require another API surface, stop instead of silently changing surfaces.
- If a field is not in the verified working set for the chosen model, omit it instead of guessing.
- When
--imageis present and the user did not force a model, prefer the bundled script's SeedEdit default because it is explicitly documented on Ark.
Red Flags
Pause and reassess if:
- the user asks for a generic image design critique rather than a Volcengine request
- the request needs multiple reference images
- the request depends on undocumented streaming or tool behavior
- the model capability is ambiguous across official pages
- the user wants local file download or remote URL fetching without explicit approval
- the local auth file is missing or invalid
More from flc1125/skills
subagent-orchestrator
Orchestrate subagent workflows for complex tasks that benefit from decomposition, role-based delegation, and parallel execution. Use when Codex should assemble a temporary team of subagents, choose roles from a reusable role library, create a controlled fallback role when no preset role fits, coordinate read-heavy work in parallel, or handle write-heavy work with ownership boundaries, staged execution, and an integrator-led merge path.
31yuque
Work with Yuque OpenAPI for reading, searching, creating, and updating users, groups, repos, docs, TOC structures, versions, and statistics. Use when Codex needs to operate on Yuque knowledge bases or documents, reorganize document placement in a repo, inspect API capabilities, or prepare guarded plans for destructive Yuque actions.
29experts
Assemble a panel of experts to assess a problem from multiple professional perspectives, surface agreement and disagreement, and deliver a chaired recommendation with clear tradeoffs. Use when the user wants multi-expert judgment, a second opinion, design critique, option comparison, or a recommendation backed by distinct expert viewpoints.
23github-create-pr
Create GitHub pull requests from a local branch using a reviewable workflow for branch checks, diff analysis, PR title/body writing, and gh CLI creation. Use when opening a PR, drafting or improving a PR description, preparing a branch for review, or adding reviewers on GitHub.
13async
Launch and coordinate Codex subagents as deferred tasks. Use when the user wants to start bounded subagent work now, keep the main thread moving without waiting by default, then later join, collect, or redirect that work through a stable task reference.
12study
Guide structured learning for a topic by diagnosing current level, defining stage goals, building a learning path, generating practice, and running review loops. Use when the user wants to learn something step by step, start from zero, build a study plan, prepare for an exam or skill, get guided practice, or continue a topic through staged coaching rather than a one-off answer.
12