blog-illustration
Blog Illustration Prompt Generator
Generate prompts for image generation models (Gemini, Midjourney, DALL-E, etc.) that produce colorful cartoon-style infographic illustrations for blog posts.
The output is a text prompt — not an image. The user takes the prompt to their preferred image generation model.
The prompt itself must be written in English, regardless of the conversation language. Image models perform best with English prompts. Only use Chinese in the prompt when the user explicitly requests Chinese labels in the image.
Style DNA
Every illustration shares these visual traits. This is the non-negotiable foundation that keeps illustrations consistent across different blog posts.
- Cartoon infographic — a hybrid of illustration and information graphics. Not flat/minimal design, not formal architecture diagrams, not corporate clip art
- White background with soft pastel color-coded zones (rounded rectangles with subtle fills)
- Cute characters with personality — each abstract concept becomes a concrete visual metaphor. No generic robots, no gear icons, no floating screens with "AI" badges
- Simple faces — dots for eyes, simple expressions. Charming but not childish. Think Dropbox/Notion illustration style
- Thin rounded arrows in dark gray for flow and connections
- Clean white space between zones — leave room to breathe
- English labels by default — all text labels in the prompt (zone names, character names, flow annotations) use English. Image models render English reliably. Only use Chinese labels when the user explicitly requests it
- 16:9 aspect ratio, high quality, clean edges, no blur, no gradients
- Soft pastel palette: baby blue, lavender/pink, warm cream/yellow, sage green. Adjust per piece but stay in this family
Process
1. Analyze the content
Read the text that needs illustration. Identify:
- Components: The key actors, concepts, or stages (3-6 is ideal for one illustration; more than 6 means you should suggest splitting into multiple images)
- Relationships: How do they connect? Sequential flow, hierarchy, cycle, hub-and-spoke, or loose association?
- Groupings: Are there natural clusters or layers?
- The one thing: What single idea should a reader grasp at a glance, before reading any labels?
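The analysis step can be captured as a small record before any design work starts. The following is a minimal sketch, not part of the original workflow: the `ContentAnalysis` class, its field names, and the sample values are hypothetical. Only the threshold of 6 components comes from the guideline above.

```python
from dataclasses import dataclass, field

# Threshold from the guideline above: more than 6 components means
# suggesting a split into multiple images.
MAX_COMPONENTS = 6


@dataclass
class ContentAnalysis:
    """Hypothetical container for the four things identified in step 1."""
    components: list[str]
    relationship: str  # e.g. "sequential", "hierarchy", "cycle", "hub-and-spoke", "loose"
    groupings: dict[str, list[str]] = field(default_factory=dict)
    one_thing: str = ""

    def should_split(self) -> bool:
        """True when the content is too dense for a single illustration."""
        return len(self.components) > MAX_COMPONENTS


# Illustrative values only.
analysis = ContentAnalysis(
    components=["Weaver", "Sentinel", "Painter"],
    relationship="sequential",
    groupings={"Maintenance": ["Weaver", "Sentinel"], "Insight": ["Painter"]},
    one_thing="Notes are maintained automatically",
)
print(analysis.should_split())
```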
2. Design character metaphors
This is the most important step. Each abstract component needs a concrete visual form that hints at its function.
Principles:
- Function drives form. A component that connects things → spider weaving silk. A component that cleans/audits → gardener pruning branches. A component that observes patterns → owl. A component that wanders freely → firefly.
- Visual distinctness. Every character should be immediately distinguishable in silhouette. If two characters both look like "small robot doing X," the illustration fails.
- No generic defaults. Never fall back to "a robot," "a gear," "a monitor with code," or "a person at a desk with sparkles." Push for a specific metaphor that carries meaning.
- Present options. When the best metaphor isn't obvious, offer 2-3 alternatives with brief reasoning so the user can choose.
Reference examples:
| Function | Weak | Strong |
|---|---|---|
| Links/connects items | Robot with wires | Spider weaving silk between cards |
| Audits/cleans/maintains | Robot with magnifying glass | Gardener pruning dead branches |
| Generates profile from behavioral data | Brain with arrows | Painter creating portrait from scattered fragments |
| Equal partnership | Two robots | Two silhouettes back-to-back, one human-shaped, one geometric |
| Free exploration with occasional output | Floating robot | Firefly drifting lazily, glowing when it finds something |
| Filters or guards | Shield icon | Cat sitting on a fence, letting some things pass |
| Schedules or orchestrates | Clock icon | Conductor with a baton, cueing different performers |
3. Plan the layout
Choose a layout pattern based on the relationship structure:
- Z-flow (top-left → top-right → bottom-left → bottom-right): For sequential processes with 3-4 stages. Follows natural reading direction.
- Hub-and-spoke: For a central concept with multiple related elements radiating outward.
- Layered bands (horizontal or vertical): For systems with distinct tiers or phases.
- Scattered/organic: For loosely related elements. Use sparingly — it's easy to look messy.
If an element deliberately breaks the pattern (e.g., something autonomous that doesn't fit the main structure), position it outside the organized zones — floating, slightly translucent, with dashed connections. This visual separation communicates "this one is different" without explanation.
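The layout choice above can be sketched as a lookup from relationship type to pattern. This is an illustrative sketch under stated assumptions: the key names and the `choose_layout` function are hypothetical, mapping "hierarchy" to layered bands is an interpretation of "distinct tiers", and falling back to layered bands for sequences outside the 3-4 stage range is an assumption, not a rule from the text. Relationship types the text does not cover (such as a cycle) return `None` so the choice is made by hand.

```python
from typing import Optional

# Relationship type -> layout pattern, following the list above.
# "hierarchy" -> "layered-bands" is an assumption (tiers read as hierarchy).
LAYOUTS = {
    "sequential": "z-flow",
    "hub-and-spoke": "hub-and-spoke",
    "hierarchy": "layered-bands",
    "loose": "scattered",
}


def choose_layout(relationship: str, stage_count: int = 0) -> Optional[str]:
    """Return a layout pattern name, or None when the text gives no guidance."""
    if relationship == "sequential" and not 3 <= stage_count <= 4:
        # Z-flow is described for 3-4 stages; using bands otherwise is an assumption.
        return "layered-bands"
    return LAYOUTS.get(relationship)


print(choose_layout("sequential", stage_count=4))
print(choose_layout("cycle"))
```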
4. Assign colors and zones
- Each logical group gets a distinct soft pastel zone (rounded rectangle)
- Use color to reinforce grouping, not for decoration
- Use 3-4 zone colors at most. More than that becomes visual noise
- Special or anomalous elements: desaturated, translucent, or no zone background at all
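As a rough sketch of the zone rules above, the helper below pairs each logical group with one palette color and refuses more than four zones. The `assign_zone_colors` function is hypothetical; the color names are shortened from the palette listed under Style DNA, and assigning them in list order is an arbitrary choice.

```python
# Palette family from the Style DNA section (names shortened).
PALETTE = ["baby blue", "lavender", "warm cream", "sage green"]
MAX_ZONE_COLORS = 4


def assign_zone_colors(groups: list[str]) -> dict[str, str]:
    """Give each logical group its own pastel zone color, in palette order."""
    if len(groups) > MAX_ZONE_COLORS:
        raise ValueError(
            f"{len(groups)} zones requested; merge groups or split the illustration"
        )
    return dict(zip(groups, PALETTE))


zones = assign_zone_colors(["Maintenance", "Insight", "Output"])
print(zones)
```

Special or anomalous elements are deliberately left out of `groups`, since they get no zone background.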
5. Write text labels
Before writing the prompt, list all text labels that should appear in the image. Default to English for everything — image models render English reliably.
- Zone labels: e.g. "Maintenance", "Insight", "Output"
- Character labels: e.g. "Weaver (daily)", "Sentinel (weekly)"
- Flow annotations: e.g. "changes", "approve", "reject"
- Key elements: e.g. "Notes Vault", "Profile"
Keep labels short — 2-4 words per label. Only use Chinese labels when the user explicitly asks for them.
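A quick way to catch overlong labels before writing the prompt is a word-count check. This sketch assumes the 2-4 word guideline acts as an upper limit, since the single-word examples above ("Maintenance", "changes") are clearly acceptable. The `overlong_labels` function and the last sample label are hypothetical.

```python
MAX_WORDS_PER_LABEL = 4  # upper end of the 2-4 word guideline


def overlong_labels(labels: list[str]) -> list[str]:
    """Return the labels that exceed the word limit, splitting on whitespace."""
    return [label for label in labels if len(label.split()) > MAX_WORDS_PER_LABEL]


labels = [
    "Maintenance",
    "Weaver (daily)",
    "Notes Vault",
    "Profile generated from behavioral data",  # 5 words: too long
]
print(overlong_labels(labels))
```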
6. Write the prompt
Structure the prompt as:
- Style declaration (1 sentence) — establish the overall look and purpose
- Layout overview (1-2 sentences) — spatial arrangement and reading flow
- Zone-by-zone description — for each zone: background color, characters present, what they're doing, key visual details. Embed labels naturally: "a soft blue zone labeled 'Maintenance' in the top corner"
- Special elements — anything floating, detached, or breaking the main structure
- Text and label placement: explicitly list all text labels (English by default) and where they appear. Be specific about placement (above, below, inside) so the model doesn't scatter them randomly
- Style details block — bullet list of specific requirements (palette, line weight, character style, mood, what NOT to include)
- Technical specs — aspect ratio, quality
Keep the prompt to 200-400 words. Image models perform worse with extremely long prompts, so be specific about what matters and brief about the rest.
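The structure above can be assembled mechanically once each part is written. The following is a minimal sketch, assuming each section is already drafted as plain text. The `PromptParts` class, `build_prompt`, and `word_count_ok` are hypothetical names, and the word count is a rough whitespace split (bullet dashes count as words), not a tokenizer.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical holder for the seven sections listed above."""
    style: str                 # style declaration (1 sentence)
    layout: str                # layout overview (1-2 sentences)
    zones: list[str]           # one description per zone
    special: str               # floating or detached elements; may be empty
    labels: str                # text labels and their placement
    style_details: list[str]   # bullet requirements
    technical: str             # aspect ratio, quality


def build_prompt(parts: PromptParts) -> str:
    """Join the sections in the order given above, separated by blank lines."""
    sections = [parts.style, parts.layout, *parts.zones]
    if parts.special:
        sections.append(parts.special)
    sections.append(parts.labels)
    sections.append("Style details:\n" + "\n".join(f"- {d}" for d in parts.style_details))
    sections.append(parts.technical)
    return "\n\n".join(sections)


def word_count_ok(prompt: str, low: int = 200, high: int = 400) -> bool:
    """Check the 200-400 word guideline using a whitespace word count."""
    return low <= len(prompt.split()) <= high
```

A draft that fails `word_count_ok` is a cue to trim zone descriptions first, since the style and technical sections are short by design.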
Things to avoid
- Labels that are too long. Keep each label to 2-4 words. Longer text is harder for models to render cleanly. If a concept needs more explanation, that's what the article text is for — the image just needs a short label.
- Generic characters. Two "cute robots" in the same illustration is a design failure. Every character needs its own metaphor.
- "AI" badges or labels on characters. If the context already makes clear these are AI components, badges add nothing and look tacky.
- Formal diagram conventions. No UML, no swimlanes, no database cylinders. This is an illustration, not a spec sheet.
- Overloaded compositions. More than 6 distinct elements in one image = suggest splitting into multiple illustrations.
- Decorative gradients or 3D effects. Stay flat and clean. Depth comes from overlapping elements and subtle shadows, not from glossy renders.