generate-design by sivi/skills

What This Skill Does

Calls the Sivi Core API to generate design assets such as ads, banner, social posts, thumbnails, and more from a natural language prompt, then fetches and displays the generated variants with preview images and edit links.

⚠️ Cross-Platform Compatibility — MANDATORY

This skill runs on macOS, Linux, or Windows (Git Bash / WSL). You MUST follow these rules in ALL bash scripts to ensure they work on every platform:

NEVER use head -n -1 — macOS BSD head does not support negative line counts. This WILL error with head: illegal line count -- -1.
NEVER use tail -n +2 with pipes to strip headers unless you are certain of the format.
To separate HTTP status code from response body, ALWAYS use curl's -o flag to write the body to a file and -w '%{http_code}' to capture the status code separately. This is the ONLY safe cross-platform approach.
Use python3 for JSON parsing instead of jq (which may not be installed). Available on macOS, Linux, and Windows (Git Bash / WSL).
Temp file paths: Use /tmp/ on macOS/Linux. On Windows Git Bash, /tmp/ maps to a valid temp directory automatically.

Correct pattern (MUST use this):

HTTP_CODE=$(curl -s -o /tmp/sivi_response.json -w '%{http_code}' \
  -X POST "https://example.com/api" \
  -H "Content-Type: application/json" \
  -H "sivi-api-key: $SIVI_API_KEY" \
  -d "$PAYLOAD")
BODY=$(cat /tmp/sivi_response.json)

WRONG patterns (NEVER use these):

# ❌ BROKEN on macOS:
BODY=$(echo "$RESPONSE" | head -n -1)
# ❌ BROKEN on macOS:
RESPONSE=$(curl -s -w '\n%{http_code}' ...); BODY=$(echo "$RESPONSE" | head -n -1)

Steps

Workflow note: This skill involves sequential bash script calls and file-reading steps. Proceed through steps without unnecessary confirmation prompts between them, but always respect the platform's command approval settings. Only auto-run commands if the user has explicitly enabled auto-run permissions in their IDE/agent settings.

Parse arguments from $ARGUMENTS:
- prompt — the design brief (required). You may enhance or rephrase the user's prompt (e.g., make it more descriptive for better results), but you MUST preserve every piece of information the user provided — headlines, descriptions, button text, brand names, colors, URLs, dimensions, and any other details. Never drop, summarize away, or omit anything the user explicitly stated.
- type — primary design category — default: instagram
- subtype — format variant — default: instagram-post
- dimension — {width, height} in pixels — only required when type is custom; both values must be between 200 and 2000. If the user provides values outside this range, inform them and ask them to correct it before proceeding.
- assets — object with images/logos (optional):
  - images — array of objects: { "url": "...", "imagePreference": { "crop": true, "removeBg": false } }
  - logos — array of objects: { "url": "...", "logoStyles": ["direct", "outline"] }
- numOfVariants — number of variants (1–4) — default: 2
- outputFormat — array of formats (allowed values: jpg, png),
- language — language for text elements — default: english (lower case)
- settings — object with design preferences (optional):
  - mode — design preference mode (allowed values: auto, brand, custom). Default: auto. Set to custom only when the user provides at least one setting below (colors, theme, frameStyle, backdropStyle, focus, imageStyle, or fontGroups). In custom mode, the API uses the preferences below. In auto mode, Sivi picks preferences automatically. In brand mode, preferences come from the brand persona (requires currentbId).
  - colors — array of hex codes (e.g. ["#FF0000", "#FFFFFF"]) — only used in custom mode
  - theme — array of theme preferences (allowed values: light, dark, or colorful) — only used in custom mode
  - frameStyle — array of frame style preferences (allowed values: plain, box, bar-accent) — only used in custom mode
  - backdropStyle — array of backdrop style preferences (allowed values: minimalist, imagery, artistic) — only used in custom mode
  - focus — array of focus preferences (allowed values: text, image, neutral) — only used in custom mode
  - imageStyle — array of image style preferences (allowed values: cover, cover-with-linear-gradient, cover-with-overlay, container, section, section-with-container, mask, cutout, cutout-with-vectors, content-free-form) — only used in custom mode
  - fontGroups — array of font objects: { "id": "...", "name": "...", "type": "heading|subHeading|body", "status": "enabled", "addedBy": "system" }
If prompt is missing, ask the user: "What should the design be about? Please describe it briefly."
Handle assets — Collect assets from TWO sources: user-uploaded file attachments AND image URLs found in the prompt text.

Source 1: Uploaded/attached files — If the user has attached or uploaded any files in the chat. Source 2: URLs in the prompt — Scan the prompt text for any image URLs (e.g., https://example.com/photo.jpg, https://cdn.site.com/logo.png). Extract these URLs from the prompt before sending it to the API. Remove the URLs from the prompt text so only the descriptive text remains.

For each asset (from either source), analyse and classify it:
- Logo: If the asset is a logo, icon, brand mark, or monogram — add to assets.logos[] as { "url": "<url>", "logoStyles": ["direct", "outline"] }
- Image: If the asset is a photo, illustration, product shot, screenshot, or any non-logo visual — add to assets.images[] as { "url": "<url>", "imagePreference": { "crop": true, "removeBg": false } }
Classification rules:
- Look at the file/URL content: logos are typically vector-like, transparent background, simple shapes, or contain brand text. Photos/illustrations are richer, more complex imagery.
- If the URL or filename contains words like "logo", "icon", "brand", "mark" — classify as logo.
- If unsure, default to image (not logo).
- Maximum 5 assets total (images + logos combined). If the user provides more than 5, use the first 5 and inform them of the limit.
⚠️ IMPORTANT — Asset URLs must be publicly accessible: The Sivi API only accepts publicly accessible image URLs (e.g., https://example.com/logo.png). If the user uploads or attaches a local file in the chat, do NOT attempt to use the local file path. Instead, ask the user to provide a publicly hosted URL for the image. Example response:

"I can see you've uploaded an image, but the Sivi API requires a publicly accessible URL (like a link from your website, CDN, or image hosting service). Could you share the public URL for this image?"

Step A — Submit the design (Bash tool call #1). Never use WebFetch (it cannot send custom headers and will always return 401).

Use this EXACT script template (fill in the JSON payload fields from parsed arguments):

#!/bin/bash
set -e
# ✅ Replace <SKILL_DIR> with the actual absolute path to this skill's directory
source <SKILL_DIR>/.env

# Build payload using heredoc (safe for special characters in prompt)
PAYLOAD=$(cat <<'ENDJSON'
{
  "prompt": "<PROMPT_TEXT>",
  "type": "<TYPE>",
  "subtype": "<SUBTYPE>",
  "dimension": { "width": <WIDTH_OR_null>, "height": <HEIGHT_OR_null> },
  "numOfVariants": <NUM>,
  "outputFormat": ["jpg"],
  "language": "<LANGUAGE>",
  "assets": {
    "images": <IMAGES_ARRAY_OR_[]>,
    "logos": <LOGOS_ARRAY_OR_[]>
  },
  "settings": {
    "mode": "<auto_OR_custom>",
    "colors": <COLORS_ARRAY_OR_[]>,
    "theme": <THEME_ARRAY_OR_[]>,
    "frameStyle": <FRAMESTYLE_ARRAY_OR_[]>,
    "backdropStyle": <BACKDROPSTYLE_ARRAY_OR_[]>,
    "focus": <FOCUS_ARRAY_OR_[]>,
    "imageStyle": <IMAGESTYLE_ARRAY_OR_[]>,
    "fontGroups": <FONTGROUPS_ARRAY_OR_[]>
  }
}
ENDJSON
)

# ✅ Cross-platform safe: use -o to write body to file, -w to capture status code
HTTP_CODE=$(curl -s -o /tmp/sivi_submit_response.json -w '%{http_code}' \
  -X POST "https://connect.sivi.ai/api/prod/v2/general/designs-from-prompt" \
  -H "Content-Type: application/json" \
  -H "sivi-api-key: $SIVI_API_KEY" \
  -d "$PAYLOAD")

BODY=$(cat /tmp/sivi_submit_response.json)

if [ "$HTTP_CODE" != "200" ]; then
  echo "ERROR: HTTP $HTTP_CODE"
  echo "$BODY"
  exit 1
fi

# Parse designId and requestId using python3
DESIGN_ID=$(python3 -c "import json,sys; d=json.load(sys.stdin); print(d['body']['designId'])" <<< "$BODY")
REQUEST_ID=$(python3 -c "import json,sys; d=json.load(sys.stdin); print(d['body']['requestId'])" <<< "$BODY")

echo "DESIGN_ID=$DESIGN_ID"
echo "REQUEST_ID=$REQUEST_ID"

Important: When the prompt contains single quotes, double quotes, or other special characters, use the heredoc with <<'ENDJSON' (quoted delimiter prevents variable expansion inside). If the prompt itself must be injected dynamically, use python3 to build the JSON:

PROMPT_TEXT="user's prompt here"
PAYLOAD=$(python3 -c "
import json
print(json.dumps({
  'prompt': '''$PROMPT_TEXT''',
  # ... rest of fields
}))
")

Actually, the safest approach for dynamic prompts is:

PAYLOAD=$(python3 << 'PYEOF'
import json, sys
data = {
    "prompt": """PROMPT_PLACEHOLDER""",
    "type": "instagram",
    "subtype": "instagram-post",
    "dimension": {"width": None, "height": None},
    "numOfVariants": 2,
    "outputFormat": ["jpg"],
    "language": "english",
    "assets": {"images": [], "logos": []},
    "settings": {
        "mode": "auto",
        "colors": [],
        "theme": [],
        "frameStyle": [],
        "backdropStyle": [],
        "focus": [],
        "imageStyle": [],
        "fontGroups": []
    }
}
print(json.dumps(data))
PYEOF
)

Replace PROMPT_PLACEHOLDER with the actual prompt text. Python's json.dumps handles all escaping correctly.

This call returns in ~2 seconds. After it returns, immediately tell the user that the design is being generated and show the designId and requestId before proceeding to the next step.

Step B — Poll and download variants (Bash tool call #2).

Use this EXACT script template:

#!/bin/bash
set -e
# ✅ Replace <SKILL_DIR> with the actual absolute path to this skill's directory
source <SKILL_DIR>/.env

# ✅ Replace <PROMPT_SLUG> with a file-safe 1-2 word slugified 
# version of the user's prompt (e.g., "coffee-ad")
PREFIX="<PROMPT_SLUG>"
# Ensure PREFIX has no spaces (replace with hyphens)
PREFIX=$(echo "$PREFIX" | tr ' ' '-')

DESIGN_ID="<DESIGN_ID_FROM_STEP_A>"
MAX_ATTEMPTS=20
ATTEMPT=0

while [ $ATTEMPT -lt $MAX_ATTEMPTS ]; do
  ATTEMPT=$((ATTEMPT + 1))
  echo "Poll attempt $ATTEMPT/$MAX_ATTEMPTS..."

  # ✅ Cross-platform safe: use -o to write body to file, -w to capture status code
  HTTP_CODE=$(curl -s -o /tmp/sivi_poll_response.json -w '%{http_code}' \
    -X POST "https://connect.sivi.ai/api/prod/v2/general/get-design-variants" \
    -H "Content-Type: application/json" \
    -H "sivi-api-key: $SIVI_API_KEY" \
    -d "{\"designId\":\"$DESIGN_ID\"}")

  BODY=$(cat /tmp/sivi_poll_response.json)

  if [ "$HTTP_CODE" != "200" ]; then
    echo "ERROR: HTTP $HTTP_CODE"
    echo "$BODY"
    exit 1
  fi

  # Check if variations array is non-empty using python3
  VARIANT_COUNT=$(python3 -c "
import json, sys
d = json.load(sys.stdin)
variations = d.get('body', {}).get('variations', [])
print(len(variations))
" <<< "$BODY")

  if [ "$VARIANT_COUNT" -gt 0 ]; then
    echo "Found $VARIANT_COUNT variants!"

    # Extract and download each variant using python3
    python3 -c "
import json, sys
d = json.load(sys.stdin)
variations = d.get('body', {}).get('variations', [])
for i, v in enumerate(variations, 1):
    print(f\"VARIANT_{i}_URL={v.get('variantImageUrl', '')}\")
    print(f\"VARIANT_{i}_EDIT={v.get('variantEditLink', '')}\")
    print(f\"VARIANT_{i}_ID={v.get('variantId', '')}\")
    print(f\"VARIANT_{i}_WIDTH={v.get('variantWidth', '')}\")
    print(f\"VARIANT_{i}_HEIGHT={v.get('variantHeight', '')}\")
    print(f\"VARIANT_{i}_TYPE={v.get('variantType', '')}\")
" <<< "$BODY" > /tmp/sivi_variants_info.txt

    cat /tmp/sivi_variants_info.txt

    # Download each variant image
    OUTPUT_DIR="<SKILL_DIR>/generated-designs"
    mkdir -p "$OUTPUT_DIR"
    IDX=1
    while [ $IDX -le $VARIANT_COUNT ]; do
      IMG_URL=$(grep "VARIANT_${IDX}_URL=" /tmp/sivi_variants_info.txt | cut -d'=' -f2-)
      if [ -n "$IMG_URL" ]; then
        # Validate URL starts with https:// before using it
        if [[ "$IMG_URL" != https://* ]]; then
          echo "SKIP: VARIANT_${IDX} URL is not https, skipping for security."
          IDX=$((IDX + 1))
          continue
        fi
        # Extract unique ID from the end of the URL (filename without extension)
        # Sanitize URL_ID to only allow alphanumeric, hyphens, underscores
        URL_ID=$(basename "$IMG_URL" | sed 's/\.[^.]*$//' | tr -cd '[:alnum:]-_')
        FILE_NAME="${PREFIX}_${URL_ID}_v${IDX}.jpg"
        
        curl -sL -o "$OUTPUT_DIR/${FILE_NAME}" "$IMG_URL"
        echo "VARIANT_${IDX}_IMG=$OUTPUT_DIR/${FILE_NAME}"
      fi
      IDX=$((IDX + 1))
    done

    exit 0
  fi

  sleep 15
done

echo "TIMEOUT"
exit 1

Handle errors from either script's output:
- On 401: Tell the user their SIVI_API_KEY is missing or invalid
- On 402: Tell the user they have insufficient Sivi credits
- On 422: Tell the user which input parameter is invalid and ask them to correct it
- On 500: Tell the user the Sivi server errored and suggest retrying
Display results — after Step B completes, parse its structured output. For EACH variant, display these items in this exact order. You MUST display the results immediately upon completion. Do NOT ask the user for permission.

A) Read the image file: Use your file-reading tool (e.g., view_file in Antigravity or Read in Claude Code) on the downloaded .jpg file. This allows you to visually analyze the generated design.

B) Render the image inline using markdown with the EXPANDED ABSOLUTE file path:
```
![Variant 1](</Users/.../generate-design/generated-designs/<PREFIX>_<URL_ID>_v1.jpg>)
```
CRITICAL MUST-DO: You MUST use the exact absolute file path output by the script in VARIANT_N_IMG.
- DO NOT output the literal text [Image] or [Local Image] instead of the actual image markdown.
- You MUST write the complete markdown syntax ![Variant N](/path/to/file.jpg).
- If the absolute path contains spaces, you MUST enclose the path in angle brackets: ![Variant N](</path/with spaces/file.jpg>) or URL-encode the spaces to %20. Do not literally type <SKILL_DIR>.
C) Print the design size: Design size: <variantWidth> x <variantHeight>

D) Print the preview and edit links: Combine the URLs into a single line like this: [Preview this design](<variantImageUrl>) | [Edit this design](<variantEditLink>)

Required output — follow this EXACTLY:
```
**Variant 1**

>>> Read /Users/.../<PREFIX>_<URL_ID>_v1.jpg (tool call) <<<

![Variant 1](/Users/.../<PREFIX>_<URL_ID>_v1.jpg)

Design size: <variantWidth> x <variantHeight>

[Preview this design](<variantImageUrl>) | [Edit this design](<variantEditLink>)

---

**Variant 2**

>>> Read /Users/.../<PREFIX>_<URL_ID>_v2.jpg (tool call) <<<

![Variant 2](/Users/.../<PREFIX>_<URL_ID>_v2.jpg)

Design size: <variantWidth> x <variantHeight>

[Preview this design](<variantImageUrl>) | [Edit this design](<variantEditLink>)

---

**Summary:** (Write 1-2 sentences overall summarizing the generated designs based on what you saw)
```
⚠️ CRITICAL RULES:
- Display results IMMEDIATELY. Do not pause to ask if the user wants to see them.
- Use ![Variant N](</absolute/path/...jpg>) with the RESOLVED ABSOLUTE file path. If you emit the string <SKILL_DIR>, the image breaks!
- NEVER output just the text [Image] or [Local Image] — always write the full markdown image format ![]().
- NEVER put the remote variantImageUrl inside ![]() — it will NOT render.
- You MUST read the image files with your tool so you can summarize them, AND you MUST output the markdown tag so the user can see them! Both steps are required.

Available Design Types & Subtypes

Type	Subtypes
`instagram`	`instagram-post`, `instagram-post-small`, `instagram-ad`, `instagram-story`
`facebook`	`facebook-post`, `facebook-ad`, `facebook-cover`
`twitter`	`twitter-post`, `twitter-ad`, `twitter-cover`
`linkedin`	`linkedIn-post`, `linkedIn-ad`, `linkedIn-cover`, `linkedIn-banner`
`pinterest`	`pinterest-pin-small`, `pinterest-std-pin`
`whatsapp`	`whatsapp-post`, `whatsapp-wide-post`, `whatsapp-status`, `whatsapp-business-cover`
`youtube`	`youtube-thumbnail`
`displayAds`	`displayAds-half-page-ad`, `displayAds-large-rectangle`, `displayAds-inline-rectangle`, `displayAds-square`, `displayAds-small-square`, `displayAds-skyscraper`, `displayAds-fat-skyscraper`, `displayAds-wide-skyscraper`, `displayAds-small-skyscraper`, `displayAds-leaderboard`, `displayAds-mobile-leaderboard`, `displayAds-large-leaderboard`, `displayAds-banner`
`amazon`	`amazon-ad`, `amazon-one-third`, `amazon-fullscreen`, `amazon-large-square`, `amazon-rectangle`, `amazon-standard`, `amazon-square`, `amazon-small-square`, `amazon-fullscreen-HD`
`website`	`website-large-rectangle`, `website-medium-rectangle`, `website-rectangle`, `website-wide-rectangle`, `website-tall-rectangle`, `website-square`, `website-small-square`, `website-large-square`, `website-fullscreen-HD`, `website-half-page`, `website-one-third`, `website-standard`, `website-hello-bar`
`email`	`square`, `tall`, `rectangle`, `wide`, `small`, `small-square`
`custom`	`custom` (requires `dimension: {width, height}`)

Security

API key: $SIVI_API_KEY is loaded from a local .env file at runtime. It is never hardcoded in scripts or committed to version control. The key is only sent to the Sivi API endpoint (connect.sivi.ai) — never to any other host.
Outbound requests: Scripts only make HTTPS requests to connect.sivi.ai. No other outbound endpoints are contacted for API calls.
Download validation: Variant images are downloaded only from URLs returned by the Sivi API. The download script validates that each URL starts with https:// before fetching. Downloads are written to a local generated-designs/ directory within the skill folder.
Asset URLs: Only URLs that the user explicitly provides as image or logo assets are included in the API payload. URLs are sent to the Sivi API solely for design generation purposes.
Temp files: Intermediate API responses are written to /tmp/ and are not persisted beyond the script execution.

Notes

Never call the API without a prompt. Always collect it first.
Prompt fidelity: Enhancing or rephrasing the user's prompt is acceptable, but all user-provided details (headlines, descriptions, button text, brand names, specific wording, etc.) must appear in the prompt sent to the API. Missing information is a bug.
Default numOfVariants is 2. Never exceed 4.
Never hardcode the API key — always use $SIVI_API_KEY (sourced from .env if needed).
If the user doesn't specify type/subtype, auto-pick one type/subtype that best matches the prompt.
Always send all fields in the request body. Use [] for unprovided array fields, null for dimension values when not "custom".
settings.mode must always be set. Default to "auto". Set to "custom" only when the user provides at least one style preference (colors, theme, frameStyle, backdropStyle, focus, imageStyle, or fontGroups). Use "brand" if the user provides a brand ID.
outputFormat is an array (e.g. ["jpg"]), not a string.
assets.logos items must be objects: { "url": "...", "logoStyles": ["direct", "outline"] } — never plain URL strings.
assets.images items must be objects: { "url": "...", "imagePreference": { "crop": true, "removeBg": false } } — never plain URL strings.
settings.fontGroups is an array of font objects with id, name, type, status, addedBy — not a flat array of strings.
Set dimension.width and dimension.height only when type is "custom"; otherwise pass null for both. Both values must be between 200 and 2000 (inclusive). If out of range, do not call the API — ask the user to correct the values first.
⚠️ To display images inline, use markdown with the LOCAL file path: ![Variant N](/absolute/path/to/..._vN.jpg) (use the exact path from VARIANT_N_IMG). NEVER use the remote variantImageUrl in markdown text — it will not render. The remote URL is only used by the bash script to download the file. You must ALSO read the local file using your file-reading tool to see the design and write a summary.
Always display design size as Design size: <variantWidth> x <variantHeight> below each variant image.
Variants response is nested: response.body.variations[] with variantImageUrl, variantEditLink, variantId, variantWidth, variantHeight, variantType per item.
The linkedIn type with subtype linkedIn-post is not supported by the API — use instagram / instagram-post as the default for professional/social content unless the user specifies otherwise.
⚠️ NEVER use head -n -1 anywhere. It does not work on macOS. Always use curl -s -o <file> -w '%{http_code}' to separate body from status code.
⚠️ NEVER use jq — it may not be installed. Use python3 for all JSON parsing.