skills/cloudai-x/world-labs-skills/world-labs-image-prompt

world-labs-image-prompt

SKILL.md

World Labs Single Image Input

Generate 3D worlds from a single reference image. The model extrapolates the scene to create an immersive 360° environment.

Quick Reference

Requirement Specification
Recommended resolution 1024px on long side
Maximum file size 20 MB
Formats PNG (recommended), JPG, WebP
Aspect ratio 16:9, 9:16, or anything in between

Credits

Input Type Marble 0.1-plus Marble 0.1-mini
Standard image 1,580 230
Panorama image 1,500 150

Panoramas skip the pano generation step, saving 80 credits.

Best Practices

DO

Clear spatial definition: Images with obvious depth and perspective ✅ Wide shots: Show foreground, midground, and background ✅ Visible ground/floor: Helps establish world orientation ✅ Consistent lighting: Clear, well-lit scenes ✅ Environmental scenes: Landscapes, interiors, architectural spaces

DON'T

Close-up shots: Lack spatial context for 3D reconstruction ❌ People or animals: As main subjects (may cause artifacts) ❌ Abstract images: Non-representational art ❌ Borders or frames: Decorative edges, watermarks ❌ Heavy text overlays: Text doesn't render clearly ❌ Flat graphics: 2D illustrations without depth ❌ Blurry or low-contrast images: Reduces detail inference

Ideal Image Types

Excellent Results

  1. Landscape photography - Wide vistas with clear horizon and natural depth
  2. Architectural interiors - Rooms with visible floor/walls/ceiling
  3. Street scenes - Urban environments with buildings receding into distance
  4. Natural environments - Forests, caves, beaches with organic depth

Challenging (Use with Caution)

  • Indoor scenes with complex reflections
  • Very dark or overexposed images
  • Images with motion blur
  • Heavily edited/filtered photos

API Usage

Option 1: From Uploaded Media Asset

First upload your image (see world-labs-api skill), then:

{
  "model": "Marble 0.1-plus",
  "world_prompt": {
    "type": "image",
    "image_prompt": {
      "source": "media_asset",
      "media_asset_id": "550e8400-e29b-41d4-a716-446655440000"
    },
    "text_prompt": "Optional description to guide interpretation"
  }
}

Option 2: From Public URL

{
  "model": "Marble 0.1-plus",
  "world_prompt": {
    "type": "image",
    "image_prompt": {
      "source": "uri",
      "uri": "https://example.com/my-image.jpg"
    },
    "text_prompt": "A beautiful mountain landscape"
  }
}

Panorama Images (use is_pano flag)

For 360° equirectangular panoramas (2:1 aspect ratio):

{
  "model": "Marble 0.1-plus",
  "world_prompt": {
    "type": "image",
    "image_prompt": {
      "source": "media_asset",
      "media_asset_id": "550e8400-e29b-41d4-a716-446655440000",
      "is_pano": true
    }
  }
}

Upload Workflow

Step 1: Prepare Upload

curl -X POST "https://api.worldlabs.ai/marble/v1/media-assets:prepare_upload" \
  -H "WLT-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_name": "landscape.jpg",
    "kind": "image",
    "extension": "jpg"
  }'

Response:

{
  "media_asset": {
    "media_asset_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  "upload_info": {
    "upload_url": "https://storage.googleapis.com/...",
    "upload_method": "PUT",
    "required_headers": {
      "x-goog-content-length-range": "0,1048576000"
    }
  }
}

Step 2: Upload Image

curl -X PUT "UPLOAD_URL" \
  -H "Content-Type: image/jpeg" \
  -H "x-goog-content-length-range: 0,1048576000" \
  --data-binary @landscape.jpg

Step 3: Generate World

curl -X POST "https://api.worldlabs.ai/marble/v1/worlds:generate" \
  -H "WLT-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Marble 0.1-plus",
    "world_prompt": {
      "type": "image",
      "image_prompt": {
        "source": "media_asset",
        "media_asset_id": "550e8400-e29b-41d4-a716-446655440000"
      },
      "text_prompt": "A dramatic mountain landscape at sunset"
    }
  }'

Python Example

import requests

def upload_image_and_generate(image_path: str, api_key: str, prompt: str = None, is_pano: bool = False):
    base_url = "https://api.worldlabs.ai/marble/v1"
    headers = {"WLT-Api-Key": api_key, "Content-Type": "application/json"}

    # Get extension
    ext = image_path.lower().split('.')[-1]
    if ext == "jpeg":
        ext = "jpg"

    # Step 1: Prepare upload
    prep_response = requests.post(
        f"{base_url}/media-assets:prepare_upload",
        headers=headers,
        json={"file_name": image_path.split('/')[-1], "kind": "image", "extension": ext}
    )
    prep_data = prep_response.json()
    media_asset_id = prep_data["media_asset"]["media_asset_id"]
    upload_url = prep_data["upload_info"]["upload_url"]

    # Step 2: Upload image
    content_types = {"jpg": "image/jpeg", "png": "image/png", "webp": "image/webp"}
    with open(image_path, 'rb') as f:
        requests.put(
            upload_url,
            headers={"Content-Type": content_types.get(ext, "image/jpeg")},
            data=f.read()
        )

    # Step 3: Generate world
    image_prompt = {"source": "media_asset", "media_asset_id": media_asset_id}
    if is_pano:
        image_prompt["is_pano"] = True

    world_prompt = {"type": "image", "image_prompt": image_prompt}
    if prompt:
        world_prompt["text_prompt"] = prompt

    gen_response = requests.post(
        f"{base_url}/worlds:generate",
        headers=headers,
        json={"model": "Marble 0.1-plus", "world_prompt": world_prompt}
    )

    return gen_response.json()["operation_id"]

# Usage
operation_id = upload_image_and_generate(
    "mountain_vista.jpg",
    "your_api_key",
    "Add dramatic storm clouds and golden sunset light"
)

Combining Text with Images

Text prompts guide interpretation and can transform the scene:

Text Prompt Effect
None (omitted) Auto-caption generated, faithful recreation
"At sunset" Changes lighting/atmosphere
"In winter with snow" Adds seasonal elements
"Abandoned and overgrown" Adds decay/nature reclaim
"Futuristic version" Sci-fi transformation

When text is omitted, the model auto-generates a caption from the image.

Troubleshooting

Issue Solution
Distorted geometry Use image with clearer depth cues
Missing areas Provide image with more context/edges
Wrong scale Include recognizable objects for reference
Artifacts on faces Avoid images with people as main subject
Billboard warping Objects far from center may warp; center important elements

Related Skills

  • world-labs-api - API integration details
  • world-labs-text-prompt - Text prompting best practices
  • world-labs-multi-image - Using multiple images with direction control
  • world-labs-pano-video - Panorama and video input
Weekly Installs
1
Installed on
windsurf1
amp1
trae1
opencode1
cursor1
droid1