Replicate CLI

The Replicate CLI is a command-line tool for interacting with Replicate's AI model platform. It enables running predictions, managing models, creating deployments, and fine-tuning models directly from the terminal.

Authentication

Before using the Replicate CLI, set the API token:

export REPLICATE_API_TOKEN=<token-from-replicate.com/account>

Alternatively, authenticate interactively:

replicate auth login

Verify authentication:

replicate account current

Core Commands

Running Predictions

The primary use case is running predictions against hosted models.

Basic prediction:

replicate run <owner/model> input_key=value

Examples:

Image generation:

replicate run stability-ai/sdxl prompt="a studio photo of a rainbow colored corgi"

Text generation with streaming:

replicate run meta/llama-2-70b-chat --stream prompt="Tell me a joke"

Prediction flags:

  • --stream - Stream output tokens in real-time (for text models)
  • --no-wait - Submit prediction without waiting for completion
  • --web - Open prediction in browser
  • --json - Output result as JSON
  • --save - Save outputs to local directory
  • --output-directory <dir> - Specify output directory (default: ./{prediction-id})

Seedance / Seedream Notes (Video Loops)

  • Seedream is image-only on Replicate (bytedance/seedream-3). It does not generate video.
  • Video models:
    • bytedance/seedance-1-lite → good for short loops; use duration=3, fps=24, camera_fixed=true.
    • bytedance/seedance-1.5-pro → a 3s duration fails; use duration=5 minimum. Output is a URL.
  • Looping basics: set last_frame_image to the same image as image, and use camera_fixed=true to reduce camera motion.
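The looping notes above can be sketched as a small shell helper. The function name seedance_loop is hypothetical, and the prompt input is an assumption; the other input names are taken from the notes. Running replicate model schema bytedance/seedance-1-lite first will confirm which inputs the model actually accepts.

```shell
# Hypothetical helper: render a short loop from a single still image.
# $1 = path to the still image, $2 = text prompt (assumed to be an accepted input).
# The same file is passed as both the first and the last frame so the clip
# starts and ends on the same picture.
seedance_loop() {
  replicate run bytedance/seedance-1-lite \
    prompt="$2" \
    image=@"$1" \
    last_frame_image=@"$1" \
    duration=3 \
    fps=24 \
    camera_fixed=true
}
```

A call might look like seedance_loop frame.png "gentle waves, seamless loop".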

CLI output quirks & fixes

  • replicate run ... --json can return [] even when the prediction succeeds.
    • Fix: list recent predictions, then read the output URL from the matching prediction:
      replicate prediction list --json
      replicate prediction show <id> --json
  • Seedance-1-lite often returns a data URI (data:video/mp4;base64,...) → base64‑decode to MP4.
  • Seedance-1.5-pro returns a direct URL in output → download with curl -L.
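Because the two Seedance models return their output in different shapes, one way to handle both is to branch on the value's prefix. This is a minimal sketch, not part of the CLI: save_output is a hypothetical name, and it assumes the output value has already been extracted into a shell variable.

```shell
# Hypothetical helper: write one prediction output value to a local file.
# $1 = output value (a base64 data URI or an http(s) URL), $2 = destination path.
save_output() {
  case "$1" in
    data:*";base64,"*)
      # Drop everything up to the first comma, then decode the payload.
      printf '%s' "${1#*,}" | base64 --decode > "$2"
      ;;
    http://* | https://*)
      # -L follows redirects; -f turns HTTP errors into a non-zero exit status.
      curl -fsSL -o "$2" "$1"
      ;;
    *)
      echo "save_output: unrecognized output value" >&2
      return 1
      ;;
  esac
}
```

For example, save_output "$value" loop.mp4 would decode a data URI from seedance-1-lite or download the URL from seedance-1.5-pro.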

Input Handling

File uploads: Prefix local file paths with @:

replicate run nightmareai/real-esrgan image=@photo.jpg

Output chaining: Use {{.output}} template syntax to chain predictions:

replicate run stability-ai/sdxl prompt="a corgi" | \
replicate run nightmareai/real-esrgan image={{.output[0]}}

Model Operations

View model schema (see required inputs and outputs):

replicate model schema <owner/model>
replicate model schema stability-ai/sdxl --json

List models:

replicate model list
replicate model list --json

Show model details:

replicate model show <owner/model>

Create a new model:

replicate model create <owner/name> \
  --hardware gpu-a100-large \
  --private \
  --description "Model description"

Model creation flags:

  • --hardware <sku> - Hardware SKU (see references/hardware.md)
  • --private / --public - Visibility setting
  • --description <text> - Model description
  • --github-url <url> - Link to source repository
  • --license-url <url> - License information
  • --cover-image-url <url> - Cover image for model page

Training (Fine-tuning)

Fine-tune models using the training command:

replicate train <base-model> \
  --destination <owner/new-model> \
  input_key=value

Example - Fine-tune SDXL with DreamBooth:

replicate train stability-ai/sdxl \
  --destination myuser/custom-sdxl \
  --web \
  input_images=@training-images.zip \
  use_face_detection_instead=true

List trainings:

replicate training list

Show training details:

replicate training show <training-id>

Deployments

Deployments provide dedicated, always-on inference endpoints with predictable performance.

Create deployment:

replicate deployments create <name> \
  --model <owner/model> \
  --hardware <sku> \
  --min-instances 1 \
  --max-instances 3

Example:

replicate deployments create text-to-image \
  --model stability-ai/sdxl \
  --hardware gpu-a100-large \
  --min-instances 1 \
  --max-instances 5

Update deployment:

replicate deployments update <name> \
  --max-instances 10 \
  --version <version-id>

List deployments:

replicate deployments list

Show deployment details and schema:

replicate deployments show <name>
replicate deployments schema <name>

Hardware

List available hardware options:

replicate hardware list

See references/hardware.md for detailed hardware information and selection guidelines.

Scaffolding

Create a local development environment from an existing prediction:

replicate scaffold <prediction-id-or-url> --template=<node|python>

This generates a project with the prediction's model and inputs pre-configured.

Command Aliases

For convenience, these aliases are available:

Alias              Equivalent Command
replicate run      replicate prediction create
replicate stream   replicate prediction create --stream
replicate train    replicate training create

Short aliases for subcommands:

  • replicate m = replicate model
  • replicate p = replicate prediction
  • replicate t = replicate training
  • replicate d = replicate deployments
  • replicate hw = replicate hardware
  • replicate a = replicate account

Common Workflows

Image Generation Pipeline

Generate an image and upscale it:

replicate run stability-ai/sdxl \
  prompt="professional photo of a sunset" \
  negative_prompt="blurry, low quality" | \
replicate run nightmareai/real-esrgan \
  image={{.output[0]}} \
  --save

Check Model Inputs Before Running

Always check the model schema to understand required inputs:

replicate model schema owner/model-name

Batch Processing

Run predictions and save outputs:

for prompt in "cat" "dog" "bird"; do
  replicate run stability-ai/sdxl prompt="$prompt" --save --output-directory "./outputs/$prompt"
done

Monitor Long-Running Tasks

Submit without waiting, then check status:

# Submit
replicate run owner/model input=value --no-wait --json > prediction.json

# Check status later
replicate prediction show "$(jq -r '.id' prediction.json)"
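Checking status by hand can be wrapped in a polling loop. The sketch below is an illustration rather than a built-in feature: wait_for_prediction is a hypothetical name, and it assumes the --json output carries a top-level status field with the same terminal values the Replicate HTTP API uses (succeeded, failed, canceled).

```shell
# Hypothetical helper: poll a prediction until it reaches a terminal state.
# $1 = prediction id, $2 = seconds between polls (default 5),
# $3 = maximum number of polls (default 120).
# Returns 0 on success, 1 on failure or cancellation, 2 if it gives up.
wait_for_prediction() {
  id=$1
  interval=${2:-5}
  tries=${3:-120}
  while [ "$tries" -gt 0 ]; do
    # Assumption: the JSON object has a top-level "status" field.
    state=$(replicate prediction show "$id" --json | jq -r '.status')
    case "$state" in
      succeeded) return 0 ;;
      failed | canceled)
        echo "prediction $id ended with status: $state" >&2
        return 1
        ;;
    esac
    tries=$((tries - 1))
    sleep "$interval"
  done
  echo "gave up waiting for prediction $id" >&2
  return 2
}
```

With the file from the submit step, wait_for_prediction "$(jq -r '.id' prediction.json)" blocks until the prediction finishes or the poll limit is reached. The cap on polls keeps a stuck prediction from hanging a script indefinitely.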

Best Practices

  1. Always check schema first - Run replicate model schema <model> to understand required and optional inputs before running predictions.

  2. Use streaming for text models - Add --stream flag when running language models to see output in real-time.

  3. Save outputs explicitly - Use --save and --output-directory to organize prediction outputs.

  4. Use JSON output for automation - Add --json flag when parsing outputs programmatically.

  5. Open in web for debugging - Add --web flag to view predictions in the Replicate dashboard for detailed logs.

  6. Chain predictions efficiently - Use the {{.output}} syntax to pass outputs between models without intermediate saves.

Troubleshooting

Authentication errors:

  • Verify REPLICATE_API_TOKEN is set correctly
  • Run replicate account current to test authentication

Model not found:

  • Check model name format: owner/model-name
  • Verify model exists at replicate.com

Input validation errors:

  • Run replicate model schema <model> to see required inputs
  • Check input types (string, number, file)

File upload issues:

  • Ensure @ prefix is used for local files
  • Verify file path is correct and file exists
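Two of the checks above (model name format and file existence) are easy to automate before a prediction is submitted. The following is a rough sketch under the assumption that a model reference is exactly owner/model-name; preflight is a hypothetical name, not a CLI command.

```shell
# Hypothetical helper: catch common mistakes before calling the CLI.
# $1 = model reference, $2 = optional local file that will be passed with @.
preflight() {
  case "$1" in
    */*/* | /* | */ | *" "*)
      echo "preflight: expected owner/model-name, got '$1'" >&2
      return 1
      ;;
    */*) ;;  # exactly one slash with text on both sides
    *)
      echo "preflight: expected owner/model-name, got '$1'" >&2
      return 1
      ;;
  esac
  if [ -n "${2:-}" ] && [ ! -f "$2" ]; then
    echo "preflight: file not found: $2" >&2
    return 1
  fi
}
```

It can guard a run, for example: preflight nightmareai/real-esrgan photo.jpg && replicate run nightmareai/real-esrgan image=@photo.jpg. It does not verify that the model exists or that authentication works; replicate model show and replicate account current cover those.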
