mmx-cli

Installation
SKILL.md

MiniMax CLI — Agent Skill Guide

Use mmx to generate text, images, video, speech, music, and perform web search via the MiniMax AI platform.

Prerequisites

# Auth (persisted to ~/.mmx/credentials.json)
mmx auth login --api-key sk-xxxxx

# Or pass per-call
mmx text chat --api-key sk-xxxxx --message "Hello"

Region is auto-detected. Override with --region global or --region cn.


Agent Flags

Always use these flags in non-interactive (agent/CI) contexts:

Flag Purpose
--non-interactive Fail fast on missing args instead of prompting
--quiet Suppress spinners/progress; stdout is pure data
--output json Machine-readable JSON output
--async Return task ID immediately (video generation)
--dry-run Preview the API request without executing
--yes Skip confirmation prompts

Commands

text chat

Chat completion. Default model: MiniMax-M2.7.

mmx text chat --message <text> [flags]
Flag Type Description
--message <text> string, required, repeatable Message text. Prefix with role: to set role (e.g. "system:You are helpful", "user:Hello")
--messages-file <path> string JSON file with messages array. Use - for stdin
--system <text> string System prompt
--model <model> string Model ID (default: MiniMax-M2.7)
--max-tokens <n> number Max tokens (default: 4096)
--temperature <n> number Sampling temperature (0.0, 1.0]
--top-p <n> number Nucleus sampling threshold
--stream boolean Stream tokens (default: on in TTY)
--tool <json-or-path> string, repeatable Tool definition JSON or file path
# Single message
mmx text chat --message "user:What is MiniMax?" --output json --quiet

# Multi-turn
mmx text chat \
  --system "You are a coding assistant." \
  --message "user:Write fizzbuzz in Python" \
  --output json

# From file
cat conversation.json | mmx text chat --messages-file - --output json

stdout: response text (text mode) or full response object (json mode).


image generate

Generate images. Model: image-01.

mmx image generate --prompt <text> [flags]
Flag Type Description
--prompt <text> string, required Image description
--aspect-ratio <ratio> string e.g. 16:9, 1:1
--n <count> number Number of images (default: 1)
--subject-ref <params> string Subject reference: type=character,image=path-or-url
--out-dir <dir> string Download images to directory
--out-prefix <prefix> string Filename prefix (default: image)
mmx image generate --prompt "A cat in a spacesuit" --output json --quiet
# stdout: image URLs (one per line in quiet mode)

mmx image generate --prompt "Logo" --n 3 --out-dir ./gen/ --quiet
# stdout: saved file paths (one per line)

video generate

Generate video. Default model: MiniMax-Hailuo-2.3. This is an async task — by default it polls until completion.

mmx video generate --prompt <text> [flags]
Flag Type Description
--prompt <text> string, required Video description
--model <model> string MiniMax-Hailuo-2.3 (default) or MiniMax-Hailuo-2.3-Fast
--first-frame <path-or-url> string First frame image
--callback-url <url> string Webhook URL for completion
--download <path> string Save video to specific file
--async boolean Return task ID immediately
--no-wait boolean Same as --async
--poll-interval <seconds> number Polling interval (default: 5)
# Non-blocking: get task ID
mmx video generate --prompt "A robot." --async --quiet
# stdout: {"taskId":"..."}

# Blocking: wait and get file path
mmx video generate --prompt "Ocean waves." --download ocean.mp4 --quiet
# stdout: ocean.mp4

video task get

Query status of a video generation task.

mmx video task get --task-id <id> [--output json]

video download

Download a completed video by task ID.

mmx video download --file-id <id> [--out <path>]

speech synthesize

Text-to-speech. Default model: speech-2.8-hd. Max 10k chars.

mmx speech synthesize --text <text> [flags]
Flag Type Description
--text <text> string Text to synthesize
--text-file <path> string Read text from file. Use - for stdin
--model <model> string speech-2.8-hd (default), speech-2.6, speech-02
--voice <id> string Voice ID (default: English_expressive_narrator)
--speed <n> number Speed multiplier
--volume <n> number Volume level
--pitch <n> number Pitch adjustment
--format <fmt> string Audio format (default: mp3)
--sample-rate <hz> number Sample rate (default: 32000)
--bitrate <bps> number Bitrate (default: 128000)
--channels <n> number Audio channels (default: 1)
--language <code> string Language boost
--subtitles boolean Include subtitle timing data
--pronunciation <from/to> string, repeatable Custom pronunciation
--sound-effect <effect> string Add sound effect
--out <path> string Save audio to file
--stream boolean Stream raw audio to stdout
mmx speech synthesize --text "Hello world" --out hello.mp3 --quiet
# stdout: hello.mp3

echo "Breaking news." | mmx speech synthesize --text-file - --out news.mp3

music generate

Generate music. Model: music-2.5. Responds well to rich, structured descriptions.

mmx music generate --prompt <text> [--lyrics <text>] [flags]
Flag Type Description
--prompt <text> string Music style description (can be detailed)
--lyrics <text> string Song lyrics with structure tags. Use "\u65e0\u6b4c\u8bcd" for instrumental. Cannot be used with --instrumental
--lyrics-file <path> string Read lyrics from file. Use - for stdin
--vocals <text> string Vocal style, e.g. "warm male baritone", "bright female soprano", "duet with harmonies"
--genre <text> string Music genre, e.g. folk, pop, jazz
--mood <text> string Mood or emotion, e.g. warm, melancholic, uplifting
--instruments <text> string Instruments to feature, e.g. "acoustic guitar, piano"
--tempo <text> string Tempo description, e.g. fast, slow, moderate
--bpm <number> number Exact tempo in beats per minute
--key <text> string Musical key, e.g. C major, A minor, G sharp
--avoid <text> string Elements to avoid in the generated music
--use-case <text> string Use case context, e.g. "background music for video", "theme song"
--structure <text> string Song structure, e.g. "verse-chorus-verse-bridge-chorus"
--references <text> string Reference tracks or artists, e.g. "similar to Ed Sheeran"
--extra <text> string Additional fine-grained requirements
--instrumental boolean Generate instrumental music (no vocals). Cannot be used with --lyrics or --lyrics-file
--aigc-watermark boolean Embed AI-generated content watermark
--format <fmt> string Audio format (default: mp3)
--sample-rate <hz> number Sample rate (default: 44100)
--bitrate <bps> number Bitrate (default: 256000)
--out <path> string Save audio to file
--stream boolean Stream raw audio to stdout

At least one of --prompt or --lyrics is required.

# Simple usage
mmx music generate --prompt "Upbeat pop" --lyrics "La la la..." --out song.mp3 --quiet

# Detailed prompt with vocal characteristics
mmx music generate --prompt "Warm morning folk" \
  --vocals "male and female duet, harmonies in chorus" \
  --instruments "acoustic guitar, piano" \
  --bpm 95 \
  --lyrics-file song.txt \
  --out duet.mp3

# Instrumental (use --instrumental flag)
mmx music generate --prompt "Cinematic orchestral, building tension" --instrumental --out bgm.mp3

vision describe

Image understanding via VLM. Provide either --image or --file-id, not both.

mmx vision describe (--image <path-or-url> | --file-id <id>) [flags]
Flag Type Description
--image <path-or-url> string Local path or URL (auto base64-encoded)
--file-id <id> string Pre-uploaded file ID (skips base64)
--prompt <text> string Question about the image (default: "Describe the image.")
mmx vision describe --image photo.jpg --prompt "What breed?" --output json

stdout: description text (text mode) or full response (json mode).


search query

Web search via MiniMax.

mmx search query --q <query>
Flag Type Description
--q <query> string, required Search query
mmx search query --q "MiniMax AI" --output json --quiet

quota show

Display Token Plan usage and remaining quotas.

mmx quota show [--output json]

Tool Schema Export

Export all commands as Anthropic/OpenAI-compatible JSON tool schemas:

# All tool-worthy commands (excludes auth/config/update)
mmx config export-schema

# Single command
mmx config export-schema --command "video generate"

Use this to dynamically register mmx commands as tools in your agent framework.


Exit Codes

Code Meaning
0 Success
1 General error
2 Usage error (bad flags, missing args)
3 Authentication error
4 Quota exceeded
5 Timeout
10 Content filter triggered

Piping Patterns

# stdout is always clean data — safe to pipe
mmx text chat --message "Hi" --output json | jq '.content'

# stderr has progress/spinners — discard if needed
mmx video generate --prompt "Waves" 2>/dev/null

# Chain: generate image → describe it
URL=$(mmx image generate --prompt "A sunset" --quiet)
mmx vision describe --image "$URL" --quiet

# Async video workflow
TASK=$(mmx video generate --prompt "A robot" --async --quiet | jq -r '.taskId')
mmx video task get --task-id "$TASK" --output json
mmx video download --task-id "$TASK" --out robot.mp4

Configuration Precedence

CLI flags → environment variables → ~/.mmx/config.json → defaults.

# Persistent config
mmx config set --key region --value cn
mmx config show

# Environment
export MINIMAX_API_KEY=sk-xxxxx
export MINIMAX_REGION=cn
Weekly Installs
822
Repository
minimax-ai/cli
First Seen
1 day ago
Installed on
opencode822
github-copilot822
codex821
gemini-cli820
kimi-cli820
cursor820