Agent Right Brain

Use rawgenai <provider> <action> to give agents creative abilities. Always read the chosen provider's reference file before running commands.

Prerequisites

brew install WHQ25/tap/rawgenai

Before using a provider, read its setup guide at references/setup/ to configure credentials.

Input Sources (All Capabilities)

  1. Positional argument: rawgenai <provider> <action> "text" [flags]
  2. File: rawgenai <provider> <action> --file input.txt [flags]
  3. Stdin: echo "text" | rawgenai <provider> <action> [flags]
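The three input sources above can be wrapped in one small dispatcher. This is only a sketch: `rg_run` is a hypothetical helper name (not part of the rawgenai CLI), and it assumes a single-word action such as `tts` or `image`.

```shell
# Sketch: pick one of the three documented input sources for a rawgenai call.
# rg_run is a hypothetical helper, not a rawgenai command.
# Usage: rg_run <provider> <action> <input> [flags...]
#   <input> = "-"            -> forward this script's stdin
#   <input> = existing file  -> pass it with --file
#   anything else            -> pass it as the positional text argument
rg_run() {
  if [ "$#" -lt 3 ]; then
    echo "usage: rg_run <provider> <action> <input> [flags...]" >&2
    return 2
  fi
  rg_provider=$1
  rg_action=$2
  rg_input=$3
  shift 3
  if [ "$rg_input" = "-" ]; then
    rawgenai "$rg_provider" "$rg_action" "$@"
  elif [ -f "$rg_input" ]; then
    rawgenai "$rg_provider" "$rg_action" --file "$rg_input" "$@"
  else
    rawgenai "$rg_provider" "$rg_action" "$rg_input" "$@"
  fi
}
```

For example, `rg_run openai tts "Hello there" --speak` uses the positional form, while `echo "Hello there" | rg_run openai tts - --speak` uses stdin.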

General Guidelines

  • On first use of a capability, ask the user to pick a provider, and remember that choice for the session.
  • All output is JSON. Always show file paths to the user.
  • For async commands (video, some image/audio): create -> status -> download.
  • If a command fails, try a different provider or inform the user.
  • Write image/video prompts descriptively: subject + action + environment + style + lighting.
  • For TTS: write natural conversational text, not markdown. Use --speak for playback, -o for file.
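The "try a different provider" guideline could be scripted roughly as follows. This is a sketch under assumptions: `try_providers` is a hypothetical helper, and it assumes rawgenai exits with a non-zero status when a command fails, which the text above does not state. The fallback order should still come from the user's preferences.

```shell
# Sketch: run the same action against providers in order until one succeeds.
# try_providers is a hypothetical helper, not a rawgenai command.
# ASSUMPTION: rawgenai returns a non-zero exit status on failure.
# Usage: try_providers "<space-separated providers>" <action> [args...]
try_providers() {
  tp_list=$1
  shift
  for tp_provider in $tp_list; do
    if tp_out=$(rawgenai "$tp_provider" "$@"); then
      # JSON output from the first provider that succeeded
      printf '%s\n' "$tp_out"
      return 0
    fi
    echo "rawgenai $tp_provider failed, trying next provider" >&2
  done
  echo "all providers failed: $tp_list" >&2
  return 1
}
```

A call such as `try_providers "openai google" tts "Hello there" -o hello.mp3` would fall back to the second provider only if the first one fails; if every provider fails, the function returns 1 so the agent can inform the user.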

Speak (TTS)

rawgenai <provider> tts "<text>" --speak

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI | rawgenai openai tts | General purpose, English | ref |
| Google Gemini | rawgenai google tts | Expressive storytelling, multi-speaker | ref |
| ElevenLabs | rawgenai elevenlabs tts | Most natural voices, 70+ languages | ref |
| Seed | rawgenai seed tts | Chinese, emotion-rich | ref |
| DashScope | rawgenai dashscope tts | Chinese, 10 languages, 49 voices | ref |
| MiniMax | rawgenai minimax tts | Chinese, streaming | ref |
| Kling | rawgenai kling tts | Bilingual zh/en | ref |
| Runway | rawgenai runway audio tts | Async | |
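Since TTS input should be natural text rather than markdown, a rough pre-filter can help when the source text comes from a markdown file. The sketch below is a hypothetical `strip_markdown` helper built on `sed`; it only handles headings, list bullets, bold, inline code, and links, and is not a full Markdown parser.

```shell
# Sketch: strip common markdown syntax before sending text to TTS.
# strip_markdown is a hypothetical helper, not a rawgenai command.
strip_markdown() {
  # 1) leading "#" heading markers   2) leading list bullets (-, *, +)
  # 3) **bold** markers              4) `inline code` backticks
  # 5) [label](url) links, keeping only the label
  sed -e 's/^#\{1,\}[[:space:]]*//' \
      -e 's/^[[:space:]]*[-*+][[:space:]]\{1,\}//' \
      -e 's/\*\*\([^*]*\)\*\*/\1/g' \
      -e 's/`\([^`]*\)`/\1/g' \
      -e 's/\[\([^]]*\)]([^)]*)/\1/g'
}

# Example (notes.md is a hypothetical file), using the stdin input source:
#   strip_markdown < notes.md | rawgenai openai tts --speak
```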

Listen (STT)

rawgenai <provider> stt <audio-file>

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI | rawgenai openai stt | Subtitles (srt/vtt) | ref |
| Google Gemini | rawgenai google stt | Speaker diarization | ref |
| ElevenLabs | rawgenai elevenlabs stt | Large files (3GB), video input | ref |
| DashScope | rawgenai dashscope stt | Chinese, emotion, long audio (12h async) | ref |

Image

rawgenai <provider> image "<prompt>" -o output.png

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI | rawgenai openai image | Transparent bg, editing, multi-turn | ref |
| Google Gemini | rawgenai google image | 4K, text in image | ref |
| Grok | rawgenai grok image | Batch (up to 10) | ref |
| Seed | rawgenai seed image | 4K, multi-image fusion | ref |
| DashScope | rawgenai dashscope image | Text rendering, Chinese | ref |
| MiniMax | rawgenai minimax image | Subject reference | ref |
| Kling | rawgenai kling image | Face reference (async) | ref |
| Luma | rawgenai luma image | Creative, reframe (async) | |
| Hunyuan | rawgenai hunyuan image | Chinese (async) | |
| Runway | rawgenai runway image | Cinematic (async) | |
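The prompt guideline (subject + action + environment + style + lighting) can be made mechanical with a small builder. This is a sketch: `build_prompt` is a hypothetical helper, and joining the parts with commas is just one reasonable convention.

```shell
# Sketch: assemble a descriptive image/video prompt from its components.
# build_prompt is a hypothetical helper, not a rawgenai command.
# Usage: build_prompt <subject> <action> <environment> <style> <lighting>
build_prompt() {
  bp_out=""
  for bp_part in "$@"; do
    # Skip any component that was left empty
    [ -n "$bp_part" ] || continue
    if [ -z "$bp_out" ]; then
      bp_out=$bp_part
    else
      bp_out="$bp_out, $bp_part"
    fi
  done
  printf '%s\n' "$bp_out"
}

# Example:
#   rawgenai openai image "$(build_prompt "a red fox" "leaping over a log" \
#     "snowy pine forest" "watercolor illustration" "soft dawn light")" -o output.png
```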

Video

rawgenai <provider> video create "<prompt>" [flags] -> status <id> -> download <id> -o out.mp4

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI (Sora) | rawgenai openai video | Remix | ref |
| Google (Veo) | rawgenai google video | 4K, extension | ref |
| Grok | rawgenai grok video | Quick, editing | ref |
| Seed | rawgenai seed video | Audio, wide ratios | ref |
| DashScope | rawgenai dashscope video | Character ref, multi-shot | ref |
| MiniMax (Hailuo) | rawgenai minimax video | Subject ref, director modes | ref |
| Kling | rawgenai kling video | Most advanced, element system | ref |
| Luma | rawgenai luma video | Extension, upscale | |
| Hunyuan | rawgenai hunyuan video | Chinese | |
| Runway | rawgenai runway video | Cinematic, character ref | |
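The create -> status -> download flow for video could be scripted roughly as below. This is a sketch built on several assumptions the text does not confirm: that `status` and `download` are subcommands of `video`, that the JSON output carries string fields named `id` and `status`, and that terminal status values look like `completed`, `succeeded`, `failed`, or `error`. Check the provider's reference file for the real field names and values. `json_field` and `video_generate` are hypothetical helpers.

```shell
# Naive extraction of a string field from JSON on stdin.
# A real script should use a proper JSON parser such as jq.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Sketch: create a video job, poll until it finishes, then download it.
# Usage: video_generate <provider> <prompt> <output-file>
# Tunables: MAX_POLLS (default 60), POLL_INTERVAL in seconds (default 10).
video_generate() {
  vg_provider=$1
  vg_prompt=$2
  vg_out=$3
  # ASSUMPTION: create prints JSON containing a string "id" field.
  vg_id=$(rawgenai "$vg_provider" video create "$vg_prompt" | json_field id)
  if [ -z "$vg_id" ]; then
    echo "create returned no job id" >&2
    return 1
  fi
  vg_tries=0
  while [ "$vg_tries" -lt "${MAX_POLLS:-60}" ]; do
    # ASSUMPTION: status prints JSON containing a string "status" field.
    vg_status=$(rawgenai "$vg_provider" video status "$vg_id" | json_field status)
    case $vg_status in
      completed|succeeded)
        rawgenai "$vg_provider" video download "$vg_id" -o "$vg_out"
        return $?
        ;;
      failed|error)
        echo "job $vg_id ended with status: $vg_status" >&2
        return 1
        ;;
    esac
    vg_tries=$((vg_tries + 1))
    sleep "${POLL_INTERVAL:-10}"
  done
  echo "job $vg_id still pending after $vg_tries polls" >&2
  return 1
}
```

Bounding the loop with `MAX_POLLS` keeps an agent from waiting indefinitely on a stuck job; on timeout or failure the function returns non-zero so the caller can switch providers or report back to the user.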

Music

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| ElevenLabs | rawgenai elevenlabs music | Prompt-based, composition plans | ref |
| MiniMax | rawgenai minimax music create | Lyrics-to-music, Chinese | ref |

Sound Effects (SFX)

| Provider | Command | Reference |
| --- | --- | --- |
| ElevenLabs | rawgenai elevenlabs sfx "<prompt>" -o out.mp3 | ref |
| Runway | rawgenai runway audio sfx "<prompt>" | |

Dialogue

Generate multi-speaker dialogue from a JSON script (max 10 voices).

| Provider | Command | Reference |
| --- | --- | --- |
| ElevenLabs | rawgenai elevenlabs dialogue -i script.json -o out.mp3 | ref |

Voice Management

Design, clone, and manage custom voices.

| Provider | Command | Capabilities | Reference |
| --- | --- | --- | --- |
| ElevenLabs | rawgenai elevenlabs voice | list, design, create, preview | ref |
| Kling | rawgenai kling voice | create, status, list, delete | ref |
| MiniMax | rawgenai minimax voice | list, upload, clone, design, delete | ref |
| Seed | rawgenai seed voice-clone | upload, status, order, renew | ref |

Audio Processing

Async: rawgenai runway audio <action> -> status <id> -> download <id> -o out

| Provider | Command | Capability |
| --- | --- | --- |
| Runway | rawgenai runway audio sts | Speech-to-speech (voice conversion) |
| Runway | rawgenai runway audio dubbing | Dub audio to another language |
| Runway | rawgenai runway audio isolation | Isolate voice from background |

Repository: whq25/rawgenai