explainer

Installation
SKILL.md

When to Use

  • User wants to create an explainer or tutorial video
  • User asks to "explain" something in video form
  • User wants narrated content with AI-generated visuals
  • User says "explainer video", "解说视频", "tutorial video"

When NOT to Use

  • User wants audio-only content without visuals (use /speech or /podcast)
  • User wants a podcast-style discussion (use /podcast)
  • User wants to generate a standalone image (use /image-gen)
  • User wants to read text aloud without video (use /speech)

Purpose

Generate explainer videos that combine a single narrator's voiceover with AI-generated visuals. Ideal for product introductions, concept explanations, and tutorials. Supports text-only script generation or full text + video output.

Hard Constraints

  • Always read config following shared/config-pattern.md before any interaction
  • Follow shared/cli-patterns.md for execution modes, error handling, and interaction patterns
  • Always follow shared/cli-authentication.md for auth checks
  • Never hardcode speaker IDs — always fetch from the speakers CLI when the user wants to change voice
  • Never save files to ~/Downloads/ or .listenhub/ — save artifacts to the current working directory with friendly topic-based names (see shared/config-pattern.md § Artifact Naming)
  • Explainer uses exactly 1 speaker
  • Mode must be info (for Info style) or story (for Story style) — never slides (use /slides skill instead)

Step -1: CLI Auth Check

Follow shared/config-pattern.md § CLI Auth Check. If the CLI is not installed or the user is not logged in, auto-install and auto-login per shared/cli-authentication.md — never ask the user to run commands manually.

Step 0: Config Setup

Follow shared/config-pattern.md Step 0 (Zero-Question Boot).

If file doesn't exist — silently create with defaults and proceed:

mkdir -p ".listenhub/explainer"
echo '{"outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{}}' > ".listenhub/explainer/config.json"
CONFIG_PATH=".listenhub/explainer/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Do NOT ask any setup questions. Proceed directly to the Interaction Flow.

If file exists — read config silently and proceed:

CONFIG_PATH=".listenhub/explainer/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/explainer/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Setup Flow (user-initiated reconfigure only)

Only run when the user explicitly asks to reconfigure. Display current settings:

当前配置 (explainer):
  输出方式:{inline / download / both}
  语言偏好:{zh / en / 未设置}
  默认风格:{info / story / 未设置}
  默认主播:{speakerName / 使用内置默认}

Then ask:

  1. outputMode: Follow shared/output-mode.md § Setup Flow Question.

  2. Language (optional): "默认语言?"

    • "中文 (zh)"
    • "English (en)"
    • "每次手动选择" → keep null
  3. Style (optional): "默认风格?"

    • "Info — 信息展示型"
    • "Story — 故事叙述型"
    • "每次手动选择" → keep null

After collecting answers, save immediately:

NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")

Interaction Flow

Step 1: Topic / Content

Free text input. Ask the user:

What would you like to explain or introduce?

Accept: topic description, text content, or concept to explain.

Step 2: Language

If config.language is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What language?"
Options:
  - "Chinese (zh)" — Content in Mandarin Chinese
  - "English (en)" — Content in English
  - "Japanese (ja)" — Content in Japanese

Step 3: Style

If config.defaultStyle is set, pre-fill and show in summary — skip this question. Otherwise ask:

Question: "What style of explainer?"
Options:
  - "Info" — Informational, factual presentation style
  - "Story" — Narrative, storytelling approach

Step 4: Speaker Selection

Follow shared/speaker-selection.md:

  • If config.defaultSpeakers.{language} is set → use saved speaker silently
  • If not set → use built-in default from shared/speaker-selection.md for the language
  • Show the speaker in the confirmation summary (Step 6) — user can change from there if desired
  • Only show the full speaker list if the user explicitly asks to change voice

Speaker query: see shared/cli-speakers.md for listing and filtering speakers.

Only 1 speaker is supported for explainer videos.

Step 5: Output Type

Question: "What output do you want?"
Options:
  - "Text script only" — Generate narration script, no video
  - "Text + Video" — Generate full explainer video with AI visuals

Step 6: Confirm & Generate

Summarize all choices:

Ready to generate explainer:

  Topic: {topic}
  Language: {language}
  Style: {info/story}
  Speaker: {speaker name}
  Output: {text only / text + video}

  Proceed?

Wait for explicit confirmation before running any CLI command.

Workflow

Run the CLI command with run_in_background: true and timeout: 660000. The CLI blocks until generation completes and returns the final result as JSON:

listenhub explainer create \
  --query "{topic}" \
  --mode {info|story} \
  --lang {en|zh|ja} \
  --speaker "{name}" \
  --speaker-id "{id}" \
  --timeout 600 \
  --json

If the command fails (non-zero exit), check stderr for error details. See shared/cli-patterns.md § Error Handling for exit codes and common errors.

Optional flags (add when applicable):

  • --source-url "{url}" — if the user provided a reference URL
  • --skip-audio — if text-only output (no video)
  • --image-size {2K|4K} — image resolution (default: 2K)
  • --aspect-ratio {16:9|9:16|1:1} — video aspect ratio (default: 16:9)
  • --style "{style}" — visual style for AI-generated images

Tell the user the task is submitted. When notified of completion, parse and present result:

Parse the CLI JSON output for key fields:

EPISODE_ID=$(echo "$RESULT" | jq -r '.episodeId')
AUDIO_URL=$(echo "$RESULT" | jq -r '.audioUrl // empty')
VIDEO_URL=$(echo "$RESULT" | jq -r '.videoUrl // empty')
CREDITS=$(echo "$RESULT" | jq -r '.credits // empty')

Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

If text-only output:

inline or both: Present the script inline.

Present:

解说脚本已生成!

「{title}」

在线查看:https://listenhub.ai/app/explainer/{episodeId}

download or both: Also save the script file. Generate a topic slug following shared/config-pattern.md § Artifact Naming.

  • Save as {slug}-explainer.md in cwd (dedup if exists)
  • Present the save path in addition to the above summary.

If text + video output:

inline or both: Display video URL and audio URL as clickable links.

Present:

解说视频已生成!

视频链接:{videoUrl}
音频链接:{audioUrl}
消耗积分:{credits}

download or both: Also save files. Generate a topic slug following shared/config-pattern.md § Artifact Naming.

  • Create {slug}-explainer/ folder (dedup if exists)
  • Write script.md inside
  • Download audio:
    listenhub download "{audioUrl}" -o "{slug}-explainer/audio.mp3"
    
  • Present:
    已保存到当前目录:
      {slug}-explainer/
        script.md
        audio.mp3
    

After Successful Generation

Update config with the choices made this session:

NEW_CONFIG=$(echo "$CONFIG" | jq \
  --arg lang "{language}" \
  --arg style "{info/story}" \
  --arg speakerId "{speakerId}" \
  '. + {"language": $lang, "defaultStyle": $style, "defaultSpeakers": (.defaultSpeakers + {($lang): [$speakerId]})}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"

Estimated times:

  • Text script only: 2-3 minutes
  • Text + Video: 5-10 minutes

Resources

  • CLI authentication: shared/cli-authentication.md
  • CLI patterns: shared/cli-patterns.md
  • Speaker query: shared/cli-speakers.md
  • Speaker selection guide: shared/speaker-selection.md
  • Config pattern: shared/config-pattern.md
  • Output mode: shared/output-mode.md

Composability

  • Invokes: speakers CLI (for speaker selection); may invoke /speech for voiceover
  • Invoked by: content-planner (Phase 3)

Example

User: "Create an explainer video introducing Claude Code"

Agent workflow:

  1. Topic: "Claude Code introduction"
  2. Ask language → "English"
  3. Ask style → "Info"
  4. Use default speaker "Mars" (cozy-man-english)
  5. Ask output → "Text + Video"
# Run with run_in_background: true, timeout: 660000
listenhub explainer create \
  --query "Introduce Claude Code: what it is, key features, and how to get started" \
  --mode info \
  --lang en \
  --speaker "Mars" \
  --speaker-id "cozy-man-english" \
  --timeout 600 \
  --json

Parse result for episodeId, audioUrl, videoUrl, credits, and present to user.

Weekly Installs
656
GitHub Stars
49
First Seen
1 day ago