speak-response
Speak Response
Vocalize text using local Qwen3-TTS. Default voice is the Oracle (cloned from a Dune narrator with deep, resonant, prophetic quality).
Quick Examples
| Command | Effect |
|---|---|
| /speak | Last 2 sentences with Oracle voice |
| /speak 5 | Last 5 sentences with Oracle voice |
| /speak "The sleeper must awaken." | Specific text with Oracle voice |
| /speak --preset mood:warm | Last 2 sentences with preset speaker + emotion |
| /speak --preset "Hello" speaker:Vivian voice:"nurturing" | Preset speaker with custom voice |
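The /speak and /speak 5 forms depend on pulling the last N sentences out of the previous response. How the skill does this is not shown here. The following is a naive sketch only, assuming a sentence ends in ".", "!", or "?" followed by whitespace; the helper name last_sentences is hypothetical and not part of the skill's scripts.

```shell
# Hypothetical helper: print the last N sentences of a text (default N=2).
# Assumption: a sentence ends in ., !, or ? followed by whitespace, so
# abbreviations such as "e.g. " will be split incorrectly.
last_sentences() {
  local n="${2:-2}"
  SPEAK_TEXT="$1" awk -v n="$n" 'BEGIN {
    text = ENVIRON["SPEAK_TEXT"]
    gsub(/[ \t\r\n]+/, " ", text)      # collapse newlines and runs of spaces
    sub(/^ +/, "", text)               # trim leading spaces
    sub(/ +$/, "", text)               # trim trailing spaces
    gsub(/[.!?]+ /, "&\n", text)       # mark a break after each sentence end
    count = split(text, parts, "\n")
    start = count - n + 1
    if (start < 1) start = 1
    out = ""
    for (i = start; i <= count; i++) out = out parts[i]
    sub(/ +$/, "", out)
    print out
  }'
}
```

For example, last_sentences "$response" 5 would keep the final five sentences of $response, and omitting the count falls back to two, matching the /speak default.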
Default: Oracle Voice
The Oracle voice is a deep, resonant, prophetic voice cloned from a Dune narrator. It speaks all text with a sense of ancient wisdom and gravitas.
# Default usage - Oracle voice
scripts/speak.sh "The spice must flow."
scripts/speak.sh "He who controls the spice controls the universe."
Limitation
The Oracle uses voice cloning (Base model), which does not support per-message instruction control. The voice characteristics are fixed. For emotion/mood control, use --preset.
Preset Speakers (--preset)
For emotion and mood control, use --preset to switch to CustomVoice with adjustable instructions:
scripts/speak.sh --preset "<text>" [speaker] [instruction]
Quick Preset Examples
# Calm therapeutic voice
scripts/speak.sh --preset "Take a deep breath." Vivian "calm, nurturing, gentle pace"
# Excited announcement
scripts/speak.sh --preset "We did it!" Ryan "joyful, excited, enthusiastic"
# Serious explanation
scripts/speak.sh --preset "This is important." Eric "serious, measured, emphatic"
Custom Voice Instructions
The model understands rich natural language descriptions:
| Aspect | Examples |
|---|---|
| Emotion | joyful, melancholic, anxious, calm, excited, contemplative |
| Pace | slow and deliberate, rapid and energetic, measured, hesitant |
| Intensity | soft and gentle, loud and commanding, whispered, emphatic |
| Style | warm and nurturing, professional, playful, dramatic |
| Prosody | with dramatic pauses, rising intonation, emphatic on key words |
Mood Presets (Shortcuts)
| Preset | Expands To |
|---|---|
| calm | "calm, soothing, gentle pace" |
| warm | "warm, empathetic, nurturing tone" |
| excited | "joyful, excited, enthusiastic" |
| serious | "serious, measured, authoritative" |
| gentle | "soft, gentle, whispered" |
| encouraging | "encouraging, uplifting, sincere" |
| contemplative | "thoughtful, slow pace, reflective" |
Speakers
| Speaker | Best For |
|---|---|
| Ryan (default) | Professional, serious, authoritative |
| Vivian | Warm, nurturing, therapeutic |
| Serena | Calm, gentle, contemplative |
| Dylan | Friendly, casual, playful |
| Eric | Serious, dramatic, commanding |
| Aiden | Encouraging, uplifting, energetic |
| Uncle_Fu | Wise, measured |
| Ono_Anna | Soft, gentle |
| Sohee | Clear, professional |
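The mood shortcuts and the speaker names listed above can be treated as two small lookup tables. The sketch below shows one way to express them in shell. The function names expand_mood and is_known_speaker are hypothetical, and passing unknown moods through unchanged is an assumption, not documented behavior of scripts/speak.sh.

```shell
# Hypothetical helpers; names are illustrative, not part of scripts/speak.sh.

# Expand a mood shortcut to its instruction string (values from the
# Mood Presets table).
expand_mood() {
  case "$1" in
    calm)          printf '%s\n' "calm, soothing, gentle pace" ;;
    warm)          printf '%s\n' "warm, empathetic, nurturing tone" ;;
    excited)       printf '%s\n' "joyful, excited, enthusiastic" ;;
    serious)       printf '%s\n' "serious, measured, authoritative" ;;
    gentle)        printf '%s\n' "soft, gentle, whispered" ;;
    encouraging)   printf '%s\n' "encouraging, uplifting, sincere" ;;
    contemplative) printf '%s\n' "thoughtful, slow pace, reflective" ;;
    # Assumption: anything else is already a free-form instruction.
    *)             printf '%s\n' "$1" ;;
  esac
}

# Succeed only for speakers that appear in the Speakers table.
is_known_speaker() {
  case "$1" in
    Ryan|Vivian|Serena|Dylan|Eric|Aiden|Uncle_Fu|Ono_Anna|Sohee) return 0 ;;
    *) return 1 ;;
  esac
}
```

With these in place, a call such as scripts/speak.sh --preset "Take a deep breath." Vivian "$(expand_mood calm)" passes the expanded instruction in the documented argument order.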
Workflow
- Parse arguments for text and mode (default oracle vs --preset)
- Extract text from last response if not provided
- Default mode: Clone with Oracle voice
- Preset mode: Generate with CustomVoice + instruction
- Audio plays through macOS speakers
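The argument-handling part of the steps above can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the contents of scripts/speak.sh: the function name plan_speak and its pipe-delimited output are hypothetical, and it only reports which mode would run instead of generating audio.

```shell
# Hypothetical planner: mirrors the workflow's mode selection without
# generating or playing any audio.
plan_speak() {
  local mode="oracle"
  if [ "${1:-}" = "--preset" ]; then
    mode="preset"
    shift
  fi

  local text="${1:-}"
  if [ -z "$text" ]; then
    # The real skill would fall back to the last response at this point.
    echo "error: no text given" >&2
    return 1
  fi

  if [ "$mode" = "preset" ]; then
    # Ryan is the default preset speaker per the Speakers table.
    printf 'preset|%s|%s|%s\n' "$text" "${2:-Ryan}" "${3:-}"
  else
    printf 'oracle|%s\n' "$text"
  fi
}
```

Running plan_speak --preset "We did it!" Ryan "joyful, excited, enthusiastic" reports preset mode with the speaker and instruction, while plan_speak "The spice must flow." reports the default Oracle mode.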
Execution
# Oracle voice (default)
scripts/speak.sh "<text>"
# Preset speaker with instruction
scripts/speak.sh --preset "<text>" [speaker] [instruction]
Voice Cloning (Custom Voices)
Clone any voice from a 3+ second audio sample:
# Get transcript first (use Whisper API)
curl -s https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file="@reference.mp3" -F model="whisper-1"
# Clone the voice
scripts/clone.sh "<text to speak>" "<audio_file>" "<transcript>"
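The two steps above can be chained so the transcript flows straight into the clone call. The sketch below is one possible wrapper, not part of the skill: transcribe and clone_from_sample are hypothetical names, the CLONE_SCRIPT override exists only to make the wrapper easy to dry-run, and response_format="text" is assumed here so the API returns the bare transcript instead of JSON.

```shell
# Hypothetical wrapper: transcribe a reference sample, then clone with it.

# Assumption: response_format="text" makes the endpoint return plain text.
transcribe() {
  curl -s https://api.openai.com/v1/audio/transcriptions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -F file="@$1" -F model="whisper-1" -F response_format="text"
}

clone_from_sample() {
  local text="$1" audio="$2" transcript
  if [ ! -f "$audio" ]; then
    echo "error: no such audio file: $audio" >&2
    return 1
  fi
  transcript="$(transcribe "$audio")" || return 1
  if [ -z "$transcript" ]; then
    echo "error: empty transcript for $audio" >&2
    return 1
  fi
  # CLONE_SCRIPT is a hypothetical override; defaults to the documented script.
  "${CLONE_SCRIPT:-scripts/clone.sh}" "$text" "$audio" "$transcript"
}
```

A call such as clone_from_sample "The sleeper must awaken." reference.mp3 would then speak the new text in the reference voice, stopping early if the sample is missing or the transcript comes back empty.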
Voice Design (Create New Voices)
Design entirely new voices from natural language descriptions:
scripts/design-voice.sh "<sample_text>" "<voice_description>"
# Example: Create a warm guide voice
scripts/design-voice.sh \
  "Take a deep breath and feel this moment." \
  "warm, nurturing, gentle pace, empathetic, female"
Then clone the designed voice for reuse:
scripts/clone.sh "New text" designed-voice.wav "Original sample text"
See references/moods.md for more instruction examples.