voice-output
Voice Output Skill
Convert text to speech and speak it aloud using system TTS or browser TTS.
Setup
No additional setup is required on macOS; on Linux, install a TTS engine first. The skill uses these system commands:
- macOS: say command (built-in)
- Linux: espeak or festival (install via apt install espeak)
- Browser: Web Speech API via Chrome DevTools Protocol
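To check which of the engines listed above is present on a machine, a small probe like the following may help. It is a minimal sketch, not the skill's own logic: the function name detect_tts_engine is hypothetical, and the lookup order (say, then espeak, then festival) is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first TTS engine found on PATH, or "none".
detect_tts_engine() {
  local engine
  for engine in say espeak festival; do
    if command -v "$engine" >/dev/null 2>&1; then
      printf '%s\n' "$engine"
      return 0
    fi
  done
  printf 'none\n'
  return 1
}

# Report the result; suggest the install command from this page if nothing is found.
engine="$(detect_tts_engine)" || echo "No TTS engine found; on Linux try: sudo apt install espeak" >&2
echo "Detected engine: $engine"
```

The function returns a non-zero status when nothing is found, so callers can fall back to browser TTS or skip speech entirely.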
Usage
Speak text via system TTS (say command)
{baseDir}/scripts/speak.sh "Hello, the job has completed successfully!"
Speak via browser TTS (requires Chrome with CDP)
{baseDir}/scripts/speak-browser.js "Hello from the browser!"
List available voices (browser)
{baseDir}/scripts/list-voices.js
Voice Options
System voices (macOS)
List available voices:
say -v "?"
Use a specific voice:
{baseDir}/scripts/speak.sh "Hello" --voice "Samantha"
Adjust speech rate:
{baseDir}/scripts/speak.sh "Hello" --rate 200
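The --voice and --rate options above presumably map onto flags of the underlying engine. The sketch below shows one way that mapping could look; it is an assumption about how such a wrapper might work, not the contents of speak.sh. The function name build_tts_command is hypothetical. The engine flags themselves are real: say takes -v for the voice and -r for the rate in words per minute, and espeak takes -v and -s respectively.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print the engine command line for the given options.
# Usage: build_tts_command ENGINE TEXT [--voice NAME] [--rate WPM]
build_tts_command() {
  local engine="$1" text="$2" voice="" rate="" rate_flag=""
  shift 2
  while [ "$#" -gt 0 ]; do
    # Every option takes a value, so require at least two remaining arguments.
    [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
    case "$1" in
      --voice) voice="$2"; shift 2 ;;
      --rate)  rate="$2";  shift 2 ;;
      *) echo "unknown option: $1" >&2; return 2 ;;
    esac
  done
  case "$engine" in
    say)    rate_flag="-r" ;;  # macOS say: -r sets words per minute
    espeak) rate_flag="-s" ;;  # espeak: -s sets words per minute
    *) echo "unsupported engine: $engine" >&2; return 2 ;;
  esac
  # Rebuild the positional parameters as the final command line.
  set -- "$engine"
  if [ -n "$voice" ]; then set -- "$@" -v "$voice"; fi
  if [ -n "$rate" ]; then set -- "$@" "$rate_flag" "$rate"; fi
  set -- "$@" "$text"
  # Dry run: print instead of executing (quoting is not preserved in the output).
  printf '%s\n' "$*"
}

build_tts_command say "Hello" --voice Samantha --rate 200
build_tts_command espeak "Hello" --rate 150
```

To actually speak instead of printing, the final printf line could be replaced with "$@", which runs the assembled command with its arguments intact.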
Browser voices
The browser TTS uses the Web Speech API, which exposes the voices available on the system. If no voice is specified, the browser default is used, which is usually the best available voice.
Trigger Patterns
Use this skill when:
- User asks to "say", "speak", or "read aloud" something, or mentions "text to speech"
- Job completes and you want to announce success/failure
- User wants notifications spoken rather than just displayed
- Accessibility - reading content aloud for visually impaired users
- Creating audio summaries or reports
Examples
Announce job completion
{baseDir}/scripts/speak.sh "Job complete! Processed 50 files, found 3 issues."
Read a summary
{baseDir}/scripts/speak.sh "Summary: Four files were modified, two tests added."
Browser TTS (when Chrome is available)
{baseDir}/scripts/speak-browser.js "Speaking directly through the browser!"
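For the job-completion case, the announcement can be tied to a command's exit status. The following is a minimal sketch under stated assumptions: run_and_announce and the SPEAK_CMD variable are hypothetical names, and SPEAK_CMD defaults to echo so the example runs anywhere. In real use it would point at the skill's speak.sh.

```shell
#!/usr/bin/env bash
# SPEAK_CMD is a hypothetical override; set it to the skill's speak.sh in real use.
SPEAK_CMD="${SPEAK_CMD:-echo}"

# Run a command, then announce whether it succeeded or failed.
# Usage: run_and_announce LABEL COMMAND [ARGS...]
run_and_announce() {
  local label="$1" rc=0
  shift
  "$@" || rc=$?
  if [ "$rc" -eq 0 ]; then
    "$SPEAK_CMD" "$label complete."
  else
    "$SPEAK_CMD" "$label failed with exit code $rc."
  fi
  # Pass the job's exit status through to the caller.
  return "$rc"
}

run_and_announce "Backup" true
run_and_announce "Lint" false || true
```

Returning the original exit status keeps the wrapper transparent, so it can be dropped into an existing pipeline without changing how failures propagate.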
Notes
- System TTS (say) works on macOS out of the box
- On Linux, install espeak: sudo apt install espeak
- Browser TTS requires Chrome running with remote debugging (see browser-tools skill)
- Both methods are synchronous - the command blocks until speech completes
- For non-blocking speech, add & at the end of the command
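The blocking and non-blocking behaviour described in the notes can be sketched as follows. This is an illustration only: speak_async, LAST_SPEECH_PID, and SPEAK_CMD are hypothetical names, and SPEAK_CMD defaults to echo as a stand-in for the skill's speak.sh.

```shell
#!/usr/bin/env bash
# SPEAK_CMD is a hypothetical stand-in for the skill's speak.sh.
SPEAK_CMD="${SPEAK_CMD:-echo}"

# Start speech in the background and remember its process ID.
speak_async() {
  "$SPEAK_CMD" "$@" &
  LAST_SPEECH_PID=$!
}

speak_async "Build started."

# ... other work can run here while the speech plays ...

# Optionally wait, so the script does not exit in the middle of a sentence.
wait "$LAST_SPEECH_PID"
```

Keeping the PID makes it possible to wait for the speech later, or to stop it early with kill if the message is no longer relevant.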