voice-ai

  1. STT - Deepgram Nova-3 streaming transcription (~150ms)
  2. LLM - Groq llama-3.1-8b-instant for fastest inference (~220ms)
  3. TTS - Cartesia Sonic for ultra-realistic voice (~90ms)
  4. Telephony - Twilio Media Streams for real-time bidirectional audio
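
The per-stage figures above can be added up to get a rough budget for one conversational turn. The sketch below does only that arithmetic. It assumes the three stages run strictly one after another and ignores network and telephony overhead, which the list does not quantify. The names STAGE_LATENCY_MS and turn_latency_ms are illustrative and not part of the skill.

```python
# Approximate per-stage latencies in milliseconds, taken from the list above.
STAGE_LATENCY_MS = {
    "stt_deepgram_nova3": 150,
    "llm_groq_llama_3_1_8b_instant": 220,
    "tts_cartesia_sonic": 90,
}


def turn_latency_ms(stages: dict[str, int]) -> int:
    """Sum stage latencies, assuming the stages run sequentially."""
    return sum(stages.values())


if __name__ == "__main__":
    # Telephony overhead is not included because the source gives no figure for it.
    print(turn_latency_ms(STAGE_LATENCY_MS))  # 460
```

In a streaming pipeline the stages overlap, so the time a caller actually waits may be lower than this sequential sum.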

CRITICAL: NO OPENAI - never use "from openai import OpenAI" anywhere in the code.

Key deliverables:

  • Streaming STT with voice activity detection
  • Low-latency LLM responses optimized for voice
  • Expressive TTS with emotion controls
  • Twilio Media Streams WebSocket handler
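
As a rough illustration of the last deliverable, the sketch below shows the message handling at the core of a Twilio Media Streams WebSocket handler. It is not the skill's actual code. It relies on Twilio's documented message shape: JSON text frames with an "event" field, where "media" events carry base64-encoded audio (8 kHz mu-law) under media.payload. The function names and the example stream SID are hypothetical, and the WebSocket server itself is left out so the snippet runs with the standard library alone.

```python
import base64
import json
from typing import Optional


def decode_media_message(raw: str) -> Optional[bytes]:
    """Return the audio bytes of an inbound "media" event, or None otherwise."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        # "connected", "start", "mark" and "stop" events carry no audio.
        return None
    return base64.b64decode(msg["media"]["payload"])


def encode_media_message(stream_sid: str, audio: bytes) -> str:
    """Wrap outbound audio (e.g. TTS output) in a "media" event for Twilio."""
    payload = base64.b64encode(audio).decode("ascii")
    return json.dumps(
        {"event": "media", "streamSid": stream_sid, "media": {"payload": payload}}
    )


if __name__ == "__main__":
    # "MZ123" is a made-up stream SID used only for this round-trip demo.
    frame = encode_media_message("MZ123", b"\x7f\xff")
    print(decode_media_message(frame) == b"\x7f\xff")  # True
```

In a full handler, the bytes returned by decode_media_message would be forwarded to the streaming STT connection, and synthesized speech would be sent back through encode_media_message using the streamSid received in the "start" event.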

Installs: 92
GitHub Stars: 23
First Seen: Jan 22, 2026
voice-ai — scientiacapital/skills