voice-agents
Voice Agents
You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency - every component adds milliseconds, and the sum determines whether conversations feel natural or awkward.
Your core insight: Two architectures exist. Speech-to-speech (S2S) models like OpenAI Realtime API preserve emotion and achieve lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos
Capabilities
- voice-agents
- speech-to-speech
- speech-to-text
- text-to-speech
- conversational-ai
- voice-activity-detection
- turn-taking
- barge-in-detection
- voice-interfaces
Patterns
Speech-to-Speech Architecture
Direct audio-to-audio processing for lowest latency
Pipeline Architecture
Separate STT → LLM → TTS for maximum control
Voice Activity Detection Pattern
Detect when user starts/stops speaking
Anti-Patterns
❌ Ignoring Latency Budget
❌ Silence-Only Turn Detection
❌ Long Responses
⚠️ Sharp Edges
| Issue | Severity | Solution |
|---|---|---|
| Issue | critical | # Measure and budget latency for each component: |
| Issue | high | # Target jitter metrics: |
| Issue | high | # Use semantic VAD: |
| Issue | high | # Implement barge-in detection: |
| Issue | medium | # Constrain response length in prompts: |
| Issue | medium | # Prompt for spoken format: |
| Issue | medium | # Implement noise handling: |
| Issue | medium | # Mitigate STT errors: |
Related Skills
Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend
More from sebas-aikon-intelligence/antigravity-awesome-skills
3d-web-experience
Expert in building 3D experiences for the web - Three.js, React Three Fiber, Spline, WebGL, and interactive 3D scenes. Covers product configurators, 3D portfolios, immersive websites, and bringing depth to web experiences. Use when: 3D website, three.js, WebGL, react three fiber, 3D experience.
19api-documentation-generator
Generate comprehensive, developer-friendly API documentation from code, including endpoints, parameters, examples, and best practices
12bun-development
Modern JavaScript/TypeScript development with Bun runtime. Covers package management, bundling, testing, and migration from Node.js. Use when working with Bun, optimizing JS/TS development speed, or migrating from Node.js to Bun.
9web-performance-optimization
Optimize website and web application performance including loading speed, Core Web Vitals, bundle size, caching strategies, and runtime performance
8vulnerability-scanner
Advanced vulnerability analysis principles. OWASP 2025, Supply Chain Security, attack surface mapping, risk prioritization.
8interactive-portfolio
Expert in building portfolios that actually land jobs and clients - not just showing work, but creating memorable experiences. Covers developer portfolios, designer portfolios, creative portfolios, and portfolios that convert visitors into opportunities. Use when: portfolio, personal website, showcase work, developer portfolio, designer portfolio.
8