voice-ai-development
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTION
Full Analysis
- Indirect Prompt Injection (HIGH): The skill is specifically designed to ingest untrusted external data (user voice audio and transcripts) and pass it to LLMs that possess side-effect capabilities (tool/function calling).
- Ingestion points: Data enters through
wss://api.openai.com/v1/realtime(OpenAI),request.jsonin the/vapi/webhookroute (Vapi), andaudio_streamparameters (Deepgram/ElevenLabs). - Boundary markers: The provided code snippets do not include system prompt delimiters or 'ignore embedded instructions' warnings for the audio/text being processed.
- Capability inventory: The templates demonstrate tool-calling capabilities such as
get_weather(OpenAI Realtime) andcheck_order(Vapi webhook). Successful injection via user speech could lead to unauthorized tool execution. - Sanitization: No sanitization or validation logic is shown for the transcripts or parameters extracted from the voice interactions before they are used in business logic or tool calls.
- Data Exposure & Exfiltration (LOW): The code snippets include placeholders for API keys (
sk-...,api_key="..."). While these are not actual credentials, users must ensure they manage their secrets via environment variables rather than hardcoding them as shown in the examples.
Recommendations
- AI detected serious security threats
Audit Metadata