voice-ai-integration
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFEPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- PROMPT_INJECTION (LOW): The skill is susceptible to indirect prompt injection. It transcribes user audio into text and appends it to the conversation history without any sanitization or protective delimiters.
- Ingestion points: The
process_voice_inputmethod inexamples/voice_assistant.pytakes anaudio_fileand converts it to text which is then used in the conversation pipeline. - Boundary markers: No markers or system instructions are used to separate transcribed user content from agent instructions in
VoiceAssistant.generate_response. - Capability inventory: The skill provides access to local audio hardware (microphone and speakers) via
pyaudioand interfaces with multiple cloud AI providers. - Sanitization: The transcription text is used raw as it comes from the STT provider.
- DATA_EXFILTRATION (LOW): The skill makes network requests to external API endpoints (AssemblyAI and Eleven Labs) to process audio data.
- Evidence:
requests.postcalls toapi.assemblyai.cominexamples/speech_recognition_providers.pyandapi.elevenlabs.ioinexamples/text_to_speech_providers.pytransmit audio data to third-party servers.
Audit Metadata