text-to-speech
Pass
Audited by Gen Agent Trust Hub on Mar 6, 2026
Risk Level: SAFE | COMMAND_EXECUTION | EXTERNAL_DOWNLOADS | PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The `references/streaming.md` file contains a Python code example that utilizes `subprocess.Popen` to invoke the `ffplay` utility. This is a legitimate use case for real-time audio playback. The command is executed using a list of arguments (`['ffplay', '-nodisp', '-autoexit', '-']`), which is a secure practice that prevents shell injection.
- [EXTERNAL_DOWNLOADS]: The skill directs users to download and install official packages from trusted sources, specifically the `elevenlabs` Python package and the `@elevenlabs/elevenlabs-js` Node.js package. It also establishes connections to well-known ElevenLabs service domains (`api.elevenlabs.io`) for API requests and WebSocket streaming.
- [PROMPT_INJECTION]: An indirect prompt injection surface (Category 8) exists because the skill processes arbitrary text input for speech synthesis.
  - Ingestion points: The `text` parameter in `SKILL.md` and `references/streaming.md` examples.
  - Boundary markers: No delimiters or 'ignore' instructions are used to separate user-provided text from agent instructions.
  - Capability inventory: Network requests to ElevenLabs API, file system writes (MP3 output), and subprocess execution (ffplay).
  - Sanitization: No input sanitization or validation is applied to the text content before it is sent to the synthesis API.

  Despite these factors, the risk is considered low as this behavior is inherent to the functional purpose of a text-to-speech tool.
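The argument-list pattern noted under COMMAND_EXECUTION can be sketched as follows. This is a minimal illustration and not the code from `references/streaming.md`: the helper name `play_chunks` and its `cmd` parameter are hypothetical, and only the `ffplay` argument list is taken from the audit. Because `subprocess.Popen` receives a list and `shell` keeps its default of `False`, no shell parses the arguments, so shell metacharacters in the audio data or arguments are never interpreted.

```python
import subprocess

# Argument list quoted in the audit: no window, exit at end of input, read from stdin ("-").
FFPLAY_CMD = ["ffplay", "-nodisp", "-autoexit", "-"]


def play_chunks(chunks, cmd=FFPLAY_CMD):
    """Pipe audio byte chunks to a player process (hypothetical helper).

    `cmd` is a list, and shell=False is the Popen default, so the
    program is started directly without any shell interpretation.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    for chunk in chunks:
        proc.stdin.write(chunk)  # stream each chunk as it arrives
    proc.stdin.close()           # signal end of input so the player can exit
    return proc.wait()           # return the player's exit code
```

Calling `play_chunks(audio_iterator)` requires `ffplay` to be installed and on `PATH`; otherwise `Popen` raises `FileNotFoundError`.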
Audit Metadata