realtime-audio-architecture
Warn
Audited by Snyk on Mar 12, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.90). The skill's SKILL.md describes a centralized audio server that accepts POST /v1/audio/speak requests from external clients (e.g., a Telegram bot and other thin HTTP callers). The server ingests arbitrary user-generated text and parameters (voice/lang/speed), which the agent then synthesizes and plays. Untrusted third-party content is therefore read at runtime and can change the agent's behavior.
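
One way to reduce the exposure described in this finding is to validate each request body before it reaches the synthesizer, and to treat the text strictly as data to be spoken, never as instructions. The following is a minimal sketch in Python, not the skill's actual implementation. The field names mirror the parameters named in the finding (text, voice, lang, speed), but the function name, the allowlists, and the limits are all hypothetical.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical allowlists and limits; the audit does not state the real values.
ALLOWED_VOICES = {"default"}
ALLOWED_LANGS = {"en", "de"}
MAX_TEXT_CHARS = 500
MIN_SPEED, MAX_SPEED = 0.5, 2.0


@dataclass(frozen=True)
class SpeakRequest:
    text: str
    voice: str
    lang: str
    speed: float


def validate_speak_request(payload: dict) -> SpeakRequest:
    """Validate an untrusted request body; raise ValueError on any violation."""
    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    # Drop characters in Unicode "C" categories (control, format, unassigned).
    # Note that this also removes newlines and tabs.
    text = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("C")
    ).strip()
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError("text is empty or too long")

    voice = payload.get("voice", "default")
    if not isinstance(voice, str) or voice not in ALLOWED_VOICES:
        raise ValueError("unknown voice")

    lang = payload.get("lang", "en")
    if not isinstance(lang, str) or lang not in ALLOWED_LANGS:
        raise ValueError("unknown lang")

    speed = payload.get("speed", 1.0)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        raise ValueError("speed must be a number")
    # NaN fails this comparison and is rejected as well.
    if not (MIN_SPEED <= speed <= MAX_SPEED):
        raise ValueError("speed out of range")

    return SpeakRequest(text=text, voice=voice, lang=lang, speed=float(speed))
```

Validation of this kind bounds what a caller can send, but it does not by itself remove the prompt injection risk: that depends on the agent passing the validated text only to the speech synthesizer and never interpreting it as a command.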
Issues (1)
W011
MEDIUM: Third-party content exposure detected (indirect prompt injection risk).
Audit Metadata