realtime-audio-architecture

Warn

Audited by Snyk on Mar 12, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 0.90). The skill's SKILL.md describes a centralized audio server that accepts POST /v1/audio/speak requests from external clients (e.g., a Telegram bot and other thin HTTP callers). The server ingests arbitrary user-generated text and parameters (voice, lang, speed), which the agent then synthesizes and plays. Untrusted third-party content is therefore read at runtime and can change the agent's behavior.
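One way to narrow this exposure is to validate the request body before it reaches the synthesizer. The following is a minimal sketch, not the skill's actual implementation: the field names (text, voice, lang, speed) come from the finding above, while the function name, the allowlists, the length cap, and the speed range are hypothetical values chosen for illustration.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical limits and allowlists; the audit does not specify any of these.
MAX_TEXT_CHARS = 500
ALLOWED_VOICES = {"default"}
ALLOWED_LANGS = {"en", "de"}
MIN_SPEED, MAX_SPEED = 0.5, 2.0


@dataclass(frozen=True)
class SpeakRequest:
    text: str
    voice: str
    lang: str
    speed: float


def validate_speak_request(payload: dict) -> SpeakRequest:
    """Validate an untrusted JSON body for POST /v1/audio/speak.

    Raises ValueError if any field is missing, mistyped, or out of range.
    """
    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    # Drop every character in a Unicode "C" category (control, format, etc.).
    cleaned = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("C")
    ).strip()
    if not cleaned:
        raise ValueError("text is empty after cleaning")
    if len(cleaned) > MAX_TEXT_CHARS:
        raise ValueError("text is too long")

    voice = payload.get("voice", "default")
    if not isinstance(voice, str) or voice not in ALLOWED_VOICES:
        raise ValueError("unknown voice")

    lang = payload.get("lang", "en")
    if not isinstance(lang, str) or lang not in ALLOWED_LANGS:
        raise ValueError("unknown lang")

    speed = payload.get("speed", 1.0)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        raise ValueError("speed must be a number")
    # The chained comparison is False for NaN, so NaN is rejected too.
    if not (MIN_SPEED <= speed <= MAX_SPEED):
        raise ValueError("speed out of range")

    return SpeakRequest(text=cleaned, voice=voice, lang=lang, speed=float(speed))
```

Validation of this kind constrains the shape of the input but does not remove the injection risk on its own. The validated text should still be handled purely as data passed to the speech engine, never as instructions for the agent.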

Issues (1)

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 12, 2026, 01:06 AM
Issues: 1