NYC

text-to-speech

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: HIGH
Full Analysis
  • [Indirect Prompt Injection] (LOW): The skill's primary purpose is processing external text for synthesis, creating a natural ingestion point for untrusted data.
      • Ingestion points: SSMLProcessor.process(ssml) and SentenceProcessor.synthesize_natural(text) accept external strings.
      • Boundary markers: None present in the example code.
      • Capability inventory: The skill writes to the filesystem (_save_audio, AudioConverter.convert) and likely invokes ffmpeg via the pydub library.
      • Sanitization: The TTSContentFilter masks sensitive values (passwords, keys), but it does not sanitize embedded instructions that could influence the agent's behavior.
  • [Resource Exhaustion] (LOW): The RateLimitedTTS implementation is a proactive security measure to prevent denial-of-service (DoS) via synthesis abuse.
  • [Insecure Functionality] (LOW): The SecureAudioOutput class uses tempfile.mktemp(). This function is deprecated and insecure: it only returns a path name without creating the file, so a malicious actor could create a file or symlink at that path before the application does (a race condition). tempfile.mkstemp() or tempfile.NamedTemporaryFile should be used instead.
  • [False Positive] (INFO): The automated scanner flagged logger.info as a malicious URL. This is a false positive; the string is a standard Python logging method call and not a network resource.
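
As an illustration of the tempfile.mktemp() finding above, the following is a minimal sketch of writing audio to a temporary file with tempfile.mkstemp(). The function name save_audio_securely and its parameters are hypothetical and are not taken from the audited skill; only the standard-library calls are real.

```python
import os
import tempfile


def save_audio_securely(audio_bytes: bytes, suffix: str = ".wav") -> str:
    """Hypothetical helper: write audio bytes to a new temp file, return its path."""
    # mkstemp() creates and opens the file in one step with owner-only
    # permissions, so no other process can claim the path in between.
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Wrap the already-open descriptor instead of reopening by name.
        with os.fdopen(fd, "wb") as handle:
            handle.write(audio_bytes)
    except Exception:
        # Do not leave a partial file behind if the write fails.
        os.unlink(path)
        raise
    return path
```

The caller is responsible for deleting the returned file once the audio has been consumed; tempfile.NamedTemporaryFile is an alternative when automatic cleanup on close is preferred.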
Recommendations
  • Contains 1 malicious URL(s) - DO NOT USE
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 15, 2026, 10:40 PM