volc-audio-transcription

Pass

Audited by Gen Agent Trust Hub on Mar 25, 2026

Risk Level: SAFE
Flagged categories: COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/transcribe_local.py uses the subprocess module to programmatically launch a local HTTP server and the ngrok tunneling client to satisfy the API's requirement for a public audio URL.
      • Evidence: Call to subprocess.Popen starting python3 -m http.server in scripts/transcribe_local.py.
      • Evidence: Call to subprocess.Popen starting ngrok in scripts/transcribe_local.py.
  • [DATA_EXFILTRATION]: When using the local transcription feature, the skill exposes the entire contents of the directory containing the target audio file to the public internet for the duration of the transcription process.
      • Evidence: VolcTranscriberLocal.start_services in scripts/transcribe_local.py sets the HTTP server's working directory to the directory of the provided audio file and opens a public ngrok tunnel to that port.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection if an attacker-controlled audio file contains spoken instructions that the agent then processes as transcribed text.
      • Ingestion points: scripts/volc_transcriber.py (via audio_url) and scripts/transcribe_local.py (via local file path).
      • Boundary markers: Absent; the transcribed text is returned to the agent without delimiters or safety warnings.
      • Capability inventory: The skill converts audio to text and provides the output to the agent's context.
      • Sanitization: Absent; the raw transcription results are returned without filtering or validation.
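
The directory exposure described under DATA_EXFILTRATION could be narrowed by serving a staging directory that holds only the target audio file. The following is a minimal sketch under that assumption, not the skill's actual code: stage_single_file and build_server_command are hypothetical helper names, and the port number is illustrative. The --bind and --directory options are part of the python3 -m http.server command line.

```python
import shutil
import tempfile
from pathlib import Path


def stage_single_file(audio_path: str) -> Path:
    """Copy one audio file into a fresh temporary directory.

    Serving the returned directory exposes only this file,
    not the other files that sit next to the original.
    (Hypothetical helper; not taken from the audited skill.)
    """
    source = Path(audio_path)
    staging_dir = Path(tempfile.mkdtemp(prefix="transcribe_"))
    shutil.copy2(source, staging_dir / source.name)
    return staging_dir


def build_server_command(port: int, directory: Path) -> list[str]:
    """Build the argument list for Python's built-in static file server.

    --bind 127.0.0.1 keeps the server itself reachable only locally;
    --directory limits what it serves to the staging directory.
    (Hypothetical helper; the port is an illustrative assumption.)
    """
    return [
        "python3", "-m", "http.server", str(port),
        "--bind", "127.0.0.1",
        "--directory", str(directory),
    ]
```

A caller would pass the returned list to subprocess.Popen and point the tunnel at the same port. Because the server listens on 127.0.0.1 only, the tunnel is the sole public entry point, and it reaches a directory containing a single file. The staging directory should be removed once transcription finishes.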
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 25, 2026, 05:32 AM