volc-audio-transcription
Pass
Audited by Gen Agent Trust Hub on Mar 25, 2026
Risk Level: SAFECOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The script
scripts/transcribe_local.pyuses thesubprocessmodule to programmatically launch a local HTTP server and thengroktunneling client to satisfy the API's requirement for a public audio URL. - Evidence: Call to
subprocess.Popenstartingpython3 -m http.serverinscripts/transcribe_local.py. - Evidence: Call to
subprocess.Popenstartingngrokinscripts/transcribe_local.py. - [DATA_EXFILTRATION]: When using the local transcription feature, the skill exposes the entire contents of the directory containing the target audio file to the public internet for the duration of the transcription process.
- Evidence:
VolcTranscriberLocal.start_servicesinscripts/transcribe_local.pysets the HTTP server's working directory to the directory of the provided audio file and opens a publicngroktunnel to that port. - [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection if an attacker-controlled audio file contains spoken instructions that the agent then processes as transcribed text.
- Ingestion points:
scripts/volc_transcriber.py(viaaudio_url) andscripts/transcribe_local.py(via local file path). - Boundary markers: Absent; the transcribed text is returned to the agent without delimiters or safety warnings.
- Capability inventory: The skill converts audio to text and provides the output to the agent's context.
- Sanitization: Absent; the raw transcription results are returned without filtering or validation.
Audit Metadata