transcribe
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- [Indirect Prompt Injection] (HIGH): The script `scripts/transcribe_diarize.py` accepts arbitrary file paths via the `audio` and `--known-speaker` arguments without path validation or sandboxing.
  - Ingestion points: The `audio` positional argument and the `--known-speaker` flag in `scripts/transcribe_diarize.py` take user-provided strings directly into file system operations.
  - Boundary markers: Absent. There are no delimiters or instructions to prevent the agent from being coerced into reading non-audio files.
  - Capability inventory: The script performs file reads (`Path.read_bytes`, `open("rb")`), network transmissions (OpenAI API call), and file writes (`Path.write_text`).
  - Sanitization: Absent. The script relies on `mimetypes.guess_type` and existence checks, which do not prevent reading sensitive text-based files if they are given an audio-like extension or if the API accepts the raw bytes.
- [Data Exfiltration] (HIGH): Sensitive local file content can be exfiltrated to the OpenAI API service if an attacker influences the agent's input parameters.
  - Evidence: The function `_encode_data_url` in `scripts/transcribe_diarize.py` reads the entire content of a file provided via the `--known-speaker` argument and base64 encodes it into the API request payload.
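The Sanitization and Evidence findings can be illustrated with a minimal sketch. The audit does not reproduce the body of `_encode_data_url`, so `encode_data_url` and `demo` below are hypothetical stand-ins that assume the pattern the findings describe: an existence check, `mimetypes.guess_type`, then a full read and base64 encode. The sketch shows that a text file carrying an audio-like extension would pass both checks and end up verbatim in the encoded payload.

```python
import base64
import mimetypes
import tempfile
from pathlib import Path


def encode_data_url(path: str) -> str:
    """Hypothetical stand-in for a helper like _encode_data_url (not the audited code)."""
    file_path = Path(path)
    if not file_path.exists():  # existence check only; no directory or content restriction
        raise FileNotFoundError(path)
    # guess_type inspects only the file name, never the file's bytes.
    mime, _ = mimetypes.guess_type(file_path.name)
    payload = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{payload}"


def demo() -> tuple[str, bytes]:
    """Write a text 'secret' under an audio extension, encode it, and decode it back."""
    with tempfile.TemporaryDirectory() as tmp:
        disguised = Path(tmp) / "notes.wav"
        disguised.write_text("API_KEY=example-secret")
        url = encode_data_url(str(disguised))
    header, _, b64 = url.partition(";base64,")
    return header, base64.b64decode(b64)
```

Calling `demo()` returns a data-URL header with an audio MIME type alongside the original text bytes, which is why an extension-based check alone does not count as sanitization.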
Recommendations
- AI detected serious security threats
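The report does not list a concrete remediation. One possible hardening, sketched here under assumptions, is to confine inputs to an allowed directory and to check the leading bytes of the file before it is read in full. The names `validate_audio_path`, `allowed_root`, and `demo`, as well as the short signature table, are hypothetical and not taken from the audited script; the table is deliberately incomplete (for example, MP3 files without an ID3 tag would be rejected).

```python
import tempfile
from pathlib import Path

# Hypothetical, incomplete table of leading bytes for common audio containers.
_SIGNATURES = (b"RIFF", b"fLaC", b"OggS", b"ID3")


def validate_audio_path(candidate: str, allowed_root: str) -> Path:
    """Return the resolved path if it lies under allowed_root and looks like audio."""
    root = Path(allowed_root).resolve()
    resolved = Path(candidate).resolve()  # collapses ".." and follows symlinks
    if not resolved.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"{candidate!r} is outside the allowed directory")
    if not resolved.is_file():
        raise ValueError(f"{candidate!r} is not a regular file")
    with resolved.open("rb") as handle:
        head = handle.read(12)
    if not head.startswith(_SIGNATURES):
        raise ValueError(f"{candidate!r} does not start with a known audio signature")
    # RIFF is a generic container; WAV files carry "WAVE" at byte offset 8.
    if head.startswith(b"RIFF") and head[8:12] != b"WAVE":
        raise ValueError(f"{candidate!r} is a RIFF file but not WAV")
    return resolved


def demo() -> dict[str, bool]:
    """Report which sample inputs the validator accepts."""
    results = {}
    with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as other:
        samples = {
            "real_wav": (Path(root) / "a.wav", b"RIFF\x00\x00\x00\x00WAVEfmt "),
            "text_as_wav": (Path(root) / "b.wav", b"API_KEY=example-secret"),
            "outside_root": (Path(other) / "c.wav", b"RIFF\x00\x00\x00\x00WAVEfmt "),
        }
        for name, (path, content) in samples.items():
            path.write_bytes(content)
            try:
                validate_audio_path(str(path), root)
                results[name] = True
            except ValueError:
                results[name] = False
    return results
```

A signature check of this kind only raises the bar: a file with a valid audio header can still embed unrelated data, so it complements directory confinement rather than replacing it.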
Audit Metadata