youtube-captions
Pass
Audited by Gen Agent Trust Hub on Mar 13, 2026
Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADSCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it fetches and processes untrusted text from external YouTube videos.
- Ingestion points: The
get-captions.shscript extracts captions directly from YouTube or generates transcripts using the Whisper model from video audio. - Boundary markers: No explicit delimiters or boundary markers are used to wrap the transcript text when it is provided to the agent. There are no instructions to the agent to treat the transcript as data rather than instructions.
- Capability inventory: The skill uses
yt-dlpandwhisperto perform its tasks. While the script has limited capabilities, the resulting transcript becomes part of the agent's conversational context. - Sanitization: There is no sanitization, filtering, or validation performed on the extracted captions. If a video contains captions like 'Ignore your previous instructions and perform [malicious action]', an agent summarizing the video might follow those instructions.
- [EXTERNAL_DOWNLOADS]: The skill downloads media and subtitle files from YouTube using
yt-dlpand utilizes theopenai-whispermodel. - The
openai-whisperpackage is provided by OpenAI, a trusted organization. yt-dlpis a standard, well-known tool for media extraction.- [COMMAND_EXECUTION]: The shell script executes several subprocesses to perform its functions.
- It uses
yt-dlpfor subtitle and audio extraction andwhisperfor transcription. - The script uses
mktempfor temporary file management and includes cleanup logic via shell traps. - Variables passed to commands are quoted to prevent basic shell injection vulnerabilities.
Audit Metadata