youtube-content

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION] (HIGH): Indirect Prompt Injection via YouTube transcripts.
  • Ingestion points: scripts/fetch_youtube.py fetches the full text of YouTube transcripts and video descriptions, which are attacker-controlled sources.
  • Boundary markers: There are no boundary markers or delimiters used when passing the transcript content to the agent. The instructions in SKILL.md and references/analysis-modes.md simply tell the agent to 'apply requested analysis mode to transcript'.
  • Capability inventory: The skill has the capability to write files to the local filesystem via scripts/save_analysis.py, which targets ~/.claude/knowledge.
  • Sanitization: No sanitization or filtering is performed on the transcript text before it is processed by the agent.
  • Dangerous Instructions: In references/analysis-modes.md, the 'Custom Analysis' section explicitly tells the agent: 'apply their instructions directly to the transcript content'. This instructs the agent to treat data as instructions, a primary vector for jailbreaking and malicious redirection.
  • [COMMAND_EXECUTION] (LOW): The skill executes local Python scripts using uv run and subprocess.run (in test_fetch.py). While these are legitimate for the skill's operation, they provide the necessary plumbing for a successful prompt injection to achieve side effects on the host system.
  • [EXTERNAL_DOWNLOADS] (INFO): Uses yt-dlp and youtube-transcript-api to fetch data from YouTube. These are trusted libraries, but the data they retrieve is untrusted and provides the attack surface for injection.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 12:34 AM