The Agent Skills Directory

[PROMPT_INJECTION]: The skill implements strong defensive instructions to prevent indirect prompt injection. It explicitly warns the agent to treat transcription output as untrusted external data, to wrap excerpts in clearly labeled quote blocks, and to never follow any commands or directives found within the transcribed text.
[EXTERNAL_DOWNLOADS]: The skill downloads and installs the 'yt-dlp' package from PyPI to facilitate YouTube audio extraction. It also performs a version check by fetching a JSON file from the vendor's GitHub repository. Both sources are well-known and the actions are necessary for the skill's stated functionality.
[DATA_EXFILTRATION]: Audio and video data, along with the user-provided API key, are transmitted to the vendor's domain (agents.text-ops-subs.com) for processing. This is consistent with the skill's primary purpose and uses the official vendor infrastructure.
[COMMAND_EXECUTION]: Subprocess calls are used to execute helper scripts and external utilities like 'ffprobe' for duration estimation and 'yt-dlp' for downloading audio. These calls use static paths or well-defined arguments related to the skill's operation.
[INDIRECT_PROMPT_INJECTION]:
Ingestion points: Transcribed text from external audio sources is saved to local files and may be displayed to the user.
Boundary markers: The skill mandates the use of delimiters and labels like '[מתוך התמלול]: "..."' to separate untrusted content from agent dialogue.
Capability inventory: The skill has file system access (read/write results) and shell execution capabilities (via subprocess for tools like yt-dlp).
Sanitization: The skill contains explicit instructions for the agent to never interpret or act upon instructions found within the transcribed data.

transcription-speech-to-text-hebrew