youtube-captions
Extract timestamped captions from YouTube videos. Output is VTT format (timestamps preserved).
Prerequisites
- yt-dlp: REQUIRED (brew install yt-dlp)
- openai-whisper: REQUIRED only if the video has no subtitles (pip install openai-whisper)
  - First run downloads the small model (~500MB)
  - Transcription is significantly slower than subtitle download
Usage
bash scripts/get-captions.sh <youtube-url> [language]
- youtube-url: any valid YouTube video URL
- language: subtitle language code (default: en)
Output goes to stdout. Status messages go to stderr.
Fallback Chain
- Manual subtitles: human-uploaded captions (fastest, most accurate)
- Auto-generated subtitles: YouTube's speech recognition
- Whisper transcription: downloads audio, transcribes locally with whisper --model small
The script tries each step in order and exits on the first success.
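The chain above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual contents of scripts/get-captions.sh: the function names `emit_vtt` and `fetch_captions`, the temp-file names, and the status messages are all assumptions made for illustration. It assumes yt-dlp and whisper are on PATH and that ffmpeg is installed for the audio step.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the fallback chain; the real get-captions.sh may differ.

# Print the first non-empty .vtt file found in directory $1; fail if none exists.
emit_vtt() {
  for f in "$1"/*.vtt; do
    if [ -s "$f" ]; then
      cat "$f"
      return 0
    fi
  done
  return 1
}

# The body runs in a subshell, so the EXIT trap removes the temp dir on every exit path.
fetch_captions() (
  url="$1"
  lang="${2:-en}"
  tmp="$(mktemp -d)" || exit 1
  trap 'rm -rf "$tmp"' EXIT

  # Step 1: manual (human-uploaded) subtitles.
  echo "Trying manual subtitles..." >&2
  yt-dlp --skip-download --write-subs --sub-langs "$lang" --sub-format vtt \
    -o "$tmp/manual.%(ext)s" "$url" >/dev/null 2>&1 || true
  emit_vtt "$tmp" && exit 0

  # Step 2: YouTube's auto-generated subtitles.
  echo "Trying auto-generated subtitles..." >&2
  yt-dlp --skip-download --write-auto-subs --sub-langs "$lang" --sub-format vtt \
    -o "$tmp/auto.%(ext)s" "$url" >/dev/null 2>&1 || true
  emit_vtt "$tmp" && exit 0

  # Step 3: download the audio and transcribe it locally with Whisper.
  echo "Falling back to Whisper transcription..." >&2
  { yt-dlp -x --audio-format mp3 -o "$tmp/audio.%(ext)s" "$url" >/dev/null 2>&1 &&
      whisper "$tmp/audio.mp3" --model small --language "$lang" \
        --output_format vtt --output_dir "$tmp" >/dev/null 2>&1; } || true
  emit_vtt "$tmp" && exit 0

  echo "No captions could be produced." >&2
  exit 1
)
```

With these definitions loaded, `fetch_captions "<youtube-url>" es` would print Spanish captions to stdout and exit on the first step that yields a VTT file. Checking for an output file after each step, rather than trusting exit codes, is a design choice of this sketch: yt-dlp can exit successfully even when no subtitles exist for the requested language.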
Example
# Get English captions
bash scripts/get-captions.sh "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
# Get Spanish captions
bash scripts/get-captions.sh "https://www.youtube.com/watch?v=dQw4w9WgXcQ" es
Output Format
VTT (Web Video Text Tracks) with timestamps:
WEBVTT

00:00:01.000 --> 00:00:04.000
First line of dialogue

00:00:04.000 --> 00:00:08.000
Second line of dialogue
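If only the spoken text is needed, the VTT output can be filtered down to plain lines. The helper below is a minimal sketch and not part of the skill; the name `vtt_to_text` is hypothetical. It assumes simple cues like the ones shown above and does not handle cue identifiers, NOTE blocks, or inline styling tags.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip the WEBVTT header, cue timings, and blank lines from stdin.
vtt_to_text() {
  awk '
    NR == 1 && /^WEBVTT/ { next }   # file header
    /-->/                { next }   # cue timing lines
    /^[[:space:]]*$/     { next }   # blank separators
    { print }                       # remaining lines are caption text
  '
}
```

Because captions go to stdout and status messages go to stderr, the script's output can be piped straight into the helper, for example `bash scripts/get-captions.sh "<youtube-url>" 2>/dev/null | vtt_to_text`.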
Notes
- All temporary files (audio, intermediate subtitle files) are cleaned up automatically
- If neither yt-dlp subtitles nor Whisper are available, the script exits with an error and clear instructions
- Long videos with no subtitles take a while to transcribe: Whisper runs at roughly 1x realtime on CPU, so expect transcription to take about as long as the video itself