youtube-transcribe
YouTube Transcript Extraction
Extract subtitles and transcripts from YouTube videos.
Methods
| Method | Tool | When to Use |
|---|---|---|
| CLI | yt-dlp | Fast, reliable, preferred |
| Browser | Chrome automation | Fallback when CLI fails |
| API | youtube-transcript-api | Python programmatic access |
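For the API route, a minimal Python sketch might look like the following. It assumes a 1.x release of youtube-transcript-api, where `YouTubeTranscriptApi().fetch(video_id, languages=[...])` returns snippets with `start` and `text` attributes; older releases expose a different call. The helper names `video_id_from_url` and `fetch_transcript` are illustrative and not part of the library.

```python
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def video_id_from_url(url):
    """Return the video ID from a common YouTube URL shape, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path component.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
            return parts[1]
    return None


def fetch_transcript(video_id, languages=("en",)):
    """Fetch (start_seconds, text) pairs; needs network access.

    Assumption: youtube-transcript-api 1.x is installed.
    """
    # Imported lazily so the URL helper works without the package.
    from youtube_transcript_api import YouTubeTranscriptApi

    fetched = YouTubeTranscriptApi().fetch(video_id, languages=list(languages))
    return [(snippet.start, snippet.text) for snippet in fetched]
```

The `languages` list is ordered by preference, so `("en", "es")` would try English before Spanish.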
yt-dlp Method (Preferred)
Basic Command
yt-dlp --write-auto-sub --write-sub --sub-lang en --skip-download -o "%(title)s.%(ext)s" "VIDEO_URL"
Key Flags
| Flag | Purpose |
|---|---|
| --write-sub | Download manual subtitles |
| --write-auto-sub | Download auto-generated subtitles |
| --sub-lang LANG | Specify language (en, zh-Hans, etc.) |
| --skip-download | Don't download video |
| --cookies-from-browser chrome | Use browser cookies for restricted videos |
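When the command needs to be scripted, one option is to assemble it in Python and hand it to `subprocess`. The sketch below uses only the flags listed above; `build_command` and `download_subtitles` are hypothetical helper names, and it assumes `yt-dlp` is installed and on `PATH`.

```python
import subprocess


def build_command(url, langs=("en",), cookies_browser=None):
    """Build the yt-dlp argument list for a subtitles-only download."""
    cmd = [
        "yt-dlp",
        "--write-auto-sub",
        "--write-sub",
        "--sub-lang", ",".join(langs),  # e.g. "en,es,zh-Hans"
        "--skip-download",
        "-o", "%(title)s.%(ext)s",
    ]
    if cookies_browser:
        # For sign-in or age-restricted videos, reuse a logged-in browser profile.
        cmd += ["--cookies-from-browser", cookies_browser]
    cmd.append(url)
    return cmd


def download_subtitles(url, **kwargs):
    """Run yt-dlp; raises CalledProcessError if it exits with an error."""
    return subprocess.run(
        build_command(url, **kwargs), check=True, capture_output=True, text=True
    )
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.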
Common Issues
| Issue | Solution |
|---|---|
| Sign-in required | Add --cookies-from-browser chrome |
| No subtitles found | The video may have no captions; run with --list-subs to see which tracks exist |
| Age-restricted | Use cookies from logged-in browser |
Browser Automation Fallback
When CLI fails, use browser automation:
- Open video page: navigate to the YouTube URL
- Expand description: click the "...more" button
- Open transcript: click the "Show transcript" button
- Extract text: query the DOM for transcript segments
DOM Selectors
| Element | Selector |
|---|---|
| Transcript segments | ytd-transcript-segment-renderer |
| Timestamp | .segment-timestamp |
| Text | .segment-text |
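Once the transcript panel is open and the page HTML has been captured by whatever automation tool is in use, the segments can be pulled out with the standard library alone. This is a sketch that assumes the selectors in the table above still match YouTube's markup, which may change without notice; `TranscriptParser` and `parse_transcript_html` are illustrative names.

```python
from html.parser import HTMLParser

SEGMENT_TAG = "ytd-transcript-segment-renderer"
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}  # tags with no end tag


class TranscriptParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.segments = []    # finished {"timestamp": ..., "text": ...} dicts
        self._current = None  # segment being filled, or None outside one
        self._stack = []      # target field for each open tag in the segment

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == SEGMENT_TAG:
            self._current = {"timestamp": "", "text": ""}
            self._stack = [None]
            return
        if self._current is None:
            return
        classes = (dict(attrs).get("class") or "").split()
        if "segment-timestamp" in classes:
            field = "timestamp"
        elif "segment-text" in classes:
            field = "text"
        else:
            field = self._stack[-1]  # nested tags inherit the parent's field
        self._stack.append(field)

    def handle_endtag(self, tag):
        if self._current is None or tag in VOID_TAGS:
            return
        self._stack.pop()
        if not self._stack:  # the segment element itself just closed
            self.segments.append(
                {k: " ".join(v.split()) for k, v in self._current.items()}
            )
            self._current = None

    def handle_data(self, data):
        if self._current is not None and self._stack[-1]:
            self._current[self._stack[-1]] += data


def parse_transcript_html(page_html):
    parser = TranscriptParser()
    parser.feed(page_html)
    return parser.segments
```

Parsing saved HTML keeps the extraction step independent of the browser driver, so it can be tested against a captured page.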
Output Formats
| Format | Extension | Use Case |
|---|---|---|
| VTT | .vtt | Web standard, includes timing |
| SRT | .srt | Video editing, media players |
| TXT | .txt | Plain text, no timing |
Convert VTT to Plain Text
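# Strip headers, cue timings, numeric cue IDs, inline tags, and blank lines
sed -E '/^WEBVTT/d; /^(Kind|Language):/d; /-->/d; /^[0-9]+$/d; s/<[^>]*>//g; /^[[:space:]]*$/d' video.vtt > video.txt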
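Auto-generated captions often repeat the same line across consecutive cues, which a line-by-line `sed` filter leaves in place. A Python version of the same conversion, sketched below, also drops consecutive duplicates and decodes HTML entities. The function name `vtt_to_text` is illustrative, and the header prefixes it skips are an assumption based on typical YouTube VTT files.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")  # inline tags such as <c> or <00:00:01.000>
HEADER_PREFIXES = ("WEBVTT", "Kind:", "Language:", "NOTE")


def vtt_to_text(vtt):
    """Convert WebVTT content to plain text, one caption line per row."""
    out = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blanks, cue timing lines, and purely numeric cue identifiers.
        if not line or "-->" in line or line.isdigit():
            continue
        # Skip file headers (caption text starting with these words is also dropped).
        if line.startswith(HEADER_PREFIXES):
            continue
        text = html.unescape(TAG_RE.sub("", line)).strip()
        # Drop a line that exactly repeats the previous one.
        if text and (not out or out[-1] != text):
            out.append(text)
    return "\n".join(out)
```

A typical use would be `open("video.txt", "w").write(vtt_to_text(open("video.vtt").read()))`.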
Language Codes
| Language | Code |
|---|---|
| English | en |
| Chinese (Simplified) | zh-Hans |
| Chinese (Traditional) | zh-Hant |
| Spanish | es |
| Multiple | en,es,zh-Hans |
Best Practices
| Practice | Why |
|---|---|
| Try manual subs first | Higher quality than auto-generated |
| Use cookies for restricted | Avoids sign-in errors |
| Check multiple languages | Some videos have better subs in other languages |
| Verify transcript exists | Not all videos have captions |