YouTube Transcript Skill
Production-grade YouTube transcript extraction with comprehensive format support, intelligent caching, and resilient networking.
When to Use
✅ USE this skill when:
- Extracting transcripts from YouTube videos
- Converting YouTube captions to SRT/VTT subtitle files
- Analyzing video content via transcripts
- Creating subtitles for downloaded videos
- Batch processing multiple video transcripts
- Needing transcripts in specific languages
- Processing auto-generated captions
❌ DON'T use this skill when:
- Transcript not available (disabled by creator)
- Video is private or age-restricted
- Livestream that hasn't ended
- Need speech-to-text from audio → Use transcribe
- Need video frames → Use video-frames
Prerequisites
# Requires Node.js (already available)
node --version
# No additional dependencies required
Commands
Basic Usage
# Extract transcript with video ID
{baseDir}/youtube-transcript.js VIDEO_ID
# Extract with full URL
{baseDir}/youtube-transcript.js "https://www.youtube.com/watch?v=VIDEO_ID"
# Extract with short URL
{baseDir}/youtube-transcript.js "https://youtu.be/VIDEO_ID"
Output Formats
# Plain text with timestamps (default)
{baseDir}/youtube-transcript.js VIDEO_ID --format text
[0:00:00.00] Here is the transcript text
[0:00:05.32] More transcript content
# Plain text without timestamps
{baseDir}/youtube-transcript.js VIDEO_ID --format plain
Here is the transcript text More transcript content
# JSON with metadata
{baseDir}/youtube-transcript.js VIDEO_ID --format json
{
"title": "Video Title",
"author": "Channel Name",
"language": "en",
"isAutoGenerated": false,
"transcript": [...]
}
# SRT subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --format srt > video.srt
1
00:00:00,000 --> 00:00:05,320
Here is the transcript text
2
00:00:05,320 --> 00:00:08,150
More transcript content
# VTT subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --format vtt > video.vtt
WEBVTT
1
00:00.000 --> 00:05.320
Here is the transcript text
# TSV tab-separated values
{baseDir}/youtube-transcript.js VIDEO_ID --format tsv
start\tduration\ttext
0.000\t5.320\tHere is the transcript text
# CSV comma-separated values
{baseDir}/youtube-transcript.js VIDEO_ID --format csv
start,duration,text
0.000,5.320,"Here is the transcript text"
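Because the TSV output carries a header row followed by numeric start and duration columns, it can be post-processed with standard tools. The following is a minimal sketch, assuming the column layout shown above; the helper name `total_duration` is hypothetical and not part of the skill. Captions can overlap, so the sum is only a rough indication of spoken time.

```shell
# Hypothetical helper: sum the "duration" column of TSV transcript output
# read from standard input. Assumes the header row
# "start<TAB>duration<TAB>text" shown above and skips it.
total_duration() {
  awk -F'\t' 'NR > 1 { sum += $2 } END { printf "%.3f\n", sum }'
}

# Usage, with the documented CLI call:
#   {baseDir}/youtube-transcript.js VIDEO_ID --format tsv | total_duration
```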
Language Selection
# Auto-select best available (default)
{baseDir}/youtube-transcript.js VIDEO_ID
# Specific language by code
{baseDir}/youtube-transcript.js VIDEO_ID --language en
{baseDir}/youtube-transcript.js VIDEO_ID --language es
{baseDir}/youtube-transcript.js VIDEO_ID --language fr
# Partial matches work too
{baseDir}/youtube-transcript.js VIDEO_ID --language zh # Matches zh-CN, zh-TW, etc.
# Specific language combined with a subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --language ja --format srt
Common Language Codes:
| Code | Language |
|---|---|
| en | English |
| es | Spanish |
| fr | French |
| de | German |
| ja | Japanese |
| ko | Korean |
| zh | Chinese |
| pt | Portuguese |
| ru | Russian |
| hi | Hindi |
| ar | Arabic |
| it | Italian |
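Partial matching, where `zh` matches `zh-CN` and `zh-TW`, can be pictured as a prefix comparison on the language code. The sketch below is one plausible reading of that behavior, not the script's actual logic; the function name `lang_matches` is hypothetical.

```shell
# Hypothetical helper: succeed if a caption track's language code equals the
# requested code, or starts with it followed by "-" (zh matches zh-CN).
# How the real script matches languages is an assumption here.
lang_matches() {
  requested=$1
  track=$2
  case "$track" in
    "$requested" | "$requested"-*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   lang_matches zh zh-TW && echo "match"
```

Requiring the `-` separator keeps `zh` from matching an unrelated code that merely begins with the same letters.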
Save to File
# Save transcript directly to file
{baseDir}/youtube-transcript.js VIDEO_ID --output transcript.txt
{baseDir}/youtube-transcript.js VIDEO_ID --format srt --output subtitles.srt
{baseDir}/youtube-transcript.js VIDEO_ID --format json --output data.json
# Shell redirection (equivalent)
{baseDir}/youtube-transcript.js VIDEO_ID --format vtt > captions.vtt
Advanced Options
# Skip cache (force fresh fetch)
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
# Verbose debugging output
DEBUG=1 {baseDir}/youtube-transcript.js VIDEO_ID
# Combine options
{baseDir}/youtube-transcript.js VIDEO_ID --language es --format srt --output spanish.srt --no-cache
Features
Format Comparison
| Format | Use Case | Human Readable | Machine Readable |
|---|---|---|---|
| text | Default viewing | ✅ | ⚠️ |
| plain | Content only | ✅ | ⚠️ |
| json | API integration | ⚠️ | ✅ |
| srt | Subtitle files | ✅ | ✅ |
| vtt | Web captions | ✅ | ✅ |
| tsv | Spreadsheet import | ⚠️ | ✅ |
| csv | Database import | ⚠️ | ✅ |
Supported Video URL Formats
# Plain video ID (11 characters)
EBw7gsDPAYQ
# Standard YouTube URL
https://www.youtube.com/watch?v=EBw7gsDPAYQ
# Short youtu.be URL
https://youtu.be/EBw7gsDPAYQ
# Embed URL
https://www.youtube.com/embed/EBw7gsDPAYQ
# YouTube Live URL
https://www.youtube.com/live/EBw7gsDPAYQ
# URLs with additional parameters (automatically handled)
https://www.youtube.com/watch?v=EBw7gsDPAYQ&t=120s
https://www.youtube.com/watch?v=EBw7gsDPAYQ&index=2
# Playlist URLs (extracts first video)
https://www.youtube.com/watch?v=EBw7gsDPAYQ&list=...
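To pre-validate input before calling the script, the URL shapes listed above can be reduced to an 11-character ID in plain shell. This is a sketch under stated assumptions: the function name `extract_video_id` is hypothetical, it covers only the formats shown here, and the script's own parser may accept more.

```shell
# Hypothetical helper: print the 11-character video ID found in a bare ID or
# in one of the URL shapes listed above; return 1 if none is found.
extract_video_id() {
  input=$1
  case "$input" in
    *youtu.be/*)          id=${input##*youtu.be/} ;;
    *youtube.com/embed/*) id=${input##*/embed/} ;;
    *youtube.com/live/*)  id=${input##*/live/} ;;
    *"?v="*)              id=${input#*"?v="} ;;
    *"&v="*)              id=${input#*"&v="} ;;
    *)                    id=$input ;;
  esac
  # Drop any trailing query string, extra parameters, path, or fragment.
  id=${id%%"?"*}
  id=${id%%"&"*}
  id=${id%%/*}
  id=${id%%"#"*}
  # A video ID is 11 characters drawn from letters, digits, "-" and "_".
  case "$id" in
    *[!A-Za-z0-9_-]*) return 1 ;;
  esac
  [ "${#id}" -eq 11 ] || return 1
  printf '%s\n' "$id"
}

# Usage:
#   extract_video_id "https://www.youtube.com/watch?v=EBw7gsDPAYQ&t=120s"
```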
Intelligent Caching
The skill implements intelligent caching to improve performance:
- Cache Location: /tmp/youtube-transcript-cache/
- TTL: 24 hours per entry
- Max Entries: 100 videos
- Benefits:
  - Instant retrieval of previously fetched transcripts
  - Reduced load on YouTube servers
  - Better performance for repeated operations
Cache Bypass:
# Force fresh fetch
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
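The 24-hour TTL can be approximated from the shell by looking at file modification times. The sketch below assumes that cache entries are ordinary files whose modification time marks when they were stored; that layout is a guess about the script's internals, and the function name `cache_entry_is_fresh` is hypothetical.

```shell
# Hypothetical helper: succeed if FILE exists and was modified within the
# last TTL_MINUTES (default 1440, i.e. 24 hours). That cache entries are
# plain files stamped at store time is an assumption, not documented behavior.
cache_entry_is_fresh() {
  file=$1
  ttl_minutes=${2:-1440}
  [ -f "$file" ] || return 1
  # find prints the path only if it was modified less than ttl_minutes ago.
  [ -n "$(find "$file" -mmin "-$ttl_minutes" 2>/dev/null)" ]
}

# Usage (the entry name is a placeholder):
#   cache_entry_is_fresh /tmp/youtube-transcript-cache/SOME_ENTRY || echo "stale or missing"
```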
Rate Limiting
To avoid being blocked by YouTube:
- Max 60 requests per minute
- Minimum 1 second delay between requests
- Increasing backoff delay on retries
Retry Logic
When requests fail:
- First attempt
- Wait 2 seconds, retry
- Wait 4 seconds, retry
- Wait 6 seconds, retry
- Final error reported
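The same schedule (one attempt, then retries after 2, 4, and 6 seconds) can be reproduced around any command from the calling shell. This is an illustrative sketch rather than the script's internal implementation; the wrapper name `retry_with_delays` is hypothetical.

```shell
# Hypothetical wrapper: run a command, retrying after 2, 4, and 6 seconds.
# Makes at most four attempts and returns the status of the last one.
retry_with_delays() {
  for delay in 2 4 6; do
    "$@" && return 0
    echo "Attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
  done
  "$@"   # fourth and final attempt; its exit status is returned
}

# Usage:
#   retry_with_delays {baseDir}/youtube-transcript.js VIDEO_ID --format srt --output video.srt
```

Since the script already retries on its own, an outer wrapper like this mainly makes sense for other steps in a pipeline, or with `--no-cache` runs where a fresh fetch is required.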
Error Handling
Error Codes
| Code | Name | Description | Resolution |
|---|---|---|---|
| 0 | SUCCESS | Transcript fetched | None needed |
| 1 | INVALID_VIDEO_ID | Bad URL/ID format | Double-check the video ID |
| 2 | VIDEO_NOT_FOUND | Video doesn't exist | Verify video exists |
| 3 | TRANSCRIPT_DISABLED | Creator disabled captions | Contact creator |
| 4 | NO_TRANSCRIPT | No captions available | Wait for transcript |
| 5 | VIDEO_UNAVAILABLE | Can't access | Check restrictions |
| 6 | PRIVATE_VIDEO | Video is private | Get access/permission |
| 7 | RATE_LIMITED | Too many requests | Wait before retry |
| 8 | NETWORK_ERROR | Connection issue | Check internet |
| 9 | PARSE_ERROR | Data extraction failed | Try again |
| 99 | UNKNOWN | Unexpected error | Report issue |
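In batch scripts it helps to translate the numeric code into its name and to decide whether a retry is worthwhile. The sketch below assumes the codes in the table are reported as the process exit status; both function names are hypothetical, and treating codes 7 to 9 as retryable is an illustrative policy, not documented behavior.

```shell
# Hypothetical helper: map an exit status to the name used in the table above.
# Assumes the script reports these codes as its process exit status.
describe_exit_code() {
  case "$1" in
    0)  echo "SUCCESS" ;;
    1)  echo "INVALID_VIDEO_ID" ;;
    2)  echo "VIDEO_NOT_FOUND" ;;
    3)  echo "TRANSCRIPT_DISABLED" ;;
    4)  echo "NO_TRANSCRIPT" ;;
    5)  echo "VIDEO_UNAVAILABLE" ;;
    6)  echo "PRIVATE_VIDEO" ;;
    7)  echo "RATE_LIMITED" ;;
    8)  echo "NETWORK_ERROR" ;;
    9)  echo "PARSE_ERROR" ;;
    99) echo "UNKNOWN" ;;
    *)  echo "UNRECOGNIZED" ;;
  esac
}

# Hypothetical policy: only transient failures (rate limit, network, parse)
# are treated as worth retrying.
is_retryable() {
  case "$1" in
    7 | 8 | 9) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   {baseDir}/youtube-transcript.js VIDEO_ID > out.txt
#   code=$?
#   echo "Result: $(describe_exit_code "$code")"
```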
Common Errors and Solutions
"Could not extract player data"
- YouTube may have changed their page structure
- The video may be age-restricted
- The video may require login
- Solution: Try again later or check if video is publicly accessible
"No captions available for this video"
- Creator hasn't added captions
- Auto-generated captions aren't ready (may take a few hours after upload)
- Video is too new
- Solution: Wait for YouTube to generate captions, or check if manual captions exist
"Rate limited by YouTube"
- Too many requests in short period
- Solution: Wait 1-2 minutes before retrying
"Transcript too long"
- Transcript exceeds 500K characters
- Solution: Use --format json, which handles large transcripts better
"Video unavailable or not found"
- Video removed or never existed
- Region-restricted
- Solution: Verify video ID/URL is correct
Technical Architecture
Data Flow
Video ID/URL
↓
Extract Video ID ← URL parser (7+ formats)
↓
Check Cache ← 24hr TTL store
↓[cache miss]
Fetch YouTube Page ← HTTP with retry logic
↓
Extract Player Data ← ytInitialPlayerResponse
↓
Parse Caption Tracks ← Language selection
↓
Fetch Transcript ← Select appropriate URL
↓
Parse Entries ← XML/JSON parsing
↓
Format Output ← 7 output formats
↓
Cache & Return ← Store for 24hr
Player Data Extraction
Extracts from multiple potential sources:
- ytInitialPlayerResponse JavaScript variable
- playerResponse JSON in script tags
- Caption tracks from various locations
Transcript Parsing
Supports multiple formats:
- JSON API Response: Modern format
- Timed Text XML: Legacy format
- Alternative XML: Older structure
- Special handling for: Auto-generated vs manual captions
Data Unescaping
Properly handles:
- &amp;amp; → &
- &amp;lt; → <
- &amp;gt; → >
- &amp;quot; → "
- &amp;#39; and equivalent apostrophe entities → '
- Whitespace normalization
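Entity decoding of this kind can be sketched with sed. The example below handles the five most common entities (ampersand, less-than, greater-than, double quote, and the numeric apostrophe) and collapses runs of whitespace; the function name `unescape_entities` is hypothetical, and the script's own decoder may cover more cases.

```shell
# Hypothetical helper: decode common HTML entities in caption text read from
# standard input, then collapse runs of whitespace to a single space.
# "&amp;" is decoded last so that "&amp;lt;" becomes the literal text "&lt;"
# instead of being decoded twice into "<".
unescape_entities() {
  sed -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&quot;/"/g' \
      -e "s/&#39;/'/g" \
      -e 's/&amp;/\&/g' \
      -e 's/[[:space:]][[:space:]]*/ /g'
}

# Usage:
#   printf '%s\n' 'Tom &amp; Jerry' | unescape_entities
```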
Sample Output
JSON Format (Full)
{
"title": "How Artificial Intelligence Works",
"author": "Example Channel",
"duration": "PT10M32S",
"language": "en",
"isAutoGenerated": true,
"transcript": [
{
"start": 0.000,
"duration": 5.320,
"text": "In this video, we'll explore how AI systems learn and adapt"
},
{
"start": 5.320,
"duration": 4.180,
"text": "to perform tasks that traditionally required human intelligence"
}
],
"word_count": 2847,
"total_entries": 156
}
SRT Format (SubRip)
1
00:00:00,000 --> 00:00:05,320
In this video, we'll explore how AI systems
learn and adapt
2
00:00:05,320 --> 00:00:09,500
to perform tasks that traditionally
required human intelligence
3
00:00:09,500 --> 00:00:13,240
This process is called
machine learning
...
VTT Format (WebVTT)
WEBVTT
1
00:00.000 --> 00:05.320
In this video, we'll explore how AI systems
learn and adapt
2
00:05.320 --> 00:09.500
to perform tasks that traditionally
required human intelligence
...
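The SRT and VTT samples differ mainly in the millisecond separator: a comma for SRT and a period for WebVTT. A minimal sketch of converting a start time in seconds into either style follows; the function name `format_timestamp` is hypothetical. It always emits the hours field, which WebVTT permits even though the sample above omits it.

```shell
# Hypothetical helper: format a time in seconds as an SRT or VTT timestamp.
# Usage: format_timestamp SECONDS [srt|vtt]   (default style: srt)
format_timestamp() {
  awk -v t="$1" -v style="${2:-srt}" 'BEGIN {
    ms = int(t * 1000 + 0.5)            # round to whole milliseconds
    h = int(ms / 3600000); ms -= h * 3600000
    m = int(ms / 60000);   ms -= m * 60000
    s = int(ms / 1000);    ms -= s * 1000
    sep = (style == "vtt") ? "." : ","  # SRT uses a comma, WebVTT a period
    printf "%02d:%02d:%02d%s%03d\n", h, m, s, sep, ms
  }'
}

# Usage:
#   format_timestamp 5.32        # SRT style
#   format_timestamp 5.32 vtt    # WebVTT style
```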
Examples
Download Transcripts for Playlist
#!/bin/bash
# Process multiple videos from a file of IDs (one ID per line)
mkdir -p transcripts
while IFS= read -r video_id; do
  # Skip blank lines
  [ -z "$video_id" ] && continue
  echo "Processing: $video_id"
  if {baseDir}/youtube-transcript.js "$video_id" --format srt --output "transcripts/${video_id}.srt" 2>/dev/null; then
    echo " ✓ Success"
  else
    echo " ✗ Failed"
  fi
  # Sleep to respect rate limits
  sleep 2
done < video_ids.txt
Convert to PDF for Reading
#!/bin/bash
VIDEO_ID="EBw7gsDPAYQ"
# Get transcript
{baseDir}/youtube-transcript.js "$VIDEO_ID" --format plain > transcript.txt
# Convert to PDF (requires pandoc)
pandoc transcript.txt -o transcript.pdf
echo "PDF created: transcript.pdf"
Analyze Word Counts
#!/bin/bash
VIDEO_ID="EBw7gsDPAYQ"
# Get JSON format
# jq has no ternary operator, so the "(auto)" suffix uses if/then/else
{baseDir}/youtube-transcript.js "$VIDEO_ID" --format json | jq -r '
"Title: \(.title)",
"Author: \(.author)",
"Words: \(.word_count)",
"Entries: \(.total_entries)",
"Language: \(.language)\(if .isAutoGenerated then " (auto)" else "" end)"
'
Batch Download with Progress
#!/bin/bash
VIDEOS=("VIDEO1" "VIDEO2" "VIDEO3")
TOTAL=${#VIDEOS[@]}
for i in "${!VIDEOS[@]}"; do
id="${VIDEOS[$i]}"
echo "[$((i+1))/$TOTAL] Processing $id..."
{baseDir}/youtube-transcript.js "$id" --format json --output "data/${id}.json" 2>/dev/null
sleep 1 # Rate limit protection
done
Create Bilingual Subtitles
#!/bin/bash
VIDEO_ID="your-video-id"
# Get English and Spanish
{baseDir}/youtube-transcript.js "$VIDEO_ID" --language en --format srt > english.srt
echo "English ✓"
{baseDir}/youtube-transcript.js "$VIDEO_ID" --language es --format srt > spanish.srt
echo "Spanish ✓"
# Combine (requires ffmpeg)
ffmpeg -i video.mp4 -i english.srt -i spanish.srt \
-map 0:v -map 0:a -map 1:s:0 -map 2:s:0 \
-c:v copy -c:a copy -c:s mov_text \
"${VIDEO_ID}_bilingual.mp4"
echo "Bilingual video created ✓"
Performance Tips
1. Use Caching
First fetch: ~2-5 seconds
Cached fetch: ~100ms
# First time (slow)
{baseDir}/youtube-transcript.js VIDEO_ID
# Second time (fast - from cache)
{baseDir}/youtube-transcript.js VIDEO_ID
# Force refresh (slow)
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
2. Batch Processing with Delays
# Bad - might hit rate limits
for id in $IDS; do
{baseDir}/youtube-transcript.js "$id"
done
# Good - respects rate limits
for id in $IDS; do
{baseDir}/youtube-transcript.js "$id"
sleep 2
done
3. Parallel Processing (Limited)
# Process 2-3 at a time (don't exceed rate limit)
{baseDir}/youtube-transcript.js VIDEO1 &
{baseDir}/youtube-transcript.js VIDEO2 &
{baseDir}/youtube-transcript.js VIDEO3 &
wait
4. Output Format Selection
- Fastest: plain (smallest output, fastest write)
- Recommended: text or json (balanced)
- For subtitles: srt or vtt (industry standard)
Limitations
- No Private Videos: Requires public access
- No Age-Restricted: Some videos unavailable
- No Members-Only: Requires YouTube membership
- Livestream Lag: Captions may be delayed
- New Videos: Auto-generated captions take time
- Rate Limits: Max 60 requests/minute
- Large Transcripts: Limited to 500K characters
Notes
- Cached transcripts expire after 24 hours
- Auto-generated captions may have errors
- Manual captions are preferred when available
- Language codes follow YouTube's internal format
- SRT format uses comma for milliseconds (WebVTT uses period)
- TSV and CSV formats are UTF-8 encoded
- JSON output includes metadata for programmatic use
- Script is network-resilient with automatic retries
- Use --output to save directly to file (handles special characters)
- STDERR contains progress messages and metadata
- STDOUT contains the actual transcript data