YouTube Transcript Skill

YouTube transcript extraction with seven output formats, a 24-hour on-disk cache, rate limiting, and automatic retries.

When to Use

USE this skill when:

  • Extracting transcripts from YouTube videos
  • Converting YouTube captions to SRT/VTT subtitle files
  • Analyzing video content via transcripts
  • Creating subtitles for downloaded videos
  • Batch processing multiple video transcripts
  • Needing transcripts in specific languages
  • Processing auto-generated captions

DON'T use this skill when:

  • Transcript not available (disabled by creator)
  • Video is private or age-restricted
  • Livestream that hasn't ended
  • Need speech-to-text from audio → Use transcribe
  • Need video frames → Use video-frames

Prerequisites

# Requires Node.js (already available)
node --version

# No additional dependencies required

Commands

Basic Usage

# Extract transcript with video ID
{baseDir}/youtube-transcript.js VIDEO_ID

# Extract with full URL
{baseDir}/youtube-transcript.js "https://www.youtube.com/watch?v=VIDEO_ID"

# Extract with short URL
{baseDir}/youtube-transcript.js "https://youtu.be/VIDEO_ID"

Output Formats

# Plain text with timestamps (default)
{baseDir}/youtube-transcript.js VIDEO_ID --format text
[0:00:00.00] Here is the transcript text
[0:00:05.32] More transcript content

# Plain text without timestamps
{baseDir}/youtube-transcript.js VIDEO_ID --format plain
Here is the transcript text More transcript content

# JSON with metadata
{baseDir}/youtube-transcript.js VIDEO_ID --format json
{
  "title": "Video Title",
  "author": "Channel Name",
  "language": "en",
  "isAutoGenerated": false,
  "transcript": [...]
}

# SRT subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --format srt > video.srt
1
00:00:00,000 --> 00:00:05,320
Here is the transcript text

2
00:00:05,320 --> 00:00:08,150
More transcript content

# VTT subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --format vtt > video.vtt
WEBVTT

1
00:00.000 --> 00:05.320
Here is the transcript text

# TSV tab-separated values
{baseDir}/youtube-transcript.js VIDEO_ID --format tsv
start\tduration\ttext
0.000\t5.320\tHere is the transcript text

# CSV comma-separated values
{baseDir}/youtube-transcript.js VIDEO_ID --format csv
start,duration,text
0.000,5.320,"Here is the transcript text"
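The CSV sample above can be reproduced with a few lines of JavaScript. This is a minimal sketch, not the skill's actual formatter: `quoteCsv` and `toCsv` are hypothetical names, and doubling embedded quotes (the usual CSV convention) is an assumption about how the script escapes text.

```javascript
// Hypothetical helpers; the skill's real escaping rules are not documented here.

// Wrap a value in double quotes and double any quotes inside it.
function quoteCsv(text) {
  return `"${String(text).replace(/"/g, '""')}"`;
}

// Build CSV with the header row shown in the sample output.
// Each entry is assumed to look like { start, duration, text }.
function toCsv(entries) {
  const rows = entries.map((e) =>
    [e.start.toFixed(3), e.duration.toFixed(3), quoteCsv(e.text)].join(",")
  );
  return ["start,duration,text", ...rows].join("\n");
}

console.log(toCsv([{ start: 0, duration: 5.32, text: "Here is the transcript text" }]));
```

Always quoting the text column keeps commas and quotes inside captions from breaking the row layout.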

Language Selection

# Auto-select best available (default)
{baseDir}/youtube-transcript.js VIDEO_ID

# Specific language by code
{baseDir}/youtube-transcript.js VIDEO_ID --language en
{baseDir}/youtube-transcript.js VIDEO_ID --language es
{baseDir}/youtube-transcript.js VIDEO_ID --language fr

# Partial matches work too
{baseDir}/youtube-transcript.js VIDEO_ID --language zh   # Matches zh-CN, zh-TW, etc.

# Specific language combined with an output format
{baseDir}/youtube-transcript.js VIDEO_ID --language ja --format srt

Common Language Codes:

Code  Language
en    English
es    Spanish
fr    French
de    German
ja    Japanese
ko    Korean
zh    Chinese
pt    Portuguese
ru    Russian
hi    Hindi
ar    Arabic
it    Italian
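Partial matching such as zh resolving to zh-CN or zh-TW can be pictured as a prefix test on the track's language code. The sketch below is an illustration under assumptions: `pickTrack` and the track shape `{ languageCode, isAutoGenerated }` are hypothetical, and the ranking (exact code first, then manual over auto-generated) is a guess at reasonable behavior, not the script's confirmed logic.

```javascript
// Hypothetical selection logic; the track shape and ranking are assumptions.
function pickTrack(tracks, requested) {
  const want = requested.toLowerCase();
  let best = null;
  let bestScore = -1;
  for (const track of tracks) {
    const code = track.languageCode.toLowerCase();
    const exact = code === want;
    const partial = code.startsWith(want + "-"); // "zh" matches "zh-CN"
    if (!exact && !partial) continue;
    // An exact code outranks a partial match; manual outranks auto-generated.
    const score = (exact ? 2 : 0) + (track.isAutoGenerated ? 0 : 1);
    if (score > bestScore) {
      best = track;
      bestScore = score;
    }
  }
  return best; // null when no track matches
}

const tracks = [
  { languageCode: "zh-CN", isAutoGenerated: true },
  { languageCode: "zh-TW", isAutoGenerated: false },
  { languageCode: "en", isAutoGenerated: true },
];
console.log(pickTrack(tracks, "zh").languageCode);
```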

Save to File

# Save transcript directly to file
{baseDir}/youtube-transcript.js VIDEO_ID --output transcript.txt
{baseDir}/youtube-transcript.js VIDEO_ID --format srt --output subtitles.srt
{baseDir}/youtube-transcript.js VIDEO_ID --format json --output data.json

# Shell redirection (equivalent)
{baseDir}/youtube-transcript.js VIDEO_ID --format vtt > captions.vtt

Advanced Options

# Skip cache (force fresh fetch)
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache

# Verbose debugging output
DEBUG=1 {baseDir}/youtube-transcript.js VIDEO_ID

# Combine options
{baseDir}/youtube-transcript.js VIDEO_ID --language es --format srt --output spanish.srt --no-cache

Features

Format Comparison

Format  Use Case            Human Readable  Machine Readable
text    Default viewing     ✓               ⚠️
plain   Content only        ✓               ⚠️
json    API integration     ⚠️              ✓
srt     Subtitle files      ✓               ✓
vtt     Web captions        ✓               ✓
tsv     Spreadsheet import  ⚠️              ✓
csv     Database import     ⚠️              ✓

Supported Video URL Formats

# Plain video ID (11 characters)
EBw7gsDPAYQ

# Standard YouTube URL
https://www.youtube.com/watch?v=EBw7gsDPAYQ

# Short youtu.be URL
https://youtu.be/EBw7gsDPAYQ

# Embed URL
https://www.youtube.com/embed/EBw7gsDPAYQ

# YouTube Live URL
https://www.youtube.com/live/EBw7gsDPAYQ

# URLs with additional parameters (automatically handled)
https://www.youtube.com/watch?v=EBw7gsDPAYQ&t=120s
https://www.youtube.com/watch?v=EBw7gsDPAYQ&index=2

# Playlist URLs (only the video given by v= is extracted)
https://www.youtube.com/watch?v=EBw7gsDPAYQ&list=...
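One way to handle the URL shapes listed above is to test for a bare 11-character ID first and otherwise parse the input as a URL. This is a minimal sketch rather than the skill's actual parser: `extractVideoId` is a hypothetical name, and the allowed ID characters (letters, digits, `_`, `-`) are an assumption.

```javascript
// Hypothetical parser covering the URL shapes listed in this section.
const ID_PATTERN = /^[A-Za-z0-9_-]{11}$/; // assumed ID alphabet

function extractVideoId(input) {
  const value = String(input).trim();
  if (ID_PATTERN.test(value)) return value; // plain 11-character ID

  let url;
  try {
    url = new URL(value);
  } catch {
    return null; // neither an ID nor a parseable URL
  }

  const host = url.hostname.replace(/^www\./, "");
  let candidate = null;
  if (host === "youtu.be") {
    candidate = url.pathname.slice(1).split("/")[0];
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // Extra parameters such as t=, index=, or list= are ignored
      candidate = url.searchParams.get("v");
    } else {
      const match = url.pathname.match(/^\/(?:embed|live)\/([^/]+)/);
      candidate = match ? match[1] : null;
    }
  }
  return candidate && ID_PATTERN.test(candidate) ? candidate : null;
}

console.log(extractVideoId("https://www.youtube.com/watch?v=EBw7gsDPAYQ&t=120s"));
```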

Intelligent Caching

The skill implements intelligent caching to improve performance:

  • Cache Location: /tmp/youtube-transcript-cache/
  • TTL: 24 hours per entry
  • Max Entries: 100 videos
  • Benefits:
    • Instant retrieval of previously fetched transcripts
    • Reduced load on YouTube servers
    • Better performance for repeated operations

Cache Bypass:

# Force fresh fetch
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
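The TTL and entry limit described above can be modeled in a few lines. The sketch below uses an in-memory `Map` as a stand-in for the on-disk cache directory, so it illustrates the policy only; `TranscriptCache` is a hypothetical name, and evicting the oldest entry first is an assumption about what happens at the 100-entry limit.

```javascript
// In-memory stand-in for the on-disk cache; the eviction order is an assumption.
const TTL_MS = 24 * 60 * 60 * 1000; // 24 hours per entry
const MAX_ENTRIES = 100;

class TranscriptCache {
  constructor() {
    this.entries = new Map(); // key -> { storedAt, value }
  }

  // Return the cached value, or undefined if missing or expired.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now - entry.storedAt >= TTL_MS) {
      this.entries.delete(key); // expired
      return undefined;
    }
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.delete(key); // re-insert so the key moves to the end
    this.entries.set(key, { storedAt: now, value });
    while (this.entries.size > MAX_ENTRIES) {
      // A Map iterates in insertion order, so the first key is the oldest
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

Passing `now` explicitly makes the expiry rule easy to check without waiting 24 hours.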

Rate Limiting

To avoid being blocked by YouTube:

  • Max 60 requests per minute
  • Minimum 1 second delay between requests
  • Increasing backoff between retries

Retry Logic

When requests fail:

  1. First attempt
  2. Wait 2 seconds, retry
  3. Wait 4 seconds, retry
  4. Wait 6 seconds, retry
  5. Final error reported
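The schedule above (one initial attempt, then retries after 2, 4, and 6 seconds) can be sketched as a small async wrapper. `withRetries` is a hypothetical helper, not the script's own function; the `sleep` parameter exists only so the delays can be replaced in tests.

```javascript
// Hypothetical retry wrapper following the delays listed above.
const RETRY_DELAYS_MS = [2000, 4000, 6000];

async function withRetries(
  operation,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let lastError;
  // One initial attempt plus one retry per configured delay
  for (let attempt = 0; attempt <= RETRY_DELAYS_MS.length; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < RETRY_DELAYS_MS.length) {
        await sleep(RETRY_DELAYS_MS[attempt]);
      }
    }
  }
  throw lastError; // all attempts failed: report the final error
}
```

Usage would look like `await withRetries(() => fetchPage(url))`, where `fetchPage` stands for whatever function performs the HTTP request.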

Error Handling

Error Codes

Code  Name                 Description                Resolution
0     SUCCESS              Transcript fetched         None needed
1     INVALID_VIDEO_ID     Bad URL/ID format          Double-check the video ID
2     VIDEO_NOT_FOUND      Video doesn't exist        Verify video exists
3     TRANSCRIPT_DISABLED  Creator disabled captions  Contact creator
4     NO_TRANSCRIPT        No captions available      Wait for transcript
5     VIDEO_UNAVAILABLE    Can't access               Check restrictions
6     PRIVATE_VIDEO        Video is private           Get access/permission
7     RATE_LIMITED         Too many requests          Wait before retry
8     NETWORK_ERROR        Connection issue           Check internet
9     PARSE_ERROR          Data extraction failed     Try again
99    UNKNOWN              Unexpected error           Report issue
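A calling program can turn these codes into a decision about whether to retry. The sketch below assumes the codes are reported as the process exit status, which the table does not state explicitly; `classifyExit` is a hypothetical helper, and treating RATE_LIMITED, NETWORK_ERROR, and PARSE_ERROR as retryable is an interpretation of the Resolution column.

```javascript
// Code-to-name mapping copied from the table above.
const EXIT_CODES = {
  0: "SUCCESS",
  1: "INVALID_VIDEO_ID",
  2: "VIDEO_NOT_FOUND",
  3: "TRANSCRIPT_DISABLED",
  4: "NO_TRANSCRIPT",
  5: "VIDEO_UNAVAILABLE",
  6: "PRIVATE_VIDEO",
  7: "RATE_LIMITED",
  8: "NETWORK_ERROR",
  9: "PARSE_ERROR",
  99: "UNKNOWN",
};

// Assumption: only these failures are worth retrying automatically.
const RETRYABLE = new Set(["RATE_LIMITED", "NETWORK_ERROR", "PARSE_ERROR"]);

function classifyExit(code) {
  const name = EXIT_CODES[code] ?? "UNKNOWN"; // unlisted codes fall back to UNKNOWN
  return { name, ok: name === "SUCCESS", retryable: RETRYABLE.has(name) };
}

console.log(classifyExit(7));
```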

Common Errors and Solutions

"Could not extract player data"

  • YouTube may have changed their page structure
  • The video may be age-restricted
  • The video may require login
  • Solution: Try again later or check if video is publicly accessible

"No captions available for this video"

  • Creator hasn't added captions
  • Auto-generated captions aren't ready (may take a few hours after upload)
  • Video is too new
  • Solution: Wait for YouTube to generate captions, or check if manual captions exist

"Rate limited by YouTube"

  • Too many requests in short period
  • Solution: Wait 1-2 minutes before retrying

"Transcript too long"

  • Transcript exceeds 500K characters
  • Solution: Use --format json which handles large transcripts better

"Video unavailable or not found"

  • Video removed or never existed
  • Region-restricted
  • Solution: Verify video ID/URL is correct

Technical Architecture

Data Flow

Video ID/URL
    ↓
Extract Video ID      ← URL parser (7+ formats)
    ↓
Check Cache           ← 24hr TTL store
    ↓ [cache miss]
Fetch YouTube Page    ← HTTP with retry logic
    ↓
Extract Player Data   ← ytInitialPlayerResponse
    ↓
Parse Caption Tracks  ← Language selection
    ↓
Fetch Transcript      ← Select appropriate URL
    ↓
Parse Entries         ← XML/JSON parsing
    ↓
Format Output         ← 7 output formats
    ↓
Cache & Return        ← Store for 24hr

Player Data Extraction

Extracts player data from multiple potential sources:

  1. ytInitialPlayerResponse JavaScript variable
  2. playerResponse JSON in script tags
  3. Caption tracks from various locations
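Pulling a JSON object such as ytInitialPlayerResponse out of page HTML usually means finding the marker and then scanning for the matching closing brace. The sketch below shows one way to do that; `extractJsonAfter` is a hypothetical helper, and the brace-matching approach is an assumption, since the script's real extraction method is not described here.

```javascript
// Hypothetical helper: return the first JSON object that follows `marker`.
function extractJsonAfter(html, marker) {
  const markerAt = html.indexOf(marker);
  if (markerAt === -1) return null;
  const start = html.indexOf("{", markerAt + marker.length);
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < html.length; i++) {
    const ch = html[i];
    if (inString) {
      // Braces inside string literals must not affect the depth count
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) {
        try {
          return JSON.parse(html.slice(start, i + 1));
        } catch {
          return null; // matched braces but not valid JSON
        }
      }
    }
  }
  return null; // object never closed
}

const page = '<script>var ytInitialPlayerResponse = {"videoDetails":{"title":"a } b"},"x":1};</script>';
console.log(extractJsonAfter(page, "ytInitialPlayerResponse").videoDetails.title);
```

Tracking string state matters because caption text and titles can contain literal braces.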

Transcript Parsing

Supports multiple formats:

  1. JSON API Response: Modern format
  2. Timed Text XML: Legacy format
  3. Alternative XML: Older structure
  4. Special handling for: Auto-generated vs manual captions

Data Unescaping

Properly handles:

  • &amp; → &
  • &lt; → <
  • &gt; → >
  • &quot; → "
  • &#39; / &#039; / &apos; → '
  • Whitespace normalization
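The unescaping rules above fit in a single replacement pass followed by whitespace cleanup. This is a minimal sketch: `unescapeCaption` is a hypothetical name, and decoding in one pass (so that `&amp;lt;` becomes `&lt;` and is not decoded twice) is a design choice made here, not confirmed behavior of the script.

```javascript
// Entity table limited to the entities listed in this section.
const ENTITIES = {
  "&amp;": "&",
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&#039;": "'",
  "&apos;": "'",
};

function unescapeCaption(text) {
  return text
    // Single pass over all known entities, so nothing is decoded twice
    .replace(/&(?:amp|lt|gt|quot|apos|#0?39);/g, (entity) => ENTITIES[entity])
    // Collapse runs of whitespace (including line breaks) to one space
    .replace(/\s+/g, " ")
    .trim();
}

console.log(unescapeCaption("Tom &amp; Jerry&#39;s  &lt;b&gt;\n show"));
```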

Sample Output

JSON Format (Full)

{
  "title": "How Artificial Intelligence Works",
  "author": "Example Channel",
  "duration": "PT10M32S",
  "language": "en",
  "isAutoGenerated": true,
  "transcript": [
    {
      "start": 0.000,
      "duration": 5.320,
      "text": "In this video, we'll explore how AI systems learn and adapt"
    },
    {
      "start": 5.320,
      "duration": 4.180,
      "text": "to perform tasks that traditionally required human intelligence"
    }
  ],
  "word_count": 2847,
  "total_entries": 156
}

SRT Format (SubRip)

1
00:00:00,000 --> 00:00:05,320
In this video, we'll explore how AI systems
learn and adapt

2
00:00:05,320 --> 00:00:09,500
to perform tasks that traditionally
required human intelligence

3
00:00:09,500 --> 00:00:13,240
This process is called
machine learning

...

VTT Format (WebVTT)

WEBVTT

1
00:00.000 --> 00:05.320
In this video, we'll explore how AI systems
learn and adapt

2
00:05.320 --> 00:09.500
to perform tasks that traditionally
required human intelligence

...
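The SRT and VTT samples differ mainly in the millisecond separator (comma for SRT, period for WebVTT). The sketch below converts transcript entries to SRT; `formatTimestamp` and `toSrt` are hypothetical names, and it always prints the hours field, which WebVTT also permits even though the VTT sample above omits it.

```javascript
// Format seconds as HH:MM:SS<separator>mmm ("," for SRT, "." for WebVTT).
function formatTimestamp(seconds, separator) {
  const totalMs = Math.round(seconds * 1000); // round away floating-point noise
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${separator}${pad(ms, 3)}`;
}

// Entries are assumed to look like { start, duration, text }, as in the JSON sample.
function toSrt(entries) {
  return entries
    .map((e, i) => {
      const from = formatTimestamp(e.start, ",");
      const to = formatTimestamp(e.start + e.duration, ",");
      return `${i + 1}\n${from} --> ${to}\n${e.text}\n`;
    })
    .join("\n");
}

console.log(toSrt([
  { start: 0, duration: 5.32, text: "In this video, we'll explore how AI systems learn and adapt" },
  { start: 5.32, duration: 4.18, text: "to perform tasks that traditionally required human intelligence" },
]));
```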

Examples

Download Transcripts for Playlist

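#!/bin/bash
# Process multiple videos from an IDs file (one video ID per line)

mkdir -p transcripts

while IFS= read -r video_id; do
  [ -z "$video_id" ] && continue   # skip blank lines
  echo "Processing: $video_id"

  # </dev/null keeps the script from consuming the rest of video_ids.txt
  if {baseDir}/youtube-transcript.js "$video_id" --format srt \
       --output "transcripts/${video_id}.srt" </dev/null 2>/dev/null; then
    echo "  ✓ Success"
  else
    echo "  ✗ Failed (exit code $?)"
  fi

  # Sleep to respect rate limits
  sleep 2
done < video_ids.txt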

Convert to PDF for Reading

#!/bin/bash
VIDEO_ID="EBw7gsDPAYQ"

# Get transcript
{baseDir}/youtube-transcript.js "$VIDEO_ID" --format plain > transcript.txt

# Convert to PDF (requires pandoc)
pandoc transcript.txt -o transcript.pdf
echo "PDF created: transcript.pdf"

Analyze Word Counts

#!/bin/bash
VIDEO_ID="EBw7gsDPAYQ"

# Get JSON format
{baseDir}/youtube-transcript.js "$VIDEO_ID" --format json | jq -r '
  "Title: \(.title)",
  "Author: \(.author)",
  "Words: \(.word_count)",
  "Entries: \(.total_entries)",
  "Language: \(.language)\(if .isAutoGenerated then " (auto)" else "" end)"
'

Batch Download with Progress

#!/bin/bash
VIDEOS=("VIDEO1" "VIDEO2" "VIDEO3")
TOTAL=${#VIDEOS[@]}
mkdir -p data  # make sure the output directory exists

for i in "${!VIDEOS[@]}"; do
  id="${VIDEOS[$i]}"
  echo "[$((i+1))/$TOTAL] Processing $id..."
  
  {baseDir}/youtube-transcript.js "$id" --format json --output "data/${id}.json" 2>/dev/null
  
  sleep 1  # Rate limit protection
done

Create Bilingual Subtitles

#!/bin/bash
VIDEO_ID="your-video-id"

# Get English and Spanish
{baseDir}/youtube-transcript.js "$VIDEO_ID" --language en --format srt > english.srt
echo "English ✓"

{baseDir}/youtube-transcript.js "$VIDEO_ID" --language es --format srt > spanish.srt
echo "Spanish ✓"

# Mux both subtitle tracks into a copy of a local video.mp4 (requires ffmpeg)
ffmpeg -i video.mp4 -i english.srt -i spanish.srt \
  -map 0:v -map 0:a -map 1:s:0 -map 2:s:0 \
  -c:v copy -c:a copy -c:s mov_text \
  "${VIDEO_ID}_bilingual.mp4"

echo "Bilingual video created ✓"

Performance Tips

1. Use Caching

First fetch: ~2-5 seconds
Cached fetch: ~100ms

# First time (slow)
{baseDir}/youtube-transcript.js VIDEO_ID

# Second time (fast - from cache)
{baseDir}/youtube-transcript.js VIDEO_ID

# Force refresh (slow)
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache

2. Batch Processing with Delays

# Bad - might hit rate limits
for id in $IDS; do
  {baseDir}/youtube-transcript.js "$id"
done

# Good - respects rate limits
for id in $IDS; do
  {baseDir}/youtube-transcript.js "$id"
  sleep 2
done

3. Parallel Processing (Limited)

# Process 2-3 at a time (don't exceed rate limit)
{baseDir}/youtube-transcript.js VIDEO1 &
{baseDir}/youtube-transcript.js VIDEO2 &
{baseDir}/youtube-transcript.js VIDEO3 &
wait

4. Output Format Selection

  • Fastest: plain (smallest output, fastest write)
  • Recommended: text or json (balanced)
  • For subtitles: srt or vtt (industry standard)

Limitations

  1. No Private Videos: Requires public access
  2. No Age-Restricted: Some videos unavailable
  3. No Members-Only: Requires YouTube membership
  4. Livestream Lag: Captions may be delayed
  5. New Videos: Auto-generated captions take time
  6. Rate Limits: Max 60 requests/minute
  7. Large Transcripts: Limited to 500K characters

Notes

  • Cached transcripts expire after 24 hours
  • Auto-generated captions may have errors
  • Manual captions are preferred when available
  • Language codes follow YouTube's internal format
  • SRT format uses comma for milliseconds (WebVTT uses period)
  • TSV and CSV formats are UTF-8 encoded
  • JSON output includes metadata for programmatic use
  • Script is network-resilient with automatic retries
  • Use --output to save directly to file (handles special characters)
  • STDERR contains progress messages and metadata
  • STDOUT contains the actual transcript data