youtube-summarizer
youtube-summarizer
Purpose
This skill extracts transcripts from YouTube videos and generates comprehensive, verbose summaries using the STAR + R-I-S-E framework. It validates video availability, extracts transcripts using the youtube-transcript-api Python library, and produces detailed documentation capturing all insights, arguments, and key points.
The skill is designed for users who need thorough content analysis and reference documentation from educational videos, lectures, tutorials, or informational content.
When to Use This Skill
This skill should be used when:
- User provides a YouTube video URL and wants a detailed summary
- User needs to document video content for reference without rewatching
- User wants to extract insights, key points, and arguments from educational content
- User needs transcripts from YouTube videos for analysis
- User asks to "summarize", "resume", or "extract content" from YouTube videos
- User wants comprehensive documentation prioritizing completeness over brevity
Step 0: Discovery & Setup
Before processing videos, validate the environment and dependencies:
# Check if youtube-transcript-api is installed
python3 -c "import youtube_transcript_api" 2>/dev/null
if [ $? -ne 0 ]; then
echo "⚠️ youtube-transcript-api not found"
# Offer to install
fi
# Check Python availability
if ! command -v python3 &>/dev/null; then
echo "❌ Python 3 is required but not installed"
exit 1
fi
Ask the user if dependency is missing:
youtube-transcript-api is required but not installed.
Would you like to install it now?
- [ ] Yes - Install with pip (pip install youtube-transcript-api)
- [ ] No - I'll install it manually
If user selects "Yes":
pip install youtube-transcript-api
Verify installation:
python3 -c "import youtube_transcript_api; print('✅ youtube-transcript-api installed successfully')"
Main Workflow
Progress Tracking Guidelines
Throughout the workflow, display a visual progress gauge before each step to keep the user informed. The gauge format is:
echo "[████░░░░░░░░░░░░░░░░] 20% - Step 1/5: Validating URL"
Format specifications:
- 20 characters wide (use █ for filled, ░ for empty)
- Percentage increments: Step 1=20%, Step 2=40%, Step 3=60%, Step 4=80%, Step 5=100%
- Step counter showing current/total (e.g., "Step 3/5")
- Brief description of current phase
Display the initial status box before Step 1:
╔══════════════════════════════════════════════════════════════╗
║ 📹 YOUTUBE SUMMARIZER - Processing Video ║
╠══════════════════════════════════════════════════════════════╣
║ → Step 1: Validating URL [IN PROGRESS] ║
║ ○ Step 2: Checking Availability ║
║ ○ Step 3: Extracting Transcript ║
║ ○ Step 4: Generating Summary ║
║ ○ Step 5: Formatting Output ║
╠══════════════════════════════════════════════════════════════╣
║ Progress: ██████░░░░░░░░░░░░░░░░░░░░░░░░ 20% ║
╚══════════════════════════════════════════════════════════════╝
Step 1: Validate YouTube URL
Objective: Extract video ID and validate URL format.
Supported URL Formats:
https://www.youtube.com/watch?v=VIDEO_IDhttps://youtube.com/watch?v=VIDEO_IDhttps://youtu.be/VIDEO_IDhttps://m.youtube.com/watch?v=VIDEO_ID
Actions:
# Extract video ID using regex or URL parsing
URL="$USER_PROVIDED_URL"
# Pattern 1: youtube.com/watch?v=VIDEO_ID
if echo "$URL" | grep -qE 'youtube\.com/watch\?v='; then
VIDEO_ID=$(echo "$URL" | sed -E 's/.*[?&]v=([^&]+).*/\1/')
# Pattern 2: youtu.be/VIDEO_ID
elif echo "$URL" | grep -qE 'youtu\.be/'; then
VIDEO_ID=$(echo "$URL" | sed -E 's/.*youtu\.be\/([^?]+).*/\1/')
else
echo "❌ Invalid YouTube URL format"
exit 1
fi
echo "📹 Video ID extracted: $VIDEO_ID"
If URL is invalid:
❌ Invalid YouTube URL
Please provide a valid YouTube URL in one of these formats:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
Example: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Step 2: Check Video & Transcript Availability
Progress:
echo "[████████░░░░░░░░░░░░] 40% - Step 2/5: Checking Availability"
Objective: Verify video exists and transcript is accessible.
Actions:
from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled, NoTranscriptFound
import sys
video_id = sys.argv[1]
try:
# Get list of available transcripts
transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
print(f"✅ Video accessible: {video_id}")
print("📝 Available transcripts:")
for transcript in transcript_list:
print(f" - {transcript.language} ({transcript.language_code})")
if transcript.is_generated:
print(" [Auto-generated]")
except TranscriptsDisabled:
print(f"❌ Transcripts are disabled for video {video_id}")
sys.exit(1)
except NoTranscriptFound:
print(f"❌ No transcript found for video {video_id}")
sys.exit(1)
except Exception as e:
print(f"❌ Error accessing video: {e}")
sys.exit(1)
Error Handling:
| Error | Message | Action |
|---|---|---|
| Video not found | "❌ Video does not exist or is private" | Ask user to verify URL |
| Transcripts disabled | "❌ Transcripts are disabled for this video" | Cannot proceed |
| No transcript available | "❌ No transcript found (not auto-generated or manually added)" | Cannot proceed |
| Private/restricted video | "❌ Video is private or restricted" | Ask for public video |
Step 3: Extract Transcript
Progress:
echo "[████████████░░░░░░░░] 60% - Step 3/5: Extracting Transcript"
Objective: Retrieve transcript in preferred language.
Actions:
from youtube_transcript_api import YouTubeTranscriptApi
video_id = "VIDEO_ID"
try:
# Try to get transcript in user's preferred language first
# Fall back to English if not available
transcript = YouTubeTranscriptApi.get_transcript(
video_id,
languages=['pt', 'en'] # Prefer Portuguese, fallback to English
)
# Combine transcript segments into full text
full_text = " ".join([entry['text'] for entry in transcript])
# Get video metadata
from youtube_transcript_api import YouTubeTranscriptApi
transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
print("✅ Transcript extracted successfully")
print(f"📊 Transcript length: {len(full_text)} characters")
# Save to temporary file for processing
with open(f"/tmp/transcript_{video_id}.txt", "w") as f:
f.write(full_text)
except Exception as e:
print(f"❌ Error extracting transcript: {e}")
exit(1)
Transcript Processing:
- Combine all transcript segments into coherent text
- Preserve punctuation and formatting where available
- Remove duplicate or overlapping segments (if auto-generated artifacts)
- Store in temporary file for analysis
Step 4: Generate Comprehensive Summary
Progress:
echo "[████████████████░░░░] 80% - Step 4/5: Generating Summary"
Objective: Apply enhanced STAR + R-I-S-E prompt to create detailed summary.
Prompt Applied:
Use the enhanced prompt from Phase 2 (STAR + R-I-S-E framework) with the extracted transcript as input.
Actions:
- Load the full transcript text
- Apply the comprehensive summarization prompt
- Use AI model (Claude/GPT) to generate structured summary
- Ensure output follows the defined structure:
- Header with video metadata
- Executive synthesis
- Detailed section-by-section breakdown
- Key insights and conclusions
- Concepts and terminology
- Resources and references
Implementation:
# Use the transcript file as input to the AI prompt
TRANSCRIPT_FILE="/tmp/transcript_${VIDEO_ID}.txt"
# The AI agent will:
# 1. Read the transcript
# 2. Apply the STAR + R-I-S-E summarization framework
# 3. Generate comprehensive Markdown output
# 4. Structure with headers, lists, and highlights
Read "$TRANSCRIPT_FILE" # Read transcript into context
Then apply the full summarization prompt (from enhanced version in Phase 2).
Step 5: Format and Present Output
Progress:
echo "[████████████████████] 100% - Step 5/5: Formatting Output"
Objective: Deliver the summary in clean, well-structured Markdown.
Output Structure:
# [Video Title]
**Canal:** [Channel Name]
**Duração:** [Duration]
**URL:** [https://youtube.com/watch?v=VIDEO_ID]
**Data de Publicação:** [Date if available]
## 📝 Detailed Summary
### [Topic 1]
[Comprehensive explanation with examples, data, quotes...]
#### [Subtopic 1.1]
[Detailed breakdown...]
### [Topic 2]
[Continued detailed analysis...]
## 📚 Concepts and Terminology
- **[Term 1]:** [Definition and context]
- **[Term 2]:** [Definition and context]
## 📌 Conclusion
[Final synthesis and takeaways]
### **Example 2: Missing Dependency**
**User Input:**
claude> summarize this youtube video https://youtu.be/abc123
**Skill Response:**
⚠️ youtube-transcript-api not installed
This skill requires the Python library 'youtube-transcript-api'.
Would you like me to install it now?
- Yes - Install with pip
- No - I'll install manually
**User selects "Yes":**
```bash
$ pip install youtube-transcript-api
Successfully installed youtube-transcript-api-0.6.1
✅ Installation complete! Proceeding with video summary...
Example 4: Invalid URL
User Input:
claude> summarize youtube video www.youtube.com/some-video
Skill Response:
❌ Invalid YouTube URL format
Expected format examples:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
Please provide a valid YouTube video URL.
📊 Executive Summary
This video provides a comprehensive introduction to the fundamental concepts of Artificial Intelligence (AI), designed for beginners and professionals who want to understand the technical foundations and practical applications of modern AI. The instructor covers everything from basic definitions to machine learning algorithms, using practical examples and visualizations to facilitate understanding.
[... continued detailed summary ...]
**Save Options:**
What would you like to save? → Summary + raw transcript
✅ File saved: resumo-exemplo123-2026-02-01.md (includes raw transcript) [████████████████████] 100% - ✓ Processing complete!
Welcome to this comprehensive tutorial on machine learning fundamentals. In today's video, we'll explore the core concepts that power modern AI systems...
Version: 1.2.0 Last Updated: 2026-02-02 Maintained By: Eric Andrade