YouTube Summarizer Skill

Automatically fetch transcripts from YouTube videos, generate structured summaries, and deliver full transcripts to messaging platforms.

Mode

Detect from context or ask: "Quick TL;DR, full summary, or full summary with content angles?"

| Mode | What you get | Best for |
|---|---|---|
| quick | 3-bullet TL;DR + single key takeaway | Fast consumption, sharing a clip |
| standard | Full structured summary: thesis, insights, takeaway | Learning, note-taking, research |
| deep | Full summary + chapter breakdown + content repurposing opportunities | Turning a video into a content asset |

Default: standard. Use quick if the user just wants the gist, and deep if they want to turn the video into reusable content.
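
One way to detect the mode from context is a small keyword heuristic. The sketch below is an assumption, not part of the skill: the function name `detectMode` and the keyword lists are hypothetical and would need tuning against real requests. It falls back to standard, matching the default above.

```javascript
// Hypothetical heuristic: guess the summary mode from the user's message.
// The keyword lists are illustrative assumptions, not defined by the skill.
function detectMode(message) {
  const text = String(message).toLowerCase();

  // Signals that the user wants to turn the video into content.
  if (/\b(repurpose|content angles?|chapters?|deep dive|threads?|newsletter)\b/.test(text)) {
    return 'deep';
  }
  // Signals that the user only wants the gist.
  if (/\b(tl;?dr|gist|quick|briefly|in short)\b/.test(text)) {
    return 'quick';
  }
  // No clear signal: use the documented default.
  return 'standard';
}
```

When the message is ambiguous, asking the question above is still the safer path; the heuristic only saves a round trip in the obvious cases.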


Why This vs ChatGPT?

Problem with ChatGPT: It typically can't access YouTube transcripts directly. You have to copy and paste the captions yourself or run a third-party tool first, then feed the text to ChatGPT. The process is multi-step and clunky, and it loses the video metadata.

This skill provides:

  1. One-step transcript extraction - Drop a YouTube URL, get the full transcript automatically
  2. Structured summarization - Consistent format (thesis → insights → takeaway) every time, not random bullet points
  3. Video metadata included - Title, channel, views, publish date embedded in summary
  4. Full transcript delivery - Saves timestamped transcript to file and sends to Telegram/chat platforms
  5. Works from VPS/cloud - Uses Android client emulation to get around YouTube's blocking of cloud IPs (where yt-dlp is often blocked)
  6. Multi-language support - Auto-fetches in requested language with English fallback

You can replicate this manually by enabling captions, copying the text, pasting it into ChatGPT, reformatting the output, saving it to a file, and uploading it, which takes 5-10 minutes. This skill does it in 15-20 seconds.

When to Use

Activate this skill when:

  • User shares a YouTube URL (youtube.com/watch, youtu.be, youtube.com/shorts)
  • User asks to summarize or transcribe a YouTube video
  • User requests information about a YouTube video's content
  • You need to analyze video content for research or content creation

Dependencies

Required: the MCP YouTube Transcript server, installed at /root/clawd/mcp-server-youtube-transcript

If not present, install it:

cd /root/clawd
git clone https://github.com/kimtaeyoon83/mcp-server-youtube-transcript.git
cd mcp-server-youtube-transcript
npm install && npm run build

Workflow

1. Detect YouTube URL

Extract video ID from these patterns:

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://www.youtube.com/shorts/VIDEO_ID
  • Direct video ID: VIDEO_ID (11 characters)
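
The patterns above can be handled by one small helper. This is a minimal sketch, assuming video IDs use the characters A-Z, a-z, 0-9, `_` and `-`; the function name `extractVideoId` is hypothetical and not part of the MCP server.

```javascript
// Hypothetical helper: pull the 11-character video ID out of the inputs listed above.
// Returns null when no valid ID is found, so the caller can ask the user for a URL.
function extractVideoId(input) {
  const text = String(input).trim();
  const idPattern = /^[A-Za-z0-9_-]{11}$/; // assumed ID alphabet

  // Direct video ID.
  if (idPattern.test(text)) return text;

  let url;
  try {
    // Accept inputs without a scheme, e.g. "youtu.be/VIDEO_ID".
    url = new URL(/^https?:\/\//i.test(text) ? text : `https://${text}`);
  } catch {
    return null;
  }

  const host = url.hostname.replace(/^(www\.|m\.)/, '');
  let candidate = null;
  if (host === 'youtu.be') {
    candidate = url.pathname.slice(1).split('/')[0];
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      candidate = url.searchParams.get('v');
    } else if (url.pathname.startsWith('/shorts/')) {
      candidate = url.pathname.split('/')[2];
    }
  }
  return candidate && idPattern.test(candidate) ? candidate : null;
}
```

Parsing with `URL` rather than a single regex keeps extra query parameters (such as `&t=30s`) from leaking into the ID.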

2. Fetch Transcript

Run this command to get the transcript:

cd /root/clawd/mcp-server-youtube-transcript && node --input-type=module -e "
import { getSubtitles } from './dist/youtube-fetcher.js';
const result = await getSubtitles({ videoID: 'VIDEO_ID', lang: 'en' });
console.log(JSON.stringify(result, null, 2));
" > /tmp/yt-transcript.json

Replace VIDEO_ID with the extracted ID. Read the output from /tmp/yt-transcript.json.

3. Process the Data

Parse the JSON to extract:

  • result.metadata.title - Video title
  • result.metadata.author - Channel name
  • result.metadata.viewCount - Formatted view count
  • result.metadata.publishDate - Publication date
  • result.actualLang - Language used
  • result.lines - Array of transcript segments

Full text: result.lines.map(l => l.text).join(' ')
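
The extraction above can be wrapped in one function so later steps work with a flat object. This is a sketch: the field names on `result` come from the list above, while the function name `processTranscript` and the fallback strings are assumptions.

```javascript
// Sketch: flatten the fetcher's result into the fields the summary needs.
// Fallback values ('Unknown title', 'n/a', ...) are illustrative assumptions.
function processTranscript(result) {
  const meta = result.metadata ?? {};
  const lines = Array.isArray(result.lines) ? result.lines : [];
  return {
    title: meta.title ?? 'Unknown title',
    channel: meta.author ?? 'Unknown channel',
    views: meta.viewCount ?? 'n/a',
    published: meta.publishDate ?? 'n/a',
    language: result.actualLang ?? 'unknown',
    // Join segments into one string, collapsing stray whitespace in each segment.
    fullText: lines
      .map((line) => String(line.text ?? '').replace(/\s+/g, ' ').trim())
      .filter(Boolean)
      .join(' '),
  };
}
```

Typical use would be `processTranscript(JSON.parse(raw))`, where `raw` is the content of /tmp/yt-transcript.json.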

4. Generate Summary

Create a structured summary using this template:

📹 **Video:** [title]
👤 **Channel:** [author] | 👁️ **Views:** [views] | 📅 **Published:** [date]

**🎯 Main Thesis:**
[1-2 sentence core argument/message]

**💡 Key Insights:**
- [insight 1]
- [insight 2]
- [insight 3]
- [insight 4]
- [insight 5]

**📝 Notable Points:**
- [additional point 1]
- [additional point 2]

**🔑 Takeaway:**
[Practical application or conclusion]

Aim for:

  • Main thesis: 1-2 sentences maximum
  • Key insights: 3-5 bullets, each 1-2 sentences (adjust the count for very short or very long videos; see Quality Guidelines)
  • Notable points: 2-4 supporting details
  • Takeaway: Actionable conclusion
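
If the summary text is produced separately, the template can be filled in mechanically. The sketch below assumes the metadata object from step 3 and a hypothetical `parts` object holding the thesis, insights, notable points, and takeaway; `renderSummary` is an illustrative name.

```javascript
// Sketch: fill the summary template from metadata plus already-written summary parts.
// `meta` and `parts` are hypothetical shapes chosen for this example.
function renderSummary(meta, parts) {
  const bullets = (items) => items.map((item) => `- ${item}`).join('\n');
  return [
    `📹 **Video:** ${meta.title}`,
    `👤 **Channel:** ${meta.channel} | 👁️ **Views:** ${meta.views} | 📅 **Published:** ${meta.published}`,
    '',
    '**🎯 Main Thesis:**',
    parts.thesis,
    '',
    '**💡 Key Insights:**',
    bullets(parts.insights),
    '',
    '**📝 Notable Points:**',
    bullets(parts.notablePoints),
    '',
    '**🔑 Takeaway:**',
    parts.takeaway,
  ].join('\n');
}
```

Keeping the layout in one function makes the output format consistent across videos, which is the point of the template.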

5. Save Full Transcript

Save the complete transcript to a timestamped file:

/root/clawd/transcripts/YYYY-MM-DD_VIDEO_ID.txt

Include in the file:

  • Video metadata header (title, channel, URL, date)
  • Full transcript text
  • URL reference for easy lookup
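
The file name and contents can be built with two small pure functions, sketched below. The header labels, the use of the UTC date for the file name, and the names `transcriptPath` and `buildTranscriptFile` are assumptions; the directory and naming pattern come from the path above.

```javascript
// Sketch: build the dated file path, e.g. /root/clawd/transcripts/2026-02-01_VIDEO_ID.txt.
// Uses the UTC date; switch to local time if that matters for your archive.
function transcriptPath(videoId, date = new Date(), dir = '/root/clawd/transcripts') {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `${dir}/${day}_${videoId}.txt`;
}

// Sketch: metadata header followed by the full transcript text.
// The "Title:/Channel:/URL:/Published:" labels are illustrative.
function buildTranscriptFile({ title, channel, published, videoId, fullText }) {
  return [
    `Title: ${title}`,
    `Channel: ${channel}`,
    `URL: https://www.youtube.com/watch?v=${videoId}`,
    `Published: ${published}`,
    '',
    fullText,
    '',
  ].join('\n');
}
```

The resulting string can then be written to the path with Node's `fs.writeFileSync(path, contents)`.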

6. Platform-Specific Delivery

If channel is Telegram:

message --action send --channel telegram --target CHAT_ID \
  --filePath /root/clawd/transcripts/YYYY-MM-DD_VIDEO_ID.txt \
  --caption "📄 YouTube Transcript: [title]"

If channel is other/webchat: Just reply with the summary (no file attachment).
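
Video titles often contain quotes, which can break the double-quoted caption in the command above. One way around that is to build the arguments as an array instead of a shell string. This sketch assumes `message` is invocable as an executable with the flags shown above; `buildDeliveryArgs` is a hypothetical helper.

```javascript
// Sketch: build the argument list for the `message` command, or null when no file
// should be sent. Assumes the flags shown in the Telegram command above.
function buildDeliveryArgs({ channel, chatId, filePath, title }) {
  // Other channels and webchat get the summary only, with no attachment.
  if (channel !== 'telegram') return null;
  return [
    '--action', 'send',
    '--channel', 'telegram',
    '--target', String(chatId),
    '--filePath', filePath,
    '--caption', `📄 YouTube Transcript: ${title}`,
  ];
}
```

Passing the array to something like Node's `child_process.execFile('message', args, callback)` avoids shell quoting entirely, because each argument is handed over as-is.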

7. Reply with Summary

Send the structured summary as your response to the user.

Real Case Study

User: Content creator researching competitor YouTube strategies

Challenge: Needed to analyze 20+ competitor videos per week to identify trending topics, messaging patterns, and content gaps. Manual process: watch video, take notes, transcribe key quotes. Time: 30-45 min per video.

Solution with youtube-summarizer:

  1. Drop YouTube URL in chat
  2. Get structured summary in 20 seconds
  3. Full transcript saved for reference
  4. Copy key insights for content planning doc

Workflow example:

User: Analyze this video: https://youtube.com/watch?v=VIDEO_ID
[20 seconds later]

📹 Video: "10 AI Tools That Will Replace Your Job in 2026"
👤 Channel: TechFuturist | 👁️ Views: 847K | 📅 Published: Jan 12, 2026

🎯 Main Thesis:
AI tools are automating creative and knowledge work faster than expected, but the real opportunity is in augmentation, not replacement.

💡 Key Insights:
- ChatGPT usage among marketers jumped from 12% to 67% in one year
- Video editing time reduced by 80% using AI tools like Descript
- The biggest wins come from combining tools (Notion + Claude + Zapier)
- Companies hiring "AI workflow designers" to optimize human-AI collaboration
- Workers using AI secretly outperform peers by 40% (BCG study)

📝 Notable Points:
- Shows examples of 3 small businesses that 10× output with AI
- Warns against over-automation: "AI can write, but can't think strategically"

🔑 Takeaway:
Don't ask "Will AI replace me?" Ask "How can I use AI to become 10× more valuable?"

Results after 8 weeks:

  • Time saved: about 9 hours/week (from 600 min to 60 min for 20 videos)
  • Content output: 3 videos/week (up from 1/week)
  • Better insights: Full transcripts searchable, found patterns missed when just watching
  • Competitive intel: Built database of 160+ competitor video summaries with key quotes
  • ROI quote: "This skill turned competitor research from a chore into an assembly line."

Why This Beats Manual Methods

| Method | Time | Gets Metadata | Structured Output | Searchable Archive | Cloud-Friendly |
|---|---|---|---|---|---|
| Watch + take notes | 30-45 min | No | No | Manual only | N/A |
| YouTube transcript feature | 5 min | No | No | No | Yes |
| yt-dlp | 2-5 min | Yes | No | Yes | ❌ Blocked on VPS |
| Copy to ChatGPT | 10 min | No | Sometimes | No | Yes |
| This skill | 20 sec | Yes | Yes | Yes | ✅ Works on VPS |

Error Handling

If transcript fetch fails:

  • Check if video has captions enabled
  • Try with lang: 'en' fallback if requested language unavailable
  • Inform user that transcript is not available and suggest alternatives:
    • Manual YouTube transcript feature (expand the video description → Show transcript)
    • Video may not have captions
    • Try a different video
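
The language fallback can be sketched as a wrapper around the fetcher. The `{ videoID, lang }` call shape comes from the Fetch Transcript step; whether `getSubtitles` throws or returns empty `lines` when captions are missing is not documented here, so this sketch handles both. The name `fetchWithFallback` is hypothetical.

```javascript
// Sketch: try the requested language first, then fall back to English.
// `getSubtitles` is passed in so the wrapper does not depend on the import path.
async function fetchWithFallback(getSubtitles, videoID, lang = 'en') {
  const attempts = lang === 'en' ? ['en'] : [lang, 'en'];
  let lastError;
  for (const attempt of attempts) {
    try {
      const result = await getSubtitles({ videoID, lang: attempt });
      // Treat an empty transcript the same as a failed fetch (assumption).
      if (result && Array.isArray(result.lines) && result.lines.length > 0) {
        return result;
      }
      lastError = new Error(`Empty transcript for lang "${attempt}"`);
    } catch (err) {
      lastError = err;
    }
  }
  throw new Error(`Transcript not available for ${videoID}: ${lastError.message}`);
}
```

When the wrapper throws, that is the point to tell the user the transcript is unavailable and offer the alternatives listed above.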

If MCP server not installed:

  • Provide installation instructions
  • Offer to install it automatically if in appropriate context

If video ID extraction fails:

  • Ask user to provide the full YouTube URL or video ID

If video is age-restricted or private:

  • Inform user that transcript cannot be accessed due to restrictions
  • Suggest checking video privacy settings

Examples

Example 1: Tech Tutorial

Input: https://youtube.com/watch?v=VIDEO_ID

Output:

📹 **Video:** "Building a SaaS from Scratch: Lessons from $10K MRR"
👤 **Channel:** IndieHackerTV | 👁️ **Views:** 124K | 📅 **Published:** Feb 1, 2026

**🎯 Main Thesis:**
Most SaaS founders fail because they build for 6 months before talking to customers. The path to $10K MRR is customer conversations first, MVP second.

**💡 Key Insights:**
- Interviewed 50 potential customers before writing a single line of code
- First paid customer signed up with a Figma mockup (no product built yet)
- Charged $99/month from day 1 (no free tier, no discounts)
- Spent $0 on ads; all growth from Twitter + Reddit engagement
- Hit $10K MRR in 9 months by saying "no" to feature requests that didn't fit ICP

**📝 Notable Points:**
- Used Stripe payment links before building a billing system
- First 3 customers came from solving their problem in public on Twitter
- Weekly "build in public" updates on Twitter drove 40% of signups

**🔑 Takeaway:**
Validate demand before building. If 10 people won't pay for a mockup, 1,000 won't pay for the real product.

Example 2: Business Strategy Video

Input: https://youtu.be/VIDEO_ID

Output:

📹 **Video:** "Why Notion's Business Model is Genius"
👤 **Channel:** SaaS Breakdowns | 👁️ **Views:** 456K | 📅 **Published:** Jan 28, 2026

**🎯 Main Thesis:**
Notion's growth strategy flips traditional SaaS: give away the product for free to individuals, monetize when they bring it to work.

**💡 Key Insights:**
- 80% of Notion's enterprise deals started with a single employee using the free plan
- Bottom-up adoption = zero sales team needed for first $10M ARR
- Templates marketplace created a content flywheel (100K+ free templates)
- Personal use (free) → Team use (paid) conversion rate: 23% (industry avg: 2-5%)
- Community evangelism replaced traditional marketing (4M+ Reddit/Discord members)

**📝 Notable Points:**
- Notion's viral coefficient: 1.4 (every user invites 1.4 others on average)
- Template creators drive 30% of new user acquisition
- Pricing strategy: free until 10 people = no friction to start

**🔑 Takeaway:**
Build a product individuals love first. Enterprise sales will follow when employees demand it at work.

Quality Guidelines

  • Be concise: Summary should be scannable in 30 seconds
  • Be accurate: Don't add information not in the transcript
  • Be structured: Use consistent formatting for easy reading
  • Be contextual: Adjust detail level based on video length
    • Short videos (<5 min): Brief summary (3 key insights)
    • Medium videos (5-30 min): Standard format (5 key insights)
    • Long videos (>30 min): Detailed breakdown (7+ insights, split into sections if needed)
  • Extract value: Focus on actionable insights, data points, and contrarian takes (not generic advice)
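
The length-based guideline above maps directly to a lookup. This is a sketch: the video duration is a hypothetical input (it is not among the metadata fields listed in the workflow), and `insightTarget` is an illustrative name.

```javascript
// Sketch: choose the summary format and insight count from video length in minutes.
// Thresholds follow the guideline above; the duration source is an assumption.
function insightTarget(durationMinutes) {
  if (durationMinutes < 5) return { format: 'brief', insights: 3 };
  if (durationMinutes <= 30) return { format: 'standard', insights: 5 };
  // For long videos 7 is a minimum ("7+"), and sections may be added.
  return { format: 'detailed', insights: 7 };
}
```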

Pro Tips

For Better Summaries:

  1. Prioritize data points - Numbers, percentages, study citations stand out
  2. Extract quotes - Memorable one-liners make summaries shareable
  3. Identify frameworks - If video presents a method/process, extract the steps
  4. Spot contrarian takes - Unconventional wisdom is more valuable than common advice
  5. Note proof - Examples, case studies, before/after results add credibility

For Research Workflows:

  1. Build a transcript library - Organize by topic/niche for pattern spotting
  2. Search across transcripts - Use grep or text search to find mentions of specific topics
  3. Track trends - Same topic across multiple videos = rising trend
  4. Extract prompts - Save useful frameworks/methods as reusable prompts
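
Searching across transcripts and tracking trends can be as simple as counting mentions per file. The sketch below works on transcripts already loaded into memory; the `{ file, text }` shape and the name `countMentions` are assumptions for illustration.

```javascript
// Sketch: count case-insensitive mentions of a term in each transcript and
// return the files that mention it, most mentions first.
function countMentions(transcripts, term) {
  const needle = String(term).toLowerCase();
  if (!needle) return []; // an empty term would match everywhere

  return transcripts
    .map(({ file, text }) => {
      const haystack = text.toLowerCase();
      let count = 0;
      let index = haystack.indexOf(needle);
      while (index !== -1) {
        count += 1;
        index = haystack.indexOf(needle, index + needle.length);
      }
      return { file, count };
    })
    .filter((hit) => hit.count > 0)
    .sort((a, b) => b.count - a.count);
}
```

The number of files returned gives a rough trend signal: a term that shows up in many recent transcripts is worth a closer look.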

For Content Creation:

  1. Find content gaps - What questions are asked but not fully answered?
  2. Analyze top performers - What structure/pacing do high-view videos use?
  3. Extract hooks - First 30 seconds of transcript = proven hook patterns
  4. Repurpose insights - Turn video insights into Twitter threads, blog posts, newsletters

Configuration

Standard Mode (default)

youtube-summarizer [URL]

  • Fetches transcript in English
  • Generates structured summary
  • Saves transcript to file
  • Sends to messaging platform if applicable

Quick Mode

youtube-summarizer [URL] --quick

  • 3-bullet TL;DR + single key takeaway only
  • No transcript file saved
  • Faster processing for rapid research

Deep Dive Mode

youtube-summarizer [URL] --deep

  • Extended summary with timestamps
  • Section-by-section breakdown for long videos
  • Includes all notable quotes

Language-Specific

youtube-summarizer [URL] --lang es

  • Fetches transcript in specified language
  • Falls back to English if unavailable
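
The invocation forms above can be reduced to a small options object. This is a sketch of how the flags might be interpreted, not the skill's actual parser; `parseInvocation` is a hypothetical name, and only the `--quick`, `--deep`, and `--lang` flags shown in this section are handled.

```javascript
// Sketch: turn an invocation such as ['URL', '--lang', 'es'] into options.
// Defaults follow Standard Mode: standard summary, English transcript.
function parseInvocation(argv) {
  const options = { url: null, mode: 'standard', lang: 'en' };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--quick') {
      options.mode = 'quick';
    } else if (arg === '--deep') {
      options.mode = 'deep';
    } else if (arg === '--lang') {
      const value = argv[++i];
      if (!value || value.startsWith('--')) throw new Error('--lang requires a language code');
      options.lang = value;
    } else if (!arg.startsWith('--') && options.url === null) {
      options.url = arg;
    } else {
      throw new Error(`Unknown argument: ${arg}`);
    }
  }
  if (options.url === null) throw new Error('A YouTube URL or video ID is required');
  // Quick Mode does not save a transcript file; the other modes do.
  options.saveTranscript = options.mode !== 'quick';
  return options;
}
```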

Installation & Setup

# 1. Clone and install MCP server
cd /root/clawd
git clone https://github.com/kimtaeyoon83/mcp-server-youtube-transcript.git
cd mcp-server-youtube-transcript
npm install && npm run build

# 2. Test installation
node --input-type=module -e "
import { getSubtitles } from './dist/youtube-fetcher.js';
const result = await getSubtitles({ videoID: 'dQw4w9WgXcQ', lang: 'en' });
console.log(result.metadata.title);
"

# 3. Create transcripts directory
mkdir -p /root/clawd/transcripts

# 4. Verify skill is ready
youtube-summarizer --check-setup

Common Issues

Issue: "Transcript not available"

  • Cause: Video has no captions/subtitles enabled
  • Fix: Ask video creator to enable captions, or try a different video

Issue: "Failed to fetch transcript" (on VPS)

  • Cause: YouTube may have updated their API
  • Fix: Update MCP server: cd /root/clawd/mcp-server-youtube-transcript && git pull && npm install && npm run build

Issue: "Video ID not recognized"

  • Cause: Malformed URL or unsupported format
  • Fix: Copy URL directly from YouTube address bar

Future Enhancements (Roadmap)

  • Multi-video batch processing (analyze playlists)
  • Sentiment analysis on transcript (positive/negative/neutral tone)
  • Speaker diarization (identify different speakers in interviews/panels)
  • Automatic chapter detection (split long videos into logical sections)
  • Cross-video pattern analysis (find common themes across multiple videos)

Support

Issues or suggestions? Provide:

  • YouTube URL that failed
  • Error message (if any)
  • Expected vs actual behavior
  • MCP server version: cd /root/clawd/mcp-server-youtube-transcript && git rev-parse HEAD

Built on MCP YouTube Transcript server (Android emulation for cloud reliability). Turn any YouTube video into structured, searchable knowledge in 20 seconds.
