search-youtube
YouTube Research
Multi-platform video research toolkit. Operates in two modes: toolkit (individual operations)
and research (autonomous search-to-synthesis pipeline). All operations use a single CLI at
${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py.
Run python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py <subcommand> --help
for full flag reference on any subcommand.
Toolkit Mode
Invoke individual subcommands for targeted operations. This is the default mode when the user requests a specific action (transcript, search, metadata, audio, channel scan).
Search
Find videos matching a query. Returns structured results with metadata.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py search "<query>" --count 10
Add filters to narrow results: --min-duration 600 (seconds), --after 20250101 (YYYYMMDD),
--min-views 50000. Filters are applied client-side after fetching, so the tool over-fetches
automatically to compensate. Output is JSON by default; add -f text for human-readable output.
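The client-side filtering described above can be pictured with a short sketch. This is not the tool's implementation: `apply_filters` is a hypothetical helper, the field names `duration`, `upload_date`, and `view_count` are assumed (they mirror common yt-dlp metadata keys), and treating `--after` as inclusive is also an assumption.

```python
from typing import Optional


def apply_filters(
    results: list,
    min_duration: Optional[int] = None,  # seconds, like --min-duration
    after: Optional[str] = None,         # YYYYMMDD, like --after
    min_views: Optional[int] = None,     # like --min-views
    count: int = 10,                     # like --count
) -> list:
    """Keep results that pass every active filter, then trim to `count`."""
    kept = []
    for item in results:
        if min_duration is not None and (item.get("duration") or 0) < min_duration:
            continue
        # YYYYMMDD strings sort chronologically, so string comparison is enough.
        # Assumption: the cutoff date itself is kept (inclusive).
        if after is not None and (item.get("upload_date") or "") < after:
            continue
        if min_views is not None and (item.get("view_count") or 0) < min_views:
            continue
        kept.append(item)
    return kept[:count]


# Hypothetical over-fetched pool: more entries than requested, some filtered out.
pool = [
    {"id": "a", "duration": 900, "upload_date": "20250310", "view_count": 80000},
    {"id": "b", "duration": 120, "upload_date": "20250401", "view_count": 90000},
    {"id": "c", "duration": 1800, "upload_date": "20241201", "view_count": 70000},
    {"id": "d", "duration": 700, "upload_date": "20250105", "view_count": 60000},
]
filtered = apply_filters(pool, min_duration=600, after="20250101", min_views=50000, count=2)
print([video["id"] for video in filtered])
```

Because filtering happens after the fetch, a pool that is too small can leave fewer survivors than `--count` asked for, which is why over-fetching is needed.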
Transcript
Download and clean subtitles to LLM-ready text.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py transcript "<url>"
Outputs clean prose to stdout by default. Add --timestamps for SRT with timing cues.
Add --save -t <topic> to persist to ~/youtube-research/<topic>/. Use --lang all to
list available subtitle languages before downloading. Fallback chain: manual subtitles
first, then auto-generated. Exits with code 4 if no subtitles exist in the requested language.
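The fallback chain can be sketched as a lookup over a yt-dlp-style info dict, where `subtitles` holds manually uploaded tracks and `automatic_captions` holds auto-generated ones. `pick_subtitle_source` is a hypothetical helper for illustration and may differ from how the script actually selects a track.

```python
from typing import Optional


def pick_subtitle_source(info: dict, lang: str = "en") -> Optional[str]:
    """Return "manual", "auto", or None for the requested language."""
    # Manual subtitles win when they exist for the language.
    if (info.get("subtitles") or {}).get(lang):
        return "manual"
    # Otherwise fall back to auto-generated captions.
    if (info.get("automatic_captions") or {}).get(lang):
        return "auto"
    # Nothing available: the CLI reports this case with exit code 4.
    return None


# Hypothetical metadata: manual English subtitles, auto-generated German only.
info = {
    "subtitles": {"en": [{"ext": "vtt"}]},
    "automatic_captions": {"en": [{"ext": "vtt"}], "de": [{"ext": "vtt"}]},
}
print(pick_subtitle_source(info, "en"))
print(pick_subtitle_source(info, "de"))
print(pick_subtitle_source(info, "fr"))
```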
Metadata
Extract full video information without downloading.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py metadata "<url>"
Returns: title, description, channel, duration, chapters, view/like counts, tags, available
subtitle languages, thumbnail URL. Add --playlist for playlist entry listings.
Audio
Download audio in the requested format.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py audio "<url>" --audio-format mp3 -t <topic>
Saves to ~/youtube-research/<topic>/audio/. Supported formats: mp3, m4a, opus, wav.
Always saves to disk (audio cannot go to stdout). Prints the file path on success.
Channel
Scan a channel's content.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py channel "<url-or-@handle>" --limit 20
Supports tabs: --tab videos (default), shorts, streams, playlists. Filter with
--after/--before (YYYYMMDD). Sort with --sort views for most-viewed-first.
Batch Processing
Any subcommand except search accepts --batch <file> (or --batch - for stdin) to
process multiple URLs. One URL per line; lines starting with # or ; are skipped.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py transcript --batch urls.txt --save -t <topic>
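The batch file rules above (one URL per line, `#` and `;` lines skipped) amount to a small parser. The sketch below uses a hypothetical `parse_batch_lines` helper with placeholder URLs; skipping blank lines is an assumption, not something the description states.

```python
def parse_batch_lines(lines):
    """Return the URLs from batch input, dropping comment lines."""
    urls = []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines are ignored along with # and ; comments.
        if not line or line.startswith(("#", ";")):
            continue
        urls.append(line)
    return urls


sample = """# talks to transcribe
https://www.youtube.com/watch?v=VIDEO_ID_1
; skipped for now
https://www.youtube.com/watch?v=VIDEO_ID_2

""".splitlines()
batch_urls = parse_batch_lines(sample)
print(batch_urls)
```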
Research Mode
Activate when the user asks to "research", "investigate", or "find out about" a topic using YouTube as a source. This is an adaptive multi-round discovery pipeline designed for niche and emerging topics, where the most popular videos often cover the subject poorly.
Round 1: Divergent Search
Generate 4-6 query variants that cover different angles of the topic:
- Exact tool/concept name (e.g., "openclaw")
- Tool + ecosystem context (e.g., "openclaw claude code")
- Problem the tool solves (e.g., "claude code documentation framework")
- Workflow/demo framing (e.g., "openclaw walkthrough demo")
- Alternative names, abbreviations, or common misspellings if applicable
Spawn one Task agent per query variant simultaneously, each running:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py search "<query-variant>" --count 15
Collect all results and deduplicate by video ID. Aim for 30-50 unique candidates across all threads.
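Merging the parallel search threads and deduplicating by video ID might look like the sketch below. It assumes each search returns a list of objects with an `id` field; `dedupe_by_id` is a hypothetical helper, and the first occurrence of each video is the one kept.

```python
def dedupe_by_id(result_lists):
    """Merge several result lists, keeping the first entry seen per video ID."""
    seen = {}
    for results in result_lists:
        for video in results:
            video_id = video.get("id")
            if video_id and video_id not in seen:
                seen[video_id] = video
    # Dicts preserve insertion order, so discovery order is retained.
    return list(seen.values())


# Hypothetical output from two query variants with one overlapping video.
thread_a = [{"id": "VIDEO_ID_1", "title": "first"}, {"id": "VIDEO_ID_2", "title": "second"}]
thread_b = [{"id": "VIDEO_ID_2", "title": "second (duplicate)"}, {"id": "VIDEO_ID_3", "title": "third"}]
candidates = dedupe_by_id([thread_a, thread_b])
print([video["id"] for video in candidates])
```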
Round 1 Evaluation: Niche-First Heuristics
For niche, emerging, or edge-of-tech topics, these signals predict quality:
Positive signals (use these):
- Title specifically names the tool or concept (not "top AI tools 2025")
- Small channel (< 50K subscribers) — for new tech, practitioners publish before educators discover the topic
- Technical, specific description (mentions code, config, architecture, or commands)
- Structured content: chapters, timestamps, or detailed description
- Recent upload date
Negative signals (treat as red flags on niche topics):
- View count > 100K — on a narrow topic this usually means beginner-level or clickbait
- "Tutorial for beginners" / "complete guide" in the title for brand-new tools
- Large generalist channel covering many unrelated topics
Select 6-10 candidates from the combined pool. Note which channels produced the strongest results — those are targets for Round 2.
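One way to apply these signals consistently is a rough scoring pass before selecting by hand. The sketch below is illustrative only: the weights are arbitrary, `niche_score` is a hypothetical helper, and the field names (`title`, `description`, `subscribers`, `chapters`, `view_count`) are assumed rather than taken from the tool's output. Upload recency and channel breadth are left to judgment because the heuristics give no thresholds for them.

```python
TECHNICAL_TERMS = ("code", "config", "architecture", "command")


def niche_score(video: dict, topic: str) -> int:
    """Score a candidate against the niche-first signals (higher is better)."""
    score = 0
    title = (video.get("title") or "").lower()
    description = (video.get("description") or "").lower()

    if topic.lower() in title:
        score += 2  # title specifically names the tool or concept
    subscribers = video.get("subscribers")
    if subscribers is not None and subscribers < 50_000:
        score += 1  # small channel
    if any(term in description for term in TECHNICAL_TERMS):
        score += 1  # technical, specific description
    if video.get("chapters"):
        score += 1  # structured content

    if (video.get("view_count") or 0) > 100_000:
        score -= 2  # red flag on a narrow topic
    if "for beginners" in title or "complete guide" in title:
        score -= 1  # beginner framing for a brand-new tool
    return score


# Hypothetical candidates.
specific = {
    "title": "OpenClaw walkthrough: hooks and layout",
    "description": "Covers the config files and overall architecture.",
    "subscribers": 12_000,
    "chapters": [{"title": "Setup"}],
    "view_count": 4_000,
}
generic = {
    "title": "Top AI Tools 2025: Complete Guide for Beginners",
    "description": "",
    "subscribers": 2_000_000,
    "chapters": [],
    "view_count": 500_000,
}
ranked = sorted([generic, specific], key=lambda v: niche_score(v, "openclaw"), reverse=True)
print([(v["title"], niche_score(v, "openclaw")) for v in ranked])
```

A score like this only orders the pool; the final choice of 6-10 candidates still depends on reading titles and descriptions.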
Round 2: Channel Discovery and Refinement
For each channel that surfaced a strong Round 1 result, scan its recent videos:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py channel "<channel-url>" --limit 20
Also run 1-2 refined search queries using specific terminology that appeared in strong Round 1 titles or descriptions. Add any new candidates to the pool.
Quality gate: If Round 1 candidates are weak (generic titles, all high-view generalist content, nothing specifically about the topic), surface this to the user and run another search round with reformulated queries before proceeding to transcripts.
Round 3: Confirm and Transcribe
From the enriched candidate pool, select 4-7 videos using the Round 1 criteria. Extract metadata to confirm relevance before committing to downloads:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py metadata "<url>"
Download transcripts in parallel: always spawn one Task agent per video, even when there are only two videos:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py transcript "<url>" --save -t <topic>
Read each transcript and synthesize findings into a report. Use WebSearch or brave-cli
to cross-reference claims or fill gaps when YouTube sources disagree or leave questions
unanswered.
Research Report Format
Present the synthesis as a structured markdown report:
- Title and one-paragraph summary of the research question
- Key findings (3-7 bullet points of the most important takeaways across all sources)
- Points of agreement between sources (what multiple videos confirm)
- Points of disagreement (where sources contradict, with attribution)
- Unique insights (notable points from individual videos not repeated elsewhere)
- Gaps in coverage (what the sources collectively missed)
- Sources table: video title, channel, duration, and URL for each video used
Attribute specific claims to their source video. Include timestamps when the transcript
preserves them. Save all transcripts under the topic directory for future reference.
See examples/research-report.md for a sample report structure.
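The sources table at the end of the report can be generated from the metadata already collected. This is a minimal sketch with hypothetical helpers (`format_duration`, `sources_table`) and assumed field names; the sample entries are placeholders, not real videos.

```python
def format_duration(seconds: int) -> str:
    """Format a duration in seconds as M:SS or H:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def sources_table(videos) -> str:
    """Render a markdown table with title, channel, duration, and URL."""
    lines = ["| Title | Channel | Duration | URL |", "|---|---|---|---|"]
    for video in videos:
        # Escape pipes so a title cannot break the table layout.
        title = video["title"].replace("|", "\\|")
        lines.append(
            f"| {title} | {video['channel']} | "
            f"{format_duration(video['duration'])} | {video['url']} |"
        )
    return "\n".join(lines)


sources = [
    {"title": "Example talk", "channel": "Example Channel", "duration": 754,
     "url": "https://www.youtube.com/watch?v=VIDEO_ID_1"},
    {"title": "Deep dive | part 1", "channel": "Another Channel", "duration": 3725,
     "url": "https://www.youtube.com/watch?v=VIDEO_ID_2"},
]
table = sources_table(sources)
print(table)
```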
Research Composability
See references/cli-reference.md for pipeline patterns that chain subcommands with
standard Unix tools (search → jq → batch transcript).
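For readers who prefer Python to jq, the middle step of that chain (search output to batch input) could be sketched as follows. The JSON shape is an assumption: the sketch expects a top-level array of objects with a `url` field, which may not match the real search output, and `batch_text_from_search` is a hypothetical helper.

```python
import json


def batch_text_from_search(search_json: str, limit: int = 5) -> str:
    """Turn search output into batch-file text: one URL per line."""
    results = json.loads(search_json)
    # Assumption: each result object carries its watch URL under "url".
    urls = [item["url"] for item in results if item.get("url")]
    return "".join(f"{url}\n" for url in urls[:limit])


# Hypothetical search output; the third entry has no URL and is dropped.
sample_output = json.dumps([
    {"id": "VIDEO_ID_1", "url": "https://www.youtube.com/watch?v=VIDEO_ID_1"},
    {"id": "VIDEO_ID_2", "url": "https://www.youtube.com/watch?v=VIDEO_ID_2"},
    {"id": "VIDEO_ID_3"},
])
batch_text = batch_text_from_search(sample_output)
print(batch_text, end="")
```

The resulting text can be written to a file for `--batch <file>` or piped to `--batch -`.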
Error Recovery
| Exit Code | Meaning | Recovery Action |
|---|---|---|
| 0 | Success | — |
| 1 | Usage error | Check --help for correct syntax |
| 2 | yt-dlp not found | Tell user to install: pip install yt-dlp |
| 3 | Network/download error | Check URL validity; try --cookies <browser> for private/restricted content |
| 4 | No results | For transcripts: try --lang all to list available languages. For search: broaden query or remove filters |
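A wrapper script can turn the table above into a lookup keyed on the process exit code. The sketch below is an assumption-laden illustration: `run_cli` and `recovery_action` are hypothetical helpers, the message for unlisted codes is invented, and `run_cli` is defined but not called because it needs the real script and `CLAUDE_PLUGIN_ROOT` to be set.

```python
import os
import subprocess
from typing import Optional

# Recovery actions copied from the exit code table.
RECOVERY = {
    0: None,
    1: "Check --help for correct syntax",
    2: "Tell user to install: pip install yt-dlp",
    3: "Check URL validity; try --cookies <browser> for private/restricted content",
    4: "Transcripts: try --lang all to list languages. Search: broaden query or remove filters",
}


def recovery_action(exit_code: int) -> Optional[str]:
    """Map an exit code to its suggested recovery step (None means success)."""
    # Assumption: codes outside the table get a generic fallback message.
    return RECOVERY.get(exit_code, f"Unrecognized exit code {exit_code}; check --help")


def run_cli(*args: str) -> subprocess.CompletedProcess:
    """Run yt_research.py with the given arguments and capture its output."""
    script = os.path.join(
        os.environ["CLAUDE_PLUGIN_ROOT"],
        "skills", "search-youtube", "scripts", "yt_research.py",
    )
    return subprocess.run(["python3", script, *args], capture_output=True, text=True)


print(recovery_action(4))
```

A caller would pass `run_cli(...).returncode` to `recovery_action` and act on any non-`None` result.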
Platform Notes
Load references/platforms.md when processing a non-YouTube URL or when a yt-dlp command
fails with exit code 3 on an unfamiliar platform. YouTube is the primary platform, but any
yt-dlp-supported URL works (Vimeo, Twitter, Twitch, etc.).
After extracting a transcript, read the output and summarize key points for the user unless they asked for raw output only.