notebooklm
NotebookLM Automation
Automate Google NotebookLM: create notebooks, add sources, chat with content, generate artifacts (podcasts, videos, quizzes), and download results.
Prerequisites
IMPORTANT: Before using any command, you MUST authenticate:
notebooklm login # Opens browser for Google OAuth
notebooklm list # Verify authentication works
If commands fail with authentication errors, re-run notebooklm login.
CI/CD, Multiple Accounts, and Parallel Agents
For automated environments, multiple accounts, or parallel agent workflows:
| Variable | Purpose |
|---|---|
| `NOTEBOOKLM_HOME` | Custom config directory (default: `~/.notebooklm`) |
| `NOTEBOOKLM_AUTH_JSON` | Inline auth JSON - no file writes needed |
CI/CD setup: Set NOTEBOOKLM_AUTH_JSON from a secret containing your storage_state.json contents.
Multiple accounts: Use different NOTEBOOKLM_HOME directories per account.
Parallel agents: The CLI stores notebook context in a shared file (~/.notebooklm/context.json). Multiple concurrent agents using notebooklm use can overwrite each other's context.
Solutions for parallel workflows:
- Always use an explicit notebook ID (recommended): pass `-n <notebook_id>` (for wait/download commands) or `--notebook <notebook_id>` (for others) instead of relying on `use`
- Per-agent isolation: set a unique `NOTEBOOKLM_HOME` per agent: `export NOTEBOOKLM_HOME=/tmp/agent-$ID`
- Use full UUIDs: avoid partial IDs in automation (they can become ambiguous)
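A minimal per-agent isolation sketch; the `agent_home` helper and the `/tmp` path layout are illustrative, not part of the CLI:

```shell
# Hypothetical helper: derive an isolated config dir from an agent ID.
agent_home() {
  echo "/tmp/notebooklm-agent-$1"
}

# Each agent exports its own home before running any notebooklm command.
export NOTEBOOKLM_HOME="$(agent_home "${AGENT_ID:-0}")"
mkdir -p "$NOTEBOOKLM_HOME"
```

Combined with explicit `-n`/`--notebook` flags, this keeps concurrent agents from clobbering each other's context files.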
Agent Setup Verification
Before starting workflows, verify the CLI is ready:
- `notebooklm status` → should show "Authenticated as: email@..."
- `notebooklm list --json` → should return valid JSON (even if the notebooks list is empty)
- If either fails → run `notebooklm login`
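A preflight sketch of these checks. `nlm_cmd` is a local indirection (not part of the CLI) so the logic can be exercised without a live session; the "Authenticated as:" string is the status output described above:

```shell
# Indirection over the CLI so the check is easy to stub or swap out.
nlm_cmd() { notebooklm "$@"; }

# Prints "ready" only when status reports an authenticated account.
verify_ready() {
  if nlm_cmd status 2>/dev/null | grep -q "Authenticated as:"; then
    echo "ready"
  else
    echo "run: notebooklm login"
    return 1
  fi
}
```

Run `verify_ready` once at the start of a workflow and abort early if it fails.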
When This Skill Activates
Explicit: User says "/notebooklm", "use notebooklm", or mentions the tool by name
Intent detection: Recognize requests like:
- "Create a podcast about [topic]"
- "Summarize these URLs/documents"
- "Generate a quiz from my research"
- "Turn this into an audio overview"
- "Add these sources to NotebookLM"
- "Create a notebook from [Twitter/X link]"
- "Extract content from this [social media post]"
Progressive fallback strategy (graceful degradation):
- First attempt: regular URL addition via `notebooklm source add "url"` (always try first)
- Detect failure conditions: monitor for errors like:
- Anti-crawler protection (Twitter/X, LinkedIn)
- Login required errors
- 403/404 access denied
- Processing failures/timeouts
- Graceful fallback: Only then use browser extraction:
- Navigate via Playwright
- Extract content using JavaScript
- Save to `/tmp/` before adding to NotebookLM
- User notification: Always inform user of the method used:
- "✓ Added via direct URL" (success with regular method)
- "⚠️ Required browser extraction (anti-crawler)" (fallback used)
Autonomy Rules
Run automatically (no confirmation):
- `notebooklm status` - check context
- `notebooklm list` - list notebooks
- `notebooklm source list` - list sources
- `notebooklm artifact list` - list artifacts
- `notebooklm artifact wait` - wait for artifact completion (in subagent context)
- `notebooklm source wait` - wait for source processing (in subagent context)
- `notebooklm research status` - check research status
- `notebooklm research wait` - wait for research (in subagent context)
- `notebooklm use <id>` - set context (⚠️ SINGLE-AGENT ONLY - use the `-n` flag in parallel workflows)
- `notebooklm create` - create notebook
- `notebooklm ask "..."` - chat queries
- `notebooklm source add` - add sources
Smart Decision Framework (Pragmatic Approach)
Installation Status System:
# Installation flag - prevents redundant checks
INSTALL_FLAG="$HOME/.cache/notebooklm/.playwright_installed"

# Function to check installation status (only once per session)
check_install_status() {
  # Fast check only when explicitly asked
  if [[ "$1" == "--check-install-only" ]]; then
    if [ -f "$INSTALL_FLAG" ]; then
      echo "✓ Playwright is marked as installed"
      return 0
    else
      echo "⚠️ Playwright not marked, checking installation..."
      # Option A (recommended): quick import check
      if timeout 3 python3 -c "import playwright" 2>/dev/null; then
        echo "✓ Playwright Python library installed"
        mkdir -p "$(dirname "$INSTALL_FLAG")"
        touch "$INSTALL_FLAG"
        return 0
      fi
      return 1
    fi
  fi

  # Option B: auto-install when the library is missing
  if ! timeout 3 python3 -c "import playwright" 2>/dev/null; then
    echo "📦 Installing Playwright..."
    pip3 install playwright
    python3 -m playwright install chromium
  fi
  mkdir -p "$(dirname "$INSTALL_FLAG")"
  touch "$INSTALL_FLAG"
  echo "✓ Installation complete"
}
Platform and Content Detection:
# Function: Platform type detection
detect_platform_type() {
local url="$1"
# Twitter/X: Always anti-crawler
if [[ "$url" =~ (twitter\.com|x\.com) ]]; then
echo "ANTICRAWLER_TWITTER"
return 0
# Other known anti-crawler platforms
elif [[ "$url" =~ (linkedin\.com|facebook\.com|instagram\.com) ]]; then
echo "ANTICRAWLER_OTHER"
return 1
# Open web (could support direct addition)
else
echo "OPEN_WEB"
return 2
fi
}
- Browser-based extraction for anti-crawler URLs (Twitter/X, LinkedIn, etc.):
- Automatically detect protected platforms
- Use Playwright automation to extract content
- Save to `/tmp/` before adding to NotebookLM
- No confirmation needed for the extraction process
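The detection-and-routing decision above can be sketched as a standalone router. The patterns are simplified examples (subdomains like `www.` may need extra cases), and `route_url` is a hypothetical helper, not a CLI command:

```shell
# Choose an ingestion method from the URL alone.
route_url() {
  case "$1" in
    *twitter.com/*|*//x.com/*|*linkedin.com/*|*facebook.com/*|*instagram.com/*)
      echo "browser-extract" ;;   # anti-crawler: go straight to Playwright
    *)
      echo "direct-add" ;;        # open web: try `notebooklm source add` first
  esac
}
```

A caller would branch on the output: `browser-extract` skips the direct attempt entirely, while `direct-add` still falls back to extraction if the source ends up FAILED.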
Ask before running:
- `notebooklm delete` - destructive
- `notebooklm generate *` - long-running, may fail
- `notebooklm download *` - writes to filesystem
- `notebooklm artifact wait` - long-running (when in main conversation)
- `notebooklm source wait` - long-running (when in main conversation)
- `notebooklm research wait` - long-running (when in main conversation)
Quick Reference
| Task | Command |
|---|---|
| Authenticate | notebooklm login |
| List notebooks | notebooklm list |
| Create notebook | notebooklm create "Title" |
| Set context | notebooklm use <notebook_id> |
| Show context | notebooklm status |
| Add URL source | notebooklm source add "https://..." |
| Add file | notebooklm source add ./file.pdf |
| Add YouTube | notebooklm source add "https://youtube.com/..." |
| Extract Twitter/X via browser | Use Playwright skill → Save → notebooklm source add ./file.md |
| Extract anti-crawler content | Use browser automation → Save → Add to notebook |
| List sources | notebooklm source list |
| Wait for source processing | notebooklm source wait <source_id> |
| Web research (fast) | notebooklm source add-research "query" |
| Web research (deep) | notebooklm source add-research "query" --mode deep --no-wait |
| Check research status | notebooklm research status |
| Wait for research | notebooklm research wait --import-all |
| Chat | notebooklm ask "question" |
| Chat (new conversation) | notebooklm ask "question" --new |
| Chat (specific sources) | notebooklm ask "question" -s src_id1 -s src_id2 |
| Generate podcast | notebooklm generate audio "instructions" |
| Generate podcast (specific sources) | notebooklm generate audio -s src_id1 -s src_id2 |
| Generate video | notebooklm generate video "instructions" |
| Generate quiz | notebooklm generate quiz |
| Check artifact status | notebooklm artifact list |
| Wait for completion | notebooklm artifact wait <artifact_id> |
| Download audio | notebooklm download audio ./output.mp3 |
| Download video | notebooklm download video ./output.mp4 |
| Delete notebook | notebooklm notebook delete <id> |
Parallel safety: Use explicit notebook IDs in parallel workflows. Commands supporting -n shorthand: artifact wait, source wait, research wait/status, download *. Other commands use --notebook. For chat, use --new to start fresh conversations (avoids conversation ID conflicts).
Partial IDs: Use first 6+ characters of UUIDs. Must be unique prefix (fails if ambiguous). Works for: use, delete, wait commands. For automation, prefer full UUIDs to avoid ambiguity.
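The ambiguity rule can be illustrated with a small resolver sketch. `resolve_id` is hypothetical (the CLI does this internally); it mirrors the behavior: exactly one prefix match resolves, zero or multiple matches fail:

```shell
# Hypothetical prefix resolver. Usage: resolve_id <prefix> <id>...
resolve_id() {
  local prefix="$1"; shift
  local matches=()
  local id
  for id in "$@"; do
    if [[ "$id" == "$prefix"* ]]; then
      matches+=("$id")
    fi
  done
  # Exactly one match: resolve; otherwise fail (unknown or ambiguous).
  if [ "${#matches[@]}" -eq 1 ]; then
    echo "${matches[0]}"
  else
    return 1
  fi
}
```

This is why automation should prefer full UUIDs: a prefix that is unique today can become ambiguous as notebooks accumulate.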
Command Output Formats
Commands with --json return structured data for parsing:
Create notebook:
$ notebooklm create "Research" --json
{"id": "abc123de-...", "title": "Research"}
Add source:
$ notebooklm source add "https://example.com" --json
{"source_id": "def456...", "title": "Example", "status": "PROCESSING"}
Generate artifact:
$ notebooklm generate audio "Focus on key points" --json
{"artifact_id": "xyz789...", "status": "PENDING", "type": "AUDIO_OVERVIEW"}
Extract IDs: Parse the id, source_id, or artifact_id field from JSON output.
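A parsing sketch, assuming `jq` is available; the JSON literals stand in for real `--json` output shaped like the examples above:

```shell
# Parse IDs from --json output (sample strings mirror the CLI's schemas).
create_json='{"id": "abc123de-4567", "title": "Research"}'
gen_json='{"artifact_id": "xyz789", "status": "PENDING", "type": "AUDIO_OVERVIEW"}'

notebook_id=$(echo "$create_json" | jq -r '.id')
artifact_id=$(echo "$gen_json" | jq -r '.artifact_id')

echo "notebook=$notebook_id artifact=$artifact_id"
```

In a real workflow, replace the literals with command substitutions, e.g. `create_json=$(notebooklm create "Research" --json)`.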
Generation Types
All generate commands support -s, --source to use specific source(s) instead of all sources.
| Type | Command | Downloadable |
|---|---|---|
| Podcast | `generate audio` | Yes (.mp3) |
| Video | `generate video` | Yes (.mp4) |
| Slides | `generate slide-deck` | Yes (.pdf) |
| Infographic | `generate infographic` | Yes (.png) |
| Quiz | `generate quiz` | No (view in UI) |
| Flashcards | `generate flashcards` | No (view in UI) |
| Mind Map | `generate mind-map` | No (view in UI) |
| Data Table | `generate data-table` | No (export to Sheets) |
| Report | `generate report` | No (export to Docs) |
Common Workflows
Smart Extraction Workflow (Scene-Based)
Scenario 1: Twitter/X (Anti-Crawler)
# Direct path: Skip regular method, use browser extraction immediately
extract_twitter_content() {
local url="$1"
# Navigate
skill_mcp playwright browser_navigate --arguments "{\"url\": \"$url\"}"
  # Wait and extract with multi-selector strategy; capture the result
  local EXTRACTED_CONTENT
  EXTRACTED_CONTENT=$(skill_mcp playwright browser_run_code --arguments '{
    "code": "async (page) => {
      await page.waitForTimeout(6000);
      // Multi-selector priority (no console.log: keep captured output clean)
      const selectors = [
        \"article [data-testid='\''tweetText'\'']\",  // Primary: tweet text
        \"div[data-testid='\''tweet'\''] span\",      // Fallback 1
        \"[role='\''article'\'']\"                    // Fallback 2
      ];
      let content = null;
      for (const selector of selectors) {
        try {
          const element = page.locator(selector).first();
          if (await element.count() > 0) {
            content = await element.textContent();
            if (content && content.trim().length > 0) break;
          }
        } catch (e) {
          // Try next selector
        }
      }
      // Ultimate fallback
      if (!content) {
        content = await page.evaluate(() => {
          const articles = document.querySelectorAll(\"article\");
          return Array.from(articles).map(a => a.innerText).join(\"\\n---\\n\");
        });
      }
      return { url: page.url(), title: await page.title(), content: content?.trim() };
    }"
  }')
# Save to file and add to NotebookLM
local filename="/tmp/twitter_extract_$(date +%s).md"
echo "$EXTRACTED_CONTENT" > "$filename"
notebooklm source add "$filename" --notebook "$NOTEBOOK_ID"
echo "⚠️ Browser extraction used (anti-crawler protection)"
}
# Usage
extract_twitter_content "https://x.com/user/status/123"
Scenario 2: Open Web (Regular Method First)
# Try regular method first
try_regular_add() {
local url="$1"
local result=$(notebooklm source add "$url" --notebook "$NOTEBOOK_ID" --json)
local status=$(echo "$result" | jq -r '.status // empty')
case "$status" in
READY|PROCESSING)
echo "✓ Regular method succeeded"
return 0
;;
FAILED|AUTH_REQUIRED)
echo "⚠️ Failed - trying browser extraction"
# Fall back to browser extraction
extract_content_via_browser "$url"
;;
*)
echo "❓ Unknown status, trying browser"
extract_content_via_browser "$url"
;;
esac
}
Scenario 3: Smart Batch Processing
# Process multiple URLs with intelligent method selection per URL
process_urls_intelligent() {
# First argument is the notebook ID; the remaining arguments are URLs
local notebook_id="$1"
shift
local urls=("$@")
for url in "${urls[@]}"; do
local platform=$(detect_platform_type "$url")
case "$platform" in
ANTICRAWLER_TWITTER|ANTICRAWLER_OTHER)
# Direct to browser (skip regular method)
extract_content_via_browser "$url"
;;
OPEN_WEB)
# Try regular first
try_regular_add "$url"
;;
esac
done
}
Research to Podcast (Interactive)
Time: 5-10 minutes total
1. `notebooklm create "Research: [topic]"` — if it fails: check auth with `notebooklm login`
2. `notebooklm source add` for each URL/document — if one fails: log a warning, continue with the others
3. Wait for sources: `notebooklm source list --json` until all status=READY — required before generation
4. `notebooklm generate audio "Focus on [specific angle]"` (confirm when asked) — if rate limited: wait 5 min, retry once
5. Note the artifact ID returned
6. Check `notebooklm artifact list` later for status
7. `notebooklm download audio ./podcast.mp3` when complete (confirm when asked)
Research to Podcast (Automated with Subagent)
Time: 5-10 minutes, but continues in background
When user wants full automation (generate and download when ready):
1. Create notebook and add sources as usual
2. Wait for sources to be ready (use `source wait` or check `source list --json`)
3. Run `notebooklm generate audio "..." --json` → parse `artifact_id` from the output
4. Spawn a background agent using the Task tool:
   Task(
     prompt="Wait for artifact {artifact_id} in notebook {notebook_id} to complete, then download.
       Use: notebooklm artifact wait {artifact_id} -n {notebook_id} --timeout 600
       Then: notebooklm download audio ./podcast.mp3 -a {artifact_id} -n {notebook_id}",
     subagent_type="general-purpose"
   )
5. Main conversation continues while the agent waits
Error handling in subagent:
- If `artifact wait` returns exit code 2 (timeout): report the timeout and suggest checking `artifact list`
- If the download fails: check that the artifact status is COMPLETED first
Benefits: Non-blocking, user can do other work, automatic download on completion
Document Analysis
Time: 1-2 minutes
1. `notebooklm create "Analysis: [project]"`
2. `notebooklm source add ./doc.pdf` (or URLs)
3. `notebooklm ask "Summarize the key points"`
4. `notebooklm ask "What are the main arguments?"`
5. Continue chatting as needed
Bulk Import with Progressive Fallback
Time: Varies by source count
1. `notebooklm create "Collection: [name]"`
2. Add multiple sources (progressive fallback applied automatically):
   # Always try direct addition first
   notebooklm source add "https://open-site.com/article" --notebook <id> --json
   # Monitor status: if FAILED, then use browser extraction

   # For anti-crawler URLs (fallback applied only if direct fails):
   # Twitter/X example
   notebooklm source add "https://x.com/user/status/123" --notebook <id> --json
   # If it returns FAILED:
   skill_mcp playwright browser_navigate --arguments '{"url": "https://x.com/user/status/123"}'
   skill_mcp playwright browser_run_code --arguments '{...extract code...}'
   echo "# Extracted content" > /tmp/twitter_content.md
   notebooklm source add /tmp/twitter_content.md --notebook <id>
3. `notebooklm source list` to verify the status of all sources
Source limits: max 50 sources per notebook
Supported types: PDFs, YouTube URLs, web URLs, Google Docs, text files
Bulk Import with Source Waiting (Subagent Pattern)
Time: Varies by source count
When adding multiple sources and needing to wait for processing before chat/generation:
1. Add sources with `--json` to capture IDs:
   notebooklm source add "https://url1.com" --json  # → {"source_id": "abc..."}
   notebooklm source add "https://url2.com" --json  # → {"source_id": "def..."}
2. Spawn a background agent to wait for all sources:
   Task(
     prompt="Wait for sources {source_ids} in notebook {notebook_id} to be ready.
       For each: notebooklm source wait {id} -n {notebook_id} --timeout 120
       Report when all ready or if any fail.",
     subagent_type="general-purpose"
   )
3. Main conversation continues while the agent waits
4. Once sources are ready, proceed with chat or generation
Why wait for sources? Sources must be indexed before chat or generation. Takes 10-60 seconds per source.
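The readiness check can be sketched as a small predicate over `source list --json` output, assuming `jq` is available; polling every 15-30 seconds between checks is reasonable:

```shell
# Succeeds only when every source in the JSON (read from stdin) is READY.
all_sources_ready() {
  jq -e '[.sources[].status] | all(. == "READY")' >/dev/null
}

# Polling loop sketch:
# until notebooklm source list --json | all_sources_ready; do sleep 20; done
```

For larger batches, `notebooklm source wait` per source (ideally in a subagent) avoids hand-rolled polling entirely.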
Mixed Sources with Browser Extraction
Time: 5-15 minutes
When working with mixed sources (some direct, some anti-crawler):
1. Create notebook: `notebooklm create "Mixed Collection"`
2. Add direct sources normally:
   notebooklm source add "https://open-site.com/article"
   notebooklm source add ./local-document.pdf
3. Handle protected sources with browser extraction:
   # Twitter/X
   skill_mcp playwright browser_navigate --arguments '{"url": "https://x.com/user/status/123"}'
   skill_mcp playwright browser_run_code --arguments '{"code": "async (page) => { await page.waitForTimeout(5000); return await page.evaluate(() => document.querySelector(\"article\")?.innerText); }"}'
   # Save output to /tmp/twitter_extracted.md

   # LinkedIn
   skill_mcp playwright browser_navigate --arguments '{"url": "https://linkedin.com/post/..."}'
   skill_mcp playwright browser_run_code --arguments '{...code for LinkedIn...}'
   # Save output to /tmp/linkedin_extracted.md

   # Add extracted files
   notebooklm source add /tmp/twitter_extracted.md --notebook <id>
   notebooklm source add /tmp/linkedin_extracted.md --notebook <id>
4. Wait for all sources: `notebooklm source list --json` until all READY
5. Generate content as usual
Benefits:
- Seamlessly handles both open and protected platforms
- All content indexed in one notebook
- Browser extraction runs transparently for the user

Deep Web Research (Subagent Pattern)
Time: 2-5 minutes, runs in background
Deep research finds and analyzes web sources on a topic:
1. Create notebook: `notebooklm create "Research: [topic]"`
2. Start deep research (non-blocking): `notebooklm source add-research "topic query" --mode deep --no-wait`
3. Spawn a background agent to wait and import:
   Task(
     prompt="Wait for research in notebook {notebook_id} to complete and import sources.
       Use: notebooklm research wait -n {notebook_id} --import-all --timeout 300
       Report how many sources were imported.",
     subagent_type="general-purpose"
   )
4. Main conversation continues while the agent waits
5. When the agent completes, sources are imported automatically
Alternative (blocking): For simple cases, omit --no-wait:
notebooklm source add-research "topic" --mode deep --import-all
# Blocks for up to 5 minutes
When to use each mode:
- `--mode fast`: specific topic, quick overview needed (5-10 sources, seconds)
- `--mode deep`: broad topic, comprehensive analysis needed (20+ sources, 2-5 min)
Research sources:
- `--from web`: search the web (default)
- `--from drive`: search Google Drive
Output Style
Progress updates: Brief status for each step
- "Creating notebook 'Research: AI'..."
- "Adding source: https://example.com..."
- "Starting audio generation... (task ID: abc123)"
Fire-and-forget for long operations:
- Start generation, return artifact ID immediately
- Do NOT poll or wait in main conversation - generation takes 5-45 minutes (see timing table)
- User checks status manually, OR use a subagent with `artifact wait`
JSON output: Use --json flag for machine-readable output:
notebooklm list --json
notebooklm source list --json
notebooklm artifact list --json
JSON schemas (key fields):
notebooklm list --json:
{"notebooks": [{"id": "...", "title": "...", "created_at": "..."}]}
notebooklm source list --json:
{"sources": [{"id": "...", "title": "...", "status": "READY|PROCESSING|FAILED"}]}
notebooklm artifact list --json:
{"artifacts": [{"id": "...", "title": "...", "type": "AUDIO_OVERVIEW", "status": "COMPLETED|PENDING|FAILED"}]}
Status values:
- Sources: PROCESSING → READY (or FAILED)
- Artifacts: PENDING → COMPLETED (or FAILED)
Error Handling
Progressive Fallback Strategy (Applied Automatically)
When regular source addition fails:
| Error Type | Detection | Fallback Action | User Notification |
|---|---|---|---|
| Anti-crawler (Twitter/X) | URL contains twitter.com, x.com | Browser extraction via Playwright | "⚠️ Required browser extraction (anti-crawler)" |
| Login required | Auth errors from source add | Browser extraction (if possible) or manual copy | "⚠️ Login required - using browser extraction" |
| 403/404 Access Denied | Network error | Browser extraction attempt | "⚠️ Access denied - trying browser method" |
| Timeout | Source processing timeout | Retry or browser extraction | "⏱️ Timeout - retrying..." |
Success cases:
| Method | Condition | User Notification |
|---|---|---|
| Direct URL addition | Most websites, no anti-crawler | "✓ Added via direct URL" |
| Browser extraction | Protected platforms only | "⚠️ Required browser extraction (anti-crawler)" |
Browser-based extraction handling (fallback only): When encountering protected platforms (Twitter/X, LinkedIn, etc.):
- Detection: identify URL patterns: twitter.com, x.com, linkedin.com, facebook.com
- Fallback method: use Playwright browser automation instead of direct URL addition
- Content extraction: use JavaScript selectors or `page.evaluate()` to extract text
- Save to file: write extracted content to `/tmp/` before adding to NotebookLM
- NotebookLM integration: add the saved file as a source
Selector patterns for common platforms:
| Platform | Primary selector | Fallback selector |
|---|---|---|
| Twitter/X | article, [data-testid="tweetText"] | body.innerText |
| LinkedIn | article, .feed-shared-article | body.innerText |
| Medium | article, [data-post-id] | body.innerText |
| Generic | article, main | body.innerText |
On failure, offer the user a choice:
- Retry the operation
- Skip and continue with something else
- Investigate the error
Error decision tree:
| Error | Cause | Action |
|---|---|---|
| Auth/cookie error | Session expired | Run notebooklm login |
| "No notebook context" | Context not set | Use -n <id> or --notebook <id> flag (parallel), or notebooklm use <id> (single-agent) |
| "No result found for RPC ID" | Rate limiting | Wait 5-10 min, retry |
| GENERATION_FAILED | Google rate limit | Wait and retry later |
| Download fails | Generation incomplete | Check artifact list for status |
| Invalid notebook/source ID | Wrong ID | Run notebooklm list to verify |
| RPC protocol error | Google changed APIs | May need CLI update |
Exit Codes
All commands use consistent exit codes:
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1 | Error (not found, processing failed) | Check stderr, see Error Handling |
| 2 | Timeout (wait commands only) | Extend timeout or check status manually |
Examples:
- `source wait` returns 1 if the source is not found or processing failed
- `artifact wait` returns 2 if the timeout is reached before completion
- `generate` returns 1 if rate limited (check stderr for details)
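A dispatch sketch over these exit codes; `handle_wait_exit` is a hypothetical wrapper and the messages are illustrative:

```shell
# Map a wait command's exit code to the documented follow-up action.
handle_wait_exit() {
  case "$1" in
    0) echo "completed" ;;
    2) echo "timeout: extend --timeout or check status manually" ;;
    *) echo "error: check stderr and the Error Handling table" ;;
  esac
}

# Usage sketch:
# notebooklm artifact wait "$artifact_id" -n "$notebook_id"; handle_wait_exit $?
```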
Known Limitations
Rate limiting: Audio, video, quiz, flashcards, infographic, and slides generation may fail due to Google's rate limits. This is an API limitation, not a bug.
Reliable operations: These always work:
- Notebooks (list, create, delete, rename)
- Sources (add, list, delete)
- Chat/queries
- Mind-map, study-guide, FAQ, data-table generation
Unreliable operations: These may fail with rate limiting:
- Audio (podcast) generation
- Video generation
- Quiz and flashcard generation
- Infographic and slides generation
Workaround: If generation fails:
- Check status: `notebooklm artifact list`
- Retry after 5-10 minutes
- Use the NotebookLM web UI as fallback
Processing times vary significantly. Use the subagent pattern for long operations:
| Operation | Typical time | Suggested timeout |
|---|---|---|
| Source processing | 30s - 10 min | 600s |
| Research (fast) | 30s - 2 min | 180s |
| Research (deep) | 15 - 30+ min | 1800s |
| Notes | instant | n/a |
| Mind-map | instant (sync) | n/a |
| Quiz, flashcards | 5 - 15 min | 900s |
| Report, data-table | 5 - 15 min | 900s |
| Audio generation | 10 - 20 min | 1200s |
| Video generation | 15 - 45 min | 2700s |
Polling intervals: When checking status manually, poll every 15-30 seconds to avoid excessive API calls.
Troubleshooting
notebooklm --help # Main commands
notebooklm notebook --help # Notebook management
notebooklm source --help # Source management
notebooklm research --help # Research status/wait
notebooklm generate --help # Content generation
notebooklm artifact --help # Artifact management
notebooklm download --help # Download content
Re-authenticate: notebooklm login
Check version: notebooklm --version
Update skill: notebooklm skill install