skills/agentlyhq/skills/google-drive-meeting-transcriber

google-drive-meeting-transcriber

SKILL.md

Meeting Notes Transcriber

Transform unstructured transcript text files inside meeting subfolders into clean, executive-quality meeting notes, written as Google Docs in the root folder.

Only processes new meeting folders — ones that don't already have a corresponding Google Doc in the root folder.

What This Does

Input: Root Drive folder (e.g., "Meetings") containing subfolders, one per meeting Output: One Google Doc per meeting in the root folder, titled YYYYMMDD [meeting name] Skip: Meetings with only audio/video (no transcript .txt), or meetings already processed (doc exists)

Example Structure

📁 Meetings/ (ROOT_FOLDER_ID)
  ├── 📁 2026-02-26 12.26.41 Amy & Charlie/
  │   └── zoom_transcript.txt
  ├── 📁 2026-03-01 14.30.00 Team Sync/
  │   └── transcript.txt
  └── 📄 20260226 Amy & Charlie.gdoc   ← created by this skill (skip next run)
  └── 📄 20260301 Team Sync.gdoc       ← created by this skill (skip next run)

Prerequisites

# Check if gws is installed
gws --version

# If not installed
npm install -g @googleworkspace/cli

# Authenticate
gws auth login

# Verify access
gws drive files list

Usage

Basic Command

User: "Transcribe meetings in folder [folder name or ID]"

With Options

User: "Transcribe meetings in 'Team Meetings' folder, overwrite existing docs"
User: "Transcribe meetings in 'Team Meetings', dry run"

Input Parameters

Required

  • ROOT_FOLDER_ID: Drive folder ID for the "Meetings" root folder
    • User can provide folder name (you search for it)
    • User can provide folder ID directly
    • User can provide folder URL (extract ID from it)

Optional

  • OVERWRITE (default: false)

    • If false (default), skip any meeting folder that already has a matching doc in the root folder
    • If true, re-generate and overwrite the existing doc content
  • DRY_RUN (default: false)

    • If true, do everything except creating/writing Docs
    • Useful for previewing what would be processed

Processing Workflow

Step 1: Get ROOT_FOLDER_ID

If user provides folder name, search for it, then look for the folder name inside the name variable and get the id:

gws drive files list \
  --format json

If user provides URL like https://drive.google.com/drive/folders/ABC123, extract ABC123.

Step 2: List All Meeting Subfolders

gws drive files list \
  --query "'1Yh-tq1Jl479y9dPREVXtBk2ZNZ4SMhOO' in parents and mimeType = 'application/vnd.google-apps.folder' and trashed = false" \
  --format json > /tmp/meeting_folders.json

echo "Found meeting folders:"
jq -r '.files[] | "\(.name) (ID: \(.id))"' /tmp/meeting_folders.json

Step 3: Get Existing Docs in Root Folder (for "new only" detection)

Before processing any folder, fetch the list of Google Docs already in the root folder:

gws drive files list \
  --query "'${ROOT_FOLDER_ID}' in parents and mimeType = 'application/vnd.google-apps.document' and trashed = false" \
  --format json > /tmp/existing_docs.json

# Build a list of existing doc titles (normalized, lowercase for comparison)
EXISTING_TITLES=$(jq -r '.files[].name' /tmp/existing_docs.json | tr '[:upper:]' '[:lower:]')

echo "Existing docs already processed:"
jq -r '.files[].name' /tmp/existing_docs.json

Step 4: Process Each Meeting Folder

For each meeting folder:

4a. Derive Expected Doc Title

MEETING_FOLDER_NAME="..."

# Extract date (YYYY-MM-DD prefix)
if [[ "$MEETING_FOLDER_NAME" =~ ^([0-9]{4}-[0-9]{2}-[0-9]{2}) ]]; then
  MEETING_DATE="${BASH_REMATCH[1]}"
  MEETING_DATE_FORMATTED=$(date -j -f "%Y-%m-%d" "$MEETING_DATE" +"%Y%m%d" 2>/dev/null || date -d "$MEETING_DATE" +"%Y%m%d")
else
  MEETING_DATE_FORMATTED=$(date +"%Y%m%d")
  echo "NOTE: No date prefix found in '$MEETING_FOLDER_NAME', using generation date"
fi

# Strip date and optional timestamp prefix to get the meeting name
MEETING_NAME=$(echo "$MEETING_FOLDER_NAME" | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} ([0-9]{2}\.[0-9]{2}\.[0-9]{2} )?//')

DOC_TITLE="${MEETING_DATE_FORMATTED} ${MEETING_NAME}"

4b. Skip If Already Processed (OVERWRITE=false)

DOC_TITLE_LOWER=$(echo "$DOC_TITLE" | tr '[:upper:]' '[:lower:]')

if echo "$EXISTING_TITLES" | grep -qF "$DOC_TITLE_LOWER"; then
  if [ "$OVERWRITE" = "true" ]; then
    echo "OVERWRITE: Re-processing $MEETING_FOLDER_NAME"
    # Get existing doc ID for overwrite
    EXISTING_DOC_ID=$(jq -r --arg title "$DOC_TITLE" \
      '.files[] | select(.name == $title) | .id' /tmp/existing_docs.json)
  else
    echo "SKIPPED_ALREADY_PROCESSED: $MEETING_FOLDER_NAME (doc '$DOC_TITLE' exists)"
    continue
  fi
fi

4c. List Files in Meeting Folder

MEETING_FOLDER_ID="..."

gws drive files list \
  --query "'${MEETING_FOLDER_ID}' in parents and trashed = false" \
  --format json > /tmp/meeting_files_${MEETING_FOLDER_ID}.json

4d. Select Transcript Files

Filter for .txt files only, excluding audio/video:

TRANSCRIPT_FILES=$(jq -r '.files[] |
  select(.mimeType == "text/plain" or (.name | endswith(".txt"))) |
  select(.mimeType | (startswith("video/") or startswith("audio/")) | not) |
  "\(.id)|\(.name)"' /tmp/meeting_files_${MEETING_FOLDER_ID}.json)

TRANSCRIPT_COUNT=$(echo "$TRANSCRIPT_FILES" | grep -c . || echo 0)

if [ "$TRANSCRIPT_COUNT" -eq 0 ]; then
  echo "SKIPPED_NO_TRANSCRIPT: $MEETING_FOLDER_NAME"
  continue
fi

4e. Download and Merge Transcripts

mkdir -p /tmp/transcripts/${MEETING_FOLDER_ID}

echo "$TRANSCRIPT_FILES" | while IFS='|' read FILE_ID FILE_NAME; do
  echo "Downloading: $FILE_NAME"
  gws drive files export \
    --file-id "$FILE_ID" \
    --mime-type "text/plain" \
    --output-file "/tmp/transcripts/${MEETING_FOLDER_ID}/${FILE_NAME}"
done

# Merge all transcripts in alphabetical order
# If multiple files, separate them clearly
MERGED_TRANSCRIPT=""
FILE_COUNT=0
for file in $(ls /tmp/transcripts/${MEETING_FOLDER_ID}/*.txt 2>/dev/null | sort); do
  FILE_COUNT=$((FILE_COUNT + 1))
  if [ $FILE_COUNT -gt 1 ]; then
    FILENAME=$(basename "$file")
    MERGED_TRANSCRIPT="${MERGED_TRANSCRIPT}\n\n--- CONTINUED: ${FILENAME} ---\n\n"
  fi
  MERGED_TRANSCRIPT="${MERGED_TRANSCRIPT}$(cat "$file")"
done

echo -e "$MERGED_TRANSCRIPT" > /tmp/merged_transcript_${MEETING_FOLDER_ID}.txt

CHAR_COUNT=$(wc -c < /tmp/merged_transcript_${MEETING_FOLDER_ID}.txt)
if [ "$CHAR_COUNT" -lt 100 ]; then
  echo "SKIPPED_EMPTY_TRANSCRIPT: $MEETING_FOLDER_NAME (only $CHAR_COUNT chars)"
  continue
fi

4f. Generate Meeting Notes

Read the full merged transcript and generate structured notes. Claude must produce output in exactly this format — no additions, no reordering, no section renaming:

MMM D, YYYY | <Host> <> <Counterparty> - <Firm/Context>
Attendees: <Name1>  <Name2>  <Name3>

**Meeting Purpose**
<One-line summary of what this meeting is about.>

**Key Takeaways**
- <Key insight, decision, or conclusion — max 5 points>
- ...

**Discussion**
<Topic Name>
- <Key point attributed to speaker if possible>
- <Decisions, concerns, risks raised>

<Another Topic>
- ...

**Next Steps & Action Items**
- **<Owner>**
  - <Verb + task + context> — Due: <d MMM yy, rough quarter, or leave blank>
- Or, none captured explicitly.

Strict rules for Claude:

  • Use title case, not ALL CAPS, for all section headings
  • Use bullet points, not numbered lists
  • Do NOT invent facts. If uncertain, keep it general or omit
  • Extract attendees from speaker labels in transcript (e.g. "John:", "Speaker 1:")
  • The header line (MMM D, YYYY | ...) uses the meeting date from the folder name, not today's date
  • Summarize — do NOT transcribe verbatim
  • Always include Next Steps & Action Items (even if "none captured explicitly")
  • Always include Discussion with topics grouped into sub-sections

4g. Create or Overwrite Google Doc

Important: Create the doc directly inside the root folder to avoid a separate move step (which can create duplicate entries).

if [ "$DRY_RUN" = "false" ]; then
  if [ "$OVERWRITE" = "true" ] && [ -n "$EXISTING_DOC_ID" ]; then
    # Overwrite: clear existing doc content and rewrite
    DOC_ID="$EXISTING_DOC_ID"
    echo "Overwriting existing doc: $DOC_TITLE (ID: $DOC_ID)"
    # Clear content first, then write
    gws docs documents batchUpdate --document-id "$DOC_ID" \
      --json '{"requests":[{"deleteContentRange":{"range":{"startIndex":1,"endIndex":99999}}}]}'
  else
    # Create new doc — specify parent folder at creation time to avoid duplicate placement
    DOC_JSON=$(gws docs documents create \
      --json "{\"title\":\"${DOC_TITLE}\"}" \
      --parents "${ROOT_FOLDER_ID}")
    DOC_ID=$(echo "$DOC_JSON" | jq -r '.documentId')
    echo "Created doc: $DOC_TITLE (ID: $DOC_ID)"
  fi

  # Write meeting notes content to the doc
  gws docs documents batchUpdate --document-id "$DOC_ID" \
    --json "{\"requests\":[{\"insertText\":{\"location\":{\"index\":1},\"text\":$(echo "$NOTES_CONTENT" | jq -Rs .)}}]}"

  echo "PROCESSED: $MEETING_FOLDER_NAME$DOC_TITLE (ID: $DOC_ID)"
else
  echo "DRY_RUN: Would process $MEETING_FOLDER_NAME$DOC_TITLE"
fi

Note on --parents flag: If your version of gws does not support --parents on docs documents create, fall back to creating the doc then immediately moving it:

DOC_JSON=$(gws docs documents create --json "{\"title\":\"${DOC_TITLE}\"}")
DOC_ID=$(echo "$DOC_JSON" | jq -r '.documentId')
CURRENT_PARENTS=$(gws drive files get --file-id "$DOC_ID" --fields 'parents' | jq -r '.parents | join(",")')
gws drive files update \
  --file-id "$DOC_ID" \
  --add-parents "${ROOT_FOLDER_ID}" \
  --remove-parents "${CURRENT_PARENTS}"

Verify the move succeeded before writing content. If the file still appears in the original location, it means removeParents did not execute correctly — retry the update.

Step 5: Summary Report

echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "TRANSCRIPTION COMPLETE"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Total folders scanned:         $TOTAL_FOLDERS"
echo "Already processed (skipped):   $SKIPPED_EXISTING_COUNT"
echo "Processed (new):               $PROCESSED_COUNT"
echo "Skipped (no transcript):       $SKIPPED_NO_TRANSCRIPT_COUNT"
echo "Skipped (empty transcript):    $SKIPPED_EMPTY_COUNT"
echo "Errors:                        $ERROR_COUNT"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

Skip Rules

Files to Ignore (Not Transcripts)

Video formats: .mp4, .mov, .mkv, .webm, .avi — or mimeType starting with video/ Audio formats: .m4a, .mp3, .wav, .aac, .flac, .ogg — or mimeType starting with audio/

When to Skip a Meeting Folder

  • SKIPPED_ALREADY_PROCESSED: A Google Doc with the expected title already exists in the root folder (and OVERWRITE=false)
  • SKIPPED_NO_TRANSCRIPT: No .txt files found in the meeting subfolder
  • SKIPPED_EMPTY_TRANSCRIPT: .txt found but total content < 100 characters

Title & Naming Rules

Date Format: YYYYMMDD

Parse date from folder name:

  1. If folder starts with YYYY-MM-DD, use that date → convert to YYYYMMDD
  2. If no date prefix: try to infer from transcript content; fallback to today's date and add note

Meeting Name

Strip these prefixes from folder name:

  • YYYY-MM-DD
  • YYYY-MM-DD HH.MM.SS
  • YYYY-MM-DD HH:MM:SS

Remaining text becomes the meeting name.

Doc Title Format

{YYYYMMDD} {meeting name}

Examples:

  • Folder: 2026-02-26 12.26.41 Amy & Charlie → Doc: 20260226 Amy & Charlie
  • Folder: 2026-03-01 14.30.00 Team Sync → Doc: 20260301 Team Sync

Meeting Notes Format (Strict)

The generated content must follow this exact layout — one Google Doc per meeting, one output file only:

MMM D, YYYY | <Host> <> <Counterparty> - <Firm/Context>
Attendees: <Name1>  <Name2>  <Name3>

**Meeting Purpose**
<One-line summary.>

**Key Takeaways**
- <Point — max 5>
- ...

**Discussion**
<Topic>
- <Who said what, decisions, concerns>

<Another Topic>
- ...

**Next Steps & Action Items**
- **<Owner>**
  - <Verb + task + context> — Due: <d MMM yy or blank>
- Or, none captured explicitly.

One doc per meeting. No separate summary file, no duplicate.


Error Handling

Logging Per Folder

Emit exactly one of:

  • PROCESSED — meeting folder name, DOC_ID of created/updated doc
  • SKIPPED_ALREADY_PROCESSED — folder already has a matching doc
  • SKIPPED_NO_TRANSCRIPT — folder has no .txt
  • SKIPPED_EMPTY_TRANSCRIPT.txt found but unusable
  • ERROR_GENERATION — notes generation failed
  • ERROR_DOC_CREATE — doc creation failed
  • ERROR_DOC_WRITE — doc writing failed (include DOC_ID for cleanup)

Data Safety

  • Never move, rename, or delete files in meeting subfolders
  • Never modify original transcript files
  • Only create/update docs in root folder
  • If any step fails, skip that meeting and continue to next
  • Never produce more than one output file per meeting

Complete Example

Input Structure

📁 Meetings/ (ROOT_FOLDER_ID: abc123)
  ├── 📁 2026-02-26 12.26.41 Amy & Charlie/   ← new, no doc yet
  │   ├── zoom_transcript.txt
  │   └── recording.mp4 (ignored)
  └── 📄 20260301 Team Sync.gdoc              ← already processed, skip

Processing Steps

# 1. List existing docs in root → find "20260301 Team Sync" already exists
# 2. List meeting subfolders → find "2026-02-26 12.26.41 Amy & Charlie"
# 3. Derive expected title → "20260226 Amy & Charlie" → not in existing docs → process it

# 4. List files in subfolder
$ gws drive files list --query "'folder_id' in parents"
# Found: zoom_transcript.txt (text/plain) ✓
# Found: recording.mp4 (ignored — video) ✗

# 5. Download transcript
$ gws drive files export --file-id "transcript_id" --mime-type "text/plain" --output-file "/tmp/transcript.txt"

# 6. Generate notes (Claude analyzes full transcript content)

# 7. Create Google Doc directly in root folder
$ gws docs documents create --json '{"title":"20260226 Amy & Charlie"}' --parents "abc123"
# Returns: {"documentId": "doc456"}

# 8. Write content to doc
$ gws docs documents batchUpdate --document-id "doc456" --json '...'

✓ PROCESSED: 2026-02-26 12.26.41 Amy & Charlie → 20260226 Amy & Charlie (ID: doc456)
✓ SKIPPED_ALREADY_PROCESSED: 20260301 Team Sync (doc exists)

Output Google Doc

Title: 20260226 Amy & Charlie

Content (one doc, exactly this format):

Feb 26, 2026 | Amy <> Charlie - Initial Discussion
Attendees: Amy  Charlie

**Meeting Purpose**
Initial conversation about potential collaboration on Q2 project.

**Key Takeaways**
- Both parties interested in moving forward with a partnership
- Budget range: $50K–$100K for Q2
- Timeline: decision by mid-March
- Legal teams need to be involved for contract review
- Technical feasibility confirmed

**Discussion**

Product Scope
- Charlie outlined current platform capabilities and integration options
- Amy shared integration requirements; both agreed on a phased MVP approach

Budget & Timeline
- Amy: budget allocated for Q2, prefers April start
- Charlie: team availability aligns with April; milestone-based payment preferred

Technical Considerations
- API integration estimated at 2–3 weeks
- Dedicated staging environment required
- Charlie to provide technical documentation post-call

**Next Steps & Action Items**
- **Charlie**
  - Send technical documentation and API specs — Due: 5 Mar 26
  - Provide cost breakdown for phased approach — Due: 8 Mar 26
- **Amy**
  - Review documentation with internal team — Due: 12 Mar 26
  - Schedule legal review meeting — Due: mid-March
  - Confirm Q2 budget allocation — Due: 15 Mar 26

CLI Command Reference

# List folders/files
gws drive files list --query "<query>" --format json

# Search for files/folders
gws drive files search --query "<query>" --format json

# Get file metadata
gws drive files get --file-id "<id>" --fields 'parents'

# Export file content (transcripts)
gws drive files export --file-id "<id>" --mime-type "text/plain" --output-file "<path>"

# Move file to folder (update parents)
gws drive files update --file-id "<id>" --add-parents "<folder-id>" --remove-parents "<old-parent-id>"

# Create Google Doc (prefer with --parents to avoid duplicate placement)
gws docs documents create --json '{"title":"<title>"}' --parents "<folder-id>"

# Write to Google Doc via batchUpdate
gws docs documents batchUpdate --document-id "<id>" --json '<requests json>'

# Check schema for exact flags for your gws version
gws schema drive.files.list
gws schema docs.documents.create
gws schema docs.documents.batchUpdate

Troubleshooting

"gws command not found"

npm install -g @googleworkspace/cli
gws auth login

Two docs appearing per meeting

  • The --parents flag on docs documents create may not be supported by your gws version
  • Use the fallback move approach (Step 4g) and verify removeParents executed correctly
  • Check: gws drive files get --file-id "<doc-id>" --fields 'parents' — should show only one parent

"No transcripts found" but files exist

  • Files must be .txt with mimeType = text/plain
  • Check with: gws drive files list --query "'<folder_id>' in parents" --format json | jq '.files[] | {name, mimeType}'

All folders show as "already processed"

  • This is correct behavior on re-runs — only new folders without matching docs are processed
  • Use OVERWRITE=true to force re-processing all meetings

Doc creation works but writing fails

  • Check gws schema docs.documents.batchUpdate for supported request types
  • The insertText request requires content starting at index 1 (not 0)

Minimal Required Capabilities

  • Drive: list files/folders, get file metadata, export file content, update file parents
  • Docs: create document, batchUpdate (insertText, deleteContentRange)
gws schema drive.files.list
gws schema drive.files.get
gws schema drive.files.export
gws schema drive.files.update
gws schema docs.documents.create
gws schema docs.documents.batchUpdate
Weekly Installs
1
GitHub Stars
2
First Seen
4 days ago
Installed on
amp1
cline1
openclaw1
opencode1
cursor1
kimi-cli1