transcribe-audio
Audio Transcription
Transcribe audio files to markdown and support post-processing (Q&A, action items, summaries).
Workflow
1. Identify Audio Files
Find audio files matching the user's request:
- Single file: user specifies path directly
- Batch:
find <dir> -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.wav" -o -name "*.ogg" -o -name "*.m4a" -o -name "*.flac" -o -name "*.webm" \) | sort
2. Check for Existing Transcripts
For each audio file, check if a sibling .md file exists (e.g. meeting.mp3 → meeting.md):
- Exists + user wants transcription: Ask whether to re-transcribe or use existing
- Exists + user wants analysis: Read the existing .md directly; no need to transcribe
- Does not exist: Proceed with transcription
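The sibling check above can be sketched as two small shell helpers. The function names `transcript_path` and `has_transcript` are hypothetical and not part of the skill's scripts; the sketch assumes every audio path ends in a file extension.

```shell
# Derive the sibling transcript path for an audio file (meeting.mp3 -> meeting.md).
transcript_path() {
  printf '%s.md\n' "${1%.*}"
}

# Succeed only if the sibling transcript already exists as a regular file.
has_transcript() {
  [ -f "$(transcript_path "$1")" ]
}
```

A caller might branch on `has_transcript "$file"` to decide between reading the existing transcript and running a new transcription.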
3. Transcribe
Run the script for each file:
./scripts/transcribe.sh <audio-file> [custom-prompt] > <output.md>
- Output file: same name as audio, with .md extension, same directory
- Default prompt handles speaker identification, timestamps, summary, action items
- Pass a custom prompt as second argument when the user requests different output or a focused transcription (see below)
The script outputs the transcript to stdout and progress to stderr. Capture stdout to the .md file.
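For batch runs, the per-file step could be wrapped roughly as follows. This is a sketch under assumptions: the `TRANSCRIBE` override variable, the helper names `transcribe_one` and `transcribe_dir`, and the `.partial` temporary suffix are all introduced here for illustration. It also silently skips files that already have a transcript, which is a conservative default; the workflow above asks the user whether to re-transcribe.

```shell
# Path to the transcription script. The TRANSCRIBE override is an assumption
# added for illustration; the skill itself only documents ./scripts/transcribe.sh.
TRANSCRIBE="${TRANSCRIBE:-./scripts/transcribe.sh}"

# Transcribe one audio file into its sibling .md (hypothetical helper).
transcribe_one() {
  t_out="${1%.*}.md"
  if [ -f "$t_out" ]; then
    echo "skip: $t_out already exists" >&2
    return 0
  fi
  # Capture stdout in a temporary file so a failed run leaves no partial transcript.
  if "$TRANSCRIBE" "$1" > "$t_out.partial"; then
    mv "$t_out.partial" "$t_out"
  else
    rm -f "$t_out.partial"
    echo "failed: $1" >&2
    return 1
  fi
}

# Transcribe every supported audio file directly inside a directory.
transcribe_dir() {
  find "$1" -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.wav" -o -name "*.ogg" \
    -o -name "*.m4a" -o -name "*.flac" -o -name "*.webm" \) | sort |
  while IFS= read -r t_file; do
    # Redirect stdin so the script cannot consume the file list.
    transcribe_one "$t_file" < /dev/null || true
  done
}
```

Calling `transcribe_dir <dir>` from the skill directory would then process each file in sorted order and continue past individual failures.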
Focused Transcription
When the user asks about a specific topic (e.g. "tell me about the Miro discussion", "what was said about budgets?"), pass a focused prompt as the second argument instead of producing a full transcription and then grepping or reading through it:
./scripts/transcribe.sh <audio-file> "Focus on the parts of this audio that discuss <TOPIC>. Provide:
1. A detailed transcript of just those sections (with speaker labels and timestamps)
2. A summary of what was said about <TOPIC>
3. Any decisions, action items, or open questions related to <TOPIC>
Skip unrelated parts of the audio." > <output-focus.md>
- Output file for focused transcripts: use a suffix to avoid overwriting the full transcript, e.g. meeting.focus-miro.md
- When to use: The user asks about a specific topic AND there is no existing full transcript to search, OR the user explicitly asks to re-transcribe with a focus
- When NOT to use: A full transcript already exists — just read it and answer the question directly
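One way to derive the suffixed output name from the audio path and the topic is sketched below. The helper name `focus_output_path` and the slug rule (lowercase, with non-alphanumeric runs collapsed to a single hyphen) are assumptions; the source only gives `meeting.focus-miro.md` as an example.

```shell
# Build the focused-transcript path from an audio path and a topic.
# Example mapping: meeting.mp3 + "Miro" -> meeting.focus-miro.md
focus_output_path() {
  # Lowercase the topic, turn every run of other characters into one hyphen,
  # then trim a leading or trailing hyphen.
  f_slug=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//')
  printf '%s.focus-%s.md\n' "${1%.*}" "$f_slug"
}
```

The result can be used as the redirect target, for example `> "$(focus_output_path meeting.mp3 "Miro")"`, so the full transcript in meeting.md is left untouched.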
4. Post-Processing
After transcription (or when an existing transcript is available), support any follow-up:
- Read the .md file and answer questions about the content
- Extract action items or TODOs
- Provide additional summaries or analysis
- Compare across multiple transcripts
Key Details
- Supported formats: .mp3, .wav, .ogg, .m4a, .flac, .webm
- API: Gemini via Portkey (key from pass api/portkey-claude)
- Timeout: 600s per file; long recordings take time
- Max file size: 200MB per file
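The format and size limits above could be checked before a file is sent for transcription. The following is a minimal sketch: the helper names are hypothetical, extension matching is case-sensitive (like the `find` patterns in step 1), and 200MB is interpreted as 200 × 1024 × 1024 bytes, which is an assumption about how the limit is measured.

```shell
# Size limit in bytes; treating 200MB as 200 MiB is an assumption.
MAX_BYTES=$((200 * 1024 * 1024))

# Succeed if the file name ends in one of the supported extensions.
is_supported_format() {
  case "$1" in
    *.mp3|*.wav|*.ogg|*.m4a|*.flac|*.webm) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the file exists and is no larger than MAX_BYTES.
within_size_limit() {
  [ -f "$1" ] || return 1
  # tr strips the padding some wc implementations add around the count.
  s_bytes=$(wc -c < "$1" | tr -d ' ')
  [ "$s_bytes" -le "$MAX_BYTES" ]
}
```

Running both checks up front avoids spending part of the 600s timeout on a file that would be rejected anyway.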
Script Execution: Scripts should be executed from the skill directory. All scripts use Nix shebangs so no manual dependency installation is required.