subtitle-correction
Subtitle Correction Skill
This skill corrects speech recognition errors in subtitle files while strictly preserving timeline information.
Interactive Workflow
Step 1: Request Terminology from User
IMPORTANT: Before starting any correction, ALWAYS ask the user for domain-specific terms.
Prompt the user with:
在开始修正之前,请提供一些关键术语,帮助我更准确地识别和修正语音识别错误:
1. **专有名词**:人名、品牌名、产品名等(如:Anthropic、Claude)
2. **技术术语**:框架、库、工具名称(如:LangChain、OpenAI、PyTorch)
3. **领域词汇**:行业特定词汇(如:checkpointer、middleware、runtime)
4. **其他关键词**:视频中频繁出现的重要词汇
请用逗号分隔,例如:`LangChain, Agent, OpenAI, checkpointer`
For English users:
Before I begin correction, please provide key terms to help me accurately identify speech recognition errors:
1. **Proper nouns**: Names, brands, products (e.g., Anthropic, Claude)
2. **Technical terms**: Frameworks, libraries, tools (e.g., LangChain, PyTorch)
3. **Domain vocabulary**: Industry-specific terms (e.g., checkpointer, middleware)
4. **Other keywords**: Important words that appear frequently
Please separate with commas, e.g.: `LangChain, Agent, OpenAI, checkpointer`
Step 2: Confirm Understanding
After receiving terms, confirm by:
- Listing the terms received
- Identifying the likely domain/context (AI/ML tutorial, web dev, etc.)
- Asking if there are any additional terms before proceeding
Example response:
收到以下术语:
- 技术框架:LangChain, LangGraph, OpenAI
- 技术概念:Agent, checkpointer, runtime
看起来这是一个 LangChain 智能体开发的教程视频。
还有其他需要补充的术语吗?如果没有,我将开始修正字幕。
Step 3: Process with Terms
Use the provided terms to:
- Build a mental model of expected vocabulary
- Identify likely speech recognition errors
- Apply consistent corrections throughout
When User Doesn't Provide Terms
If the user declines to provide terms (for example "没有" / "no" / "直接开始"):
- Proceed with correction using built-in patterns
- Flag uncertain corrections for user review
- After completion, ask if any terms were missed
Core Workflow
- Read the subtitle file - Load the .srt file provided by the user
- Identify error patterns - Recognize common speech recognition mistakes
- Apply corrections - Fix errors while preserving timestamps exactly
- Output corrected file - Return or save based on user's context
Strict Rules
Timeline Preservation
- NEVER modify timestamps - Keep all `00:00:00,000 --> 00:00:00,000` lines exactly as-is
- NEVER change subtitle numbering - Preserve sequence numbers
- NEVER merge or split subtitle entries - One-to-one correspondence
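One way to make these rules hard to break is to parse the file into entries and let corrections touch only the text lines. The following is a minimal sketch under that assumption; `Entry`, `parse_srt`, `correct_text_only`, and `serialize_srt` are hypothetical names for illustration and are not part of `scripts/subtitle_tool.py`.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

TIMESTAMP_RE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


@dataclass
class Entry:
    index: str        # sequence number, kept as the original string
    timestamp: str    # full "start --> end" line, never edited
    text: List[str]   # subtitle text lines, the only editable part


def parse_srt(content: str) -> List[Entry]:
    """Split SRT content into entries (blocks separated by blank lines)."""
    normalized = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    entries = []
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # A valid block has a number line followed by a timestamp line.
        if len(lines) >= 2 and TIMESTAMP_RE.match(lines[1]):
            entries.append(Entry(lines[0].strip(), lines[1], lines[2:]))
    return entries


def correct_text_only(entries: List[Entry], fix: Callable[[str], str]) -> List[Entry]:
    """Apply `fix` to text lines; index and timestamp are copied untouched."""
    return [Entry(e.index, e.timestamp, [fix(line) for line in e.text]) for e in entries]


def serialize_srt(entries: List[Entry]) -> str:
    """Rebuild SRT content with one blank line between entries."""
    return "\n\n".join("\n".join([e.index, e.timestamp, *e.text]) for e in entries) + "\n"
```

Because the correction function only ever receives text lines, an edit cannot reach a timestamp or sequence number, and the entry count stays one-to-one by construction.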
Error Categories
1. Phonetic Errors (同音字/谐音错误)
Common in Chinese speech recognition:
- 会话 ↔ 绘画 (huìhuà)
- 元数据 ↔ 源数据 (yuán shùjù)
- 本课 ↔ 本科 (běnkè / běnkē, near-homophones differing only in tone)
- 示例 ↔ 事例 (shìlì)
- 实践 ↔ 时间 (shíjiàn / shíjiān, near-homophones differing only in tone)
2. Technical Term Errors
Speech recognition often fails on:
- Framework names: LangChain, LangGraph, OpenAI, PyTorch, TensorFlow
- Programming terms: API, SDK, runtime, checkpointer, middleware
- Code identifiers: snake_case names, function names, class names
3. English-Chinese Mixed Content
- Luncheon/lunch → LangChain
- open EI/open Email → OpenAI
- land GRAPH → LangGraph
- a memory Server → MemorySaver
4. Code-Related Terms
Convert spoken descriptions to proper format:
- "underscore" → "_" in variable names
- "dot" → "." in method calls
- Recognize camelCase, snake_case, PascalCase patterns
User-Provided Terminology
When users provide a terminology list, use it as the primary reference for corrections:
用户提供的术语:LangChain,Agent,OpenAI,LangGraph
These terms indicate:
- Expected proper spellings of technical terms
- Context about the content domain
- Hints for identifying speech recognition errors
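Users may separate terms with ASCII commas or with full-width punctuation, as in the example above. The sketch below shows one way to normalize such a reply into a clean list; `parse_terms` is a hypothetical helper, and de-duplicating case-insensitively is an assumption rather than a stated requirement.

```python
import re
from typing import List


def parse_terms(reply: str) -> List[str]:
    """Split a comma-separated reply into unique terms, keeping first-seen order."""
    # Accept ASCII commas, full-width commas, and the Chinese enumeration comma.
    parts = re.split(r"[,,、]", reply)
    terms: List[str] = []
    seen = set()
    for part in parts:
        term = part.strip()
        key = term.casefold()  # assumption: "langchain" and "LangChain" are the same term
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms
```

The first spelling the user gives is kept as the canonical form, which matches the idea of treating the list as the primary reference.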
Processing Strategy
For Long Files (>200 lines)
- Process in chunks using the `view_range` parameter
- Maintain context across chunks
- Build complete corrected file incrementally
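When chunking, it helps to cut only at blank lines so that no subtitle entry is split across two chunks. The sketch below computes such line ranges; `chunk_ranges` is a hypothetical helper, and it assumes the ranges are 1-based and inclusive, which may or may not match how `view_range` is interpreted in a given environment.

```python
from typing import List, Tuple


def chunk_ranges(lines: List[str], max_lines: int = 200) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) line ranges that end on entry boundaries."""
    ranges: List[Tuple[int, int]] = []
    start = 1
    last_boundary = 0  # line number of the most recent blank line seen
    for number, line in enumerate(lines, start=1):
        if line.strip() == "":
            last_boundary = number
        # Once the chunk is full, cut at the last blank line inside it.
        if number - start + 1 >= max_lines and last_boundary >= start:
            ranges.append((start, last_boundary))
            start = last_boundary + 1
    if start <= len(lines):
        ranges.append((start, len(lines)))  # remaining tail
    return ranges
```

A chunk can exceed `max_lines` only when a single entry is itself longer than the limit, in which case the cut is deferred to the next blank line.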
For Technical Content
- Identify the domain (AI/ML, web dev, etc.)
- Build mental model of expected terminology
- Apply domain-specific corrections consistently
Quality Checks
Before outputting:
- Verify all timestamps unchanged
- Verify subtitle count unchanged
- Check terminology consistency throughout
- Ensure no orphaned corrections (partial fixes)
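The terminology consistency check can be partly automated by looking for differently-cased spellings of each known term. This is a minimal sketch; `find_inconsistent_terms` is a hypothetical helper, and it only catches case variants, not phonetic misrecognitions.

```python
import re
from typing import Dict, List


def find_inconsistent_terms(text: str, terms: List[str]) -> Dict[str, List[str]]:
    """Map each canonical term to the differently-cased variants found in text."""
    report: Dict[str, List[str]] = {}
    for term in terms:
        # ASCII lookarounds are used instead of \b because CJK characters
        # count as word characters, so \b fails between "习" and "L".
        pattern = re.compile(
            r"(?<![A-Za-z0-9_])" + re.escape(term) + r"(?![A-Za-z0-9_])",
            re.IGNORECASE,
        )
        variants = sorted({m.group(0) for m in pattern.finditer(text) if m.group(0) != term})
        if variants:
            report[term] = variants
    return report
```

An empty result means every occurrence of the listed terms already uses the canonical spelling.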
Common Correction Patterns
Chinese AI/ML Course Content
| Error | Correction | Context |
|---|---|---|
| 蓝犬/蓝卷/Lantern | LangChain | Framework name |
| 绘画 | 会话 | Session/conversation |
| 拖/tour | tool | Tool concept |
| checkpoint组件 | checkpointer组件 | Memory component |
| 源数据 | 元数据 | Metadata |
| 大约模型 | 大模型 | Large model |
| 中间键 | 中间件 | Middleware |
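Rows of this table that are safe to apply without looking at context can be expressed as a simple replacement list. The sketch below is illustrative only: `CORRECTIONS` and `apply_corrections` are hypothetical names, and the 绘画 → 会话 and 拖/tour → tool rows are deliberately left out because 绘画 and 拖 are legitimate words whose correction depends on the sentence.

```python
from typing import List, Tuple

# (wrong, right) pairs taken from the table above; context-dependent rows are omitted.
CORRECTIONS: List[Tuple[str, str]] = [
    ("蓝犬", "LangChain"),
    ("蓝卷", "LangChain"),
    ("Lantern", "LangChain"),
    ("checkpoint组件", "checkpointer组件"),
    ("源数据", "元数据"),
    ("大约模型", "大模型"),
    ("中间键", "中间件"),
]


def apply_corrections(
    text: str, corrections: List[Tuple[str, str]] = CORRECTIONS
) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the corrected text and the (wrong, right) pairs that were applied."""
    applied: List[Tuple[str, str]] = []
    for wrong, right in corrections:
        if wrong in text:
            text = text.replace(wrong, right)
            applied.append((wrong, right))
    return text, applied
```

Returning the applied pairs makes it straightforward to build the categorized summary of changes and to flag anything that needs user review.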
Code Identifiers
| Spoken | Written |
|---|---|
| user underscore 001 | user_001 |
| thread underscore id | thread_id |
| create underscore agent | create_agent |
| runtime dot state | runtime.state |
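The conversions in this table follow a mechanical rule, which can be sketched as below. `spoken_to_identifier` is a hypothetical helper and should only be applied to phrases already recognized as code identifiers, since "dot" and "underscore" can also appear as ordinary words.

```python
import re


def spoken_to_identifier(phrase: str) -> str:
    """Replace spoken 'underscore' and 'dot' with '_' and '.', joining the parts."""
    result = re.sub(r"\s+underscore\s+", "_", phrase.strip(), flags=re.IGNORECASE)
    result = re.sub(r"\s+dot\s+", ".", result, flags=re.IGNORECASE)
    return result
```

Each table row maps directly, for example `spoken_to_identifier("thread underscore id")` gives `thread_id`.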
Output Format
When saving, use the `-corrected` suffix:
- Input: `filename.srt`
- Output: `filename-corrected.srt`
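The naming rule can be derived from the input path with `pathlib`. This is a small sketch; `corrected_path` is a hypothetical helper name.

```python
from pathlib import Path


def corrected_path(input_path: str) -> Path:
    """Insert '-corrected' before the file extension, keeping the directory."""
    path = Path(input_path)
    return path.with_name(f"{path.stem}-corrected{path.suffix}")
```

Only the final extension is treated as the suffix, so a name such as `lesson.01.srt` becomes `lesson.01-corrected.srt`.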
Validation Script
Use scripts/subtitle_tool.py to validate and analyze subtitle files:
# Validate corrected file preserves structure
python scripts/subtitle_tool.py validate original.srt corrected.srt
# Show word-level diff with colored output (default, changes only)
python scripts/subtitle_tool.py diff original.srt corrected.srt
# Show ALL entries (changed and unchanged) in terminal
python scripts/subtitle_tool.py diff original.srt corrected.srt --all
# Generate HTML diff report (recommended for review)
python scripts/subtitle_tool.py diff original.srt corrected.srt --html report.html
# Show simple line-based diff (original/corrected lines)
python scripts/subtitle_tool.py diff original.srt corrected.srt --simple
# Disable colors for piping to files
python scripts/subtitle_tool.py diff original.srt corrected.srt --no-color
# Analyze file for potential speech recognition errors
python scripts/subtitle_tool.py analyze input.srt --terms "LangChain,OpenAI"
Diff Output Formats
Terminal Output (Default)
Shows word-level changes with colors:
[1] 00:00:01,500 --> 00:00:04,500
今天我们来学习[-Luncheon-]{+LangChain+}框架
[3] 00:00:08,000 --> 00:00:12,500
我们可以用它来创建[-绘画-]{+会话+}应用程序
- `[-deleted-]` - Text removed (shown in red in terminal)
- `{+added+}` - Text added (shown in green in terminal)
- Use `--all` to show unchanged entries too
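For readers who want to reproduce this markup outside the tool, the sketch below approximates it with the standard library's `difflib`. It is not the implementation used by `scripts/subtitle_tool.py`; `tokenize` and `inline_diff` are hypothetical names, and the tokenization rule (ASCII words kept whole, every other character on its own) is an assumption chosen so that mixed Chinese and English text diffs cleanly. Colors are omitted.

```python
import re
from difflib import SequenceMatcher
from typing import List


def tokenize(text: str) -> List[str]:
    """ASCII word runs stay whole; any other character is its own token."""
    return re.findall(r"[A-Za-z0-9_]+|.", text, flags=re.DOTALL)


def inline_diff(original: str, corrected: str) -> str:
    """Render changes as [-removed-]{+added+} around unchanged text."""
    a, b = tokenize(original), tokenize(corrected)
    out: List[str] = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append("".join(a[i1:i2]))
            continue
        if i2 > i1:  # tokens present only in the original
            out.append("[-" + "".join(a[i1:i2]) + "-]")
        if j2 > j1:  # tokens present only in the corrected text
            out.append("{+" + "".join(b[j1:j2]) + "+}")
    return "".join(out)
```

Keeping English words as single tokens avoids fragmented output such as partial-letter edits inside a framework name.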
HTML Report (Recommended for Review)
Generates a full comparison report with:
- All entries displayed (changed and unchanged)
- Side-by-side columns: Original | Corrected | Diff View
- Visual highlighting: Red strikethrough for deletions, green for additions
- Summary statistics: Total entries, changed count, unchanged count
- Jump links: Quick navigation to changed entries
- Change indicators: Yellow dot marks changed rows
python scripts/subtitle_tool.py diff original.srt corrected.srt --html diff_report.html
Then open diff_report.html in a browser.
Validation Workflow
- Before correction: Run `analyze` to identify potential errors
- After correction: Run `validate` to ensure structural integrity
- Review changes: Run `diff --html report.html` to generate a visual comparison report
Validation Checks
- Entry count matches original
- All timestamps unchanged
- All sequence numbers preserved
- No formatting corruption
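The first three checks can be sketched in a few lines for cases where the script is not available. This is an illustrative approximation, not the logic of `scripts/subtitle_tool.py`; `structure` and `validate` here are hypothetical functions that compare only sequence numbers and timestamp lines.

```python
import re
from typing import List, Tuple

TIMESTAMP_RE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\s*$")


def structure(content: str) -> List[Tuple[str, str]]:
    """Extract (sequence number, timestamp line) pairs in file order."""
    lines = content.replace("\r\n", "\n").split("\n")
    pairs: List[Tuple[str, str]] = []
    for i, line in enumerate(lines):
        # The sequence number is assumed to sit on the line before the timestamp.
        if i > 0 and TIMESTAMP_RE.match(line):
            pairs.append((lines[i - 1].strip(), line))
    return pairs


def validate(original: str, corrected: str) -> List[str]:
    """Return a list of problems; an empty list means the structure is preserved."""
    before, after = structure(original), structure(corrected)
    problems: List[str] = []
    if len(before) != len(after):
        problems.append(f"entry count changed: {len(before)} -> {len(after)}")
    for (num_a, ts_a), (num_b, ts_b) in zip(before, after):
        if num_a != num_b:
            problems.append(f"sequence number changed: {num_a} -> {num_b}")
        if ts_a != ts_b:
            problems.append(f"timestamp changed in entry {num_a}: {ts_a} -> {ts_b}")
    return problems
```

Timestamp lines are compared as exact strings, so even a whitespace change on a timestamp line is reported.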
Example Interaction
Interaction 1: User uploads file without terms
User: 帮我修正这个字幕文件
Claude:
在开始修正之前,请提供一些关键术语,帮助我更准确地识别语音识别错误:
1. **专有名词**:人名、品牌名、产品名
2. **技术术语**:框架、库、工具名称
3. **领域词汇**:行业特定词汇
4. **其他关键词**:视频中频繁出现的重要词汇
请用逗号分隔,例如:`LangChain, Agent, OpenAI`
User: LangChain, Agent, OpenAI, LangGraph, checkpointer
Claude:
收到以下术语:
- 技术框架:LangChain, LangGraph, OpenAI
- 技术概念:Agent, checkpointer
这是一个 LangChain 相关的技术教程。还有补充吗?
User: 没有了,开始吧
Claude: [Proceeds with correction workflow]
Interaction 2: User provides terms upfront
User: 帮我修正字幕,术语:LangChain, OpenAI, Agent
Claude: [Confirms terms, then proceeds directly]
Correction Process
- Read uploaded .srt file
- Run `analyze` to identify potential errors
- Apply corrections using provided terms as primary reference
- Run `validate` to confirm structural integrity
- Save corrected file with `-corrected` suffix
- Generate diff report and present summary of changes
- Offer HTML report: Ask user if they want an HTML diff report for easier review
Output: Provide categorized summary of corrections made.
After completion, prompt user:
修正完成!我可以生成一个 HTML 差异报告,方便您在浏览器中查看所有修改。
需要生成 HTML 报告吗?
Correction complete! I can generate an HTML diff report for easier review in your browser.
Would you like me to generate the HTML report?