media-accessibility
Media Accessibility Skill
Reference data for video, audio, and streaming media accessibility. Used by media-accessibility agent and any web agent encountering <video> or <audio> elements.
WCAG 1.2 Time-Based Media Criteria
| SC | Level | Requirement | Test |
|---|---|---|---|
| 1.2.1 | A | Audio-only/video-only: provide text transcript or audio track | Transcript exists alongside media |
| 1.2.2 | A | Captions for prerecorded audio in video | Synchronized captions present |
| 1.2.3 | A | Audio description OR text alternative for prerecorded video | Description track or full transcript |
| 1.2.4 | AA | Captions for live audio in video | Real-time captioning active |
| 1.2.5 | AA | Audio description for prerecorded video | Description track present |
| 1.2.6 | AAA | Sign language for prerecorded audio | Sign language interpretation video |
| 1.2.7 | AAA | Extended audio description | Paused video for long descriptions |
| 1.2.8 | AAA | Media alternative for prerecorded video | Full text alternative document |
| 1.2.9 | AAA | Audio-only (live): text alternative | Real-time text transcript |
Caption File Formats
WebVTT (Web Video Text Tracks)
WEBVTT
00:00:01.000 --> 00:00:04.000
Welcome to the accessibility course.
00:00:04.500 --> 00:00:08.000
Today we'll cover caption best practices.
00:00:08.500 --> 00:00:12.000
<v Speaker 2>Let's start with file formats.
Key rules:
- File must start with
WEBVTTheader - Timestamps:
HH:MM:SS.mmm --> HH:MM:SS.mmm - Speaker identification:
<v Speaker Name>Text - Maximum 2 lines per caption, 32 characters per line
- Minimum display time: 1 second
- Maximum display time: 7 seconds
SRT (SubRip Text)
1
00:00:01,000 --> 00:00:04,000
Welcome to the accessibility course.
2
00:00:04,500 --> 00:00:08,000
Today we'll cover caption best practices.
Key rules:
- Sequential numbering starting at 1
- Timestamps use comma for milliseconds (not period)
- Blank line between entries
- No styling support (unlike WebVTT)
TTML (Timed Text Markup Language)
- XML-based, used in broadcast and streaming
- Supports styling, positioning, and region layout
- IMSC (Internet Media Subtitles and Captions) is the web profile
Caption Quality Guidelines
| Metric | Requirement |
|---|---|
| Accuracy | 99%+ word accuracy |
| Synchronization | Within 1 second of audio |
| Speaker identification | Required when 2+ speakers |
| Sound effects | Describe in brackets: [applause], [phone rings] |
| Music | Describe: [upbeat music], [dramatic orchestral score] |
| Caption rate | Maximum 200 words per minute (3 words/second) |
| Line length | Maximum 32 characters per line |
| Lines per caption | Maximum 2 lines |
| Placement | Bottom center default; move for on-screen text |
Audio Description
Audio description narrates visual information during pauses in dialogue.
Requirements:
- Describe: actions, scene changes, on-screen text, facial expressions relevant to plot
- Don't describe: obvious audio cues, subjective interpretations
- Timing: fit into natural pauses; for extended descriptions (1.2.7), video pauses automatically
- Voice: distinct from program audio, clear and neutral
HTML implementation:
<video controls>
<source src="video.mp4" type="video/mp4">
<track kind="captions" src="captions-en.vtt" srclang="en" label="English" default>
<track kind="descriptions" src="descriptions-en.vtt" srclang="en" label="English Audio Descriptions">
</video>
Accessible Media Player ARIA Patterns
Minimum Controls
Every media player must have keyboard-accessible controls for:
- Play/Pause (
role="button",aria-label="Play"/aria-label="Pause") - Volume (
role="slider",aria-label="Volume",aria-valuemin,aria-valuemax,aria-valuenow) - Seek/Progress (
role="slider",aria-label="Seek",aria-valuetext="2 minutes 30 seconds") - Captions toggle (
role="button",aria-pressed,aria-label="Toggle captions") - Fullscreen (
role="button",aria-label="Enter fullscreen"/aria-label="Exit fullscreen")
Live Region Announcements
<div aria-live="polite" class="sr-only" id="player-status">
<!-- Announce: "Playing", "Paused", "Video ended", "Captions on", "Captions off" -->
</div>
Keyboard Shortcuts (Media Player Convention)
| Key | Action |
|---|---|
| Space / Enter | Play/Pause |
| Left Arrow | Rewind 5 seconds |
| Right Arrow | Forward 5 seconds |
| Up Arrow | Volume up |
| Down Arrow | Volume down |
| M | Mute/Unmute |
| C | Toggle captions |
| F | Toggle fullscreen |
Transcript Best Practices
- Provide a full-text transcript adjacent to or linked from the media
- Include speaker identification, timestamps (optional but helpful), and non-speech audio
- Transcripts benefit deaf users, search engines, and users who prefer reading
- Interactive transcripts (click a line to jump to that point) are excellent UX
Live Captioning
- WCAG 1.2.4 (AA) requires captions for live audio in synchronized media
- Options: human captioner (CART), AI-powered auto-captioning, hybrid
- AI auto-captions alone rarely meet the 99% accuracy requirement — human review or correction is recommended
- WebSocket-based caption delivery for web: push text to a
<div aria-live="polite">container
Common Violations
| Issue | WCAG SC | Severity |
|---|---|---|
| No captions on prerecorded video | 1.2.2 | Critical |
| Auto-generated captions without review | 1.2.2 | Serious |
| No audio description for visual-only content | 1.2.5 | Serious |
| Inaccessible media player controls | 2.1.1, 4.1.2 | Critical |
| No keyboard access to play/pause | 2.1.1 | Critical |
| Autoplay without user control | 1.4.2 | Serious |
| Volume slider not keyboard-operable | 2.1.1 | Serious |
| No transcript for audio-only content | 1.2.1 | Critical |
| Captions poorly synchronized (>2s drift) | 1.2.2 | Moderate |
| Missing speaker identification in captions | 1.2.2 | Moderate |
More from taylorarndt/a11y-agent-team
framework-accessibility
Framework-specific accessibility patterns and fix templates for React, Vue, Angular, Svelte, Next.js, and Tailwind CSS.
28document-scanning
Document discovery, inventory building, and metadata extraction for accessibility audits. Use when scanning folders for Office documents (.docx, .xlsx, .pptx) and PDFs, building file inventories, detecting changes via git diff, or extracting document properties like title, author, and language.
25github-analytics-scoring
Scoring formulas and analytical frameworks for GitHub workflow agents. Covers repository health scoring (0-100, A-F grades), priority scoring for issues/PRs/discussions, confidence levels for analytics findings, delta tracking (Fixed/New/Persistent/Regressed), velocity metrics, contributor metrics, bottleneck detection, and trend classification. Use when computing scores, tracking remediation progress, building prioritized dashboards, or detecting workflow bottlenecks.
25github-scanning
GitHub data collection patterns for workflow agents. Covers search query construction by intent, date range handling, repository scope narrowing, preferences.md integration, cross-repo intelligence, parallel stream collection model, and auto-recovery for empty results. Use when building agents that search GitHub for issues, PRs, discussions, releases, security alerts, or CI status.
22accessibility-rules
Cross-format document accessibility rule reference with WCAG 2.2 mapping. Use when looking up accessibility rules for Word (DOCX-*), Excel (XLSX-*), PowerPoint (PPTX-*), or PDF (PDFUA.*, PDFBP.*, PDFQ.*) documents, or when mapping findings to WCAG success criteria for compliance reporting.
21github-workflow-standards
Core standards for all GitHub workflow agents. Covers authentication, smart defaults, repository discovery, dual MD+HTML output, screen-reader-compliant HTML accessibility standards, safety rules, progress announcements, parallel execution, and output quality. Apply when building any GitHub workflow agent - issues, PRs, briefings, analytics, community reports, team management.
20