Multimedia Accessibility

Design multimedia content that can be consumed by people who can't hear the audio, can't see the video, or can't process both at once.

Core Principle

Every piece of information conveyed through one sense must also be available through another. Audio content needs a visual equivalent. Visual content needs an audio or text equivalent.

Captions

What Captions Include

All spoken dialogue, attributed to the speaker
Relevant sound effects: [door slams], [phone rings], [laughter]
Music that conveys mood or meaning: [upbeat music], [tense score]
Off-screen sounds that matter: [footsteps approaching]
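
As one way to picture these elements in a caption file, the sketch below renders cues as WebVTT, a widely supported caption format. The `Cue` class and the helper names are hypothetical, chosen for illustration; the `<v Speaker>` voice tag and the timestamp layout follow WebVTT itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    start: float                   # seconds from the start of the media
    end: float                     # seconds from the start of the media
    text: str                      # dialogue, or a bracketed sound description
    speaker: Optional[str] = None  # None for sound effects and music


def vtt_time(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Render cues as a WebVTT document, attributing dialogue with <v> tags."""
    blocks = ["WEBVTT"]
    for cue in cues:
        body = f"<v {cue.speaker}>{cue.text}" if cue.speaker else cue.text
        blocks.append(f"{vtt_time(cue.start)} --> {vtt_time(cue.end)}\n{body}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_webvtt([
        Cue(0.0, 1.5, "[door slams]"),
        Cue(1.5, 4.0, "Who's there?", speaker="Maya"),
        Cue(4.0, 6.0, "[footsteps approaching]"),
    ]))
```

Sound effects and music cues carry no speaker, so they are written as plain bracketed text, while dialogue gets a voice tag that players can use for speaker attribution.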

Caption Quality

Synchronised with speech (within 1 second)
Minimum 1 second display time, maximum 2 lines
No more than 32 characters per line for readability
Proper punctuation and grammar; never publish auto-generated captions without review
Speaker identification when multiple speakers are present
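
The measurable limits above (display time, line count, line length) can be checked automatically. The following is a minimal sketch, assuming each cue is available as a start time, an end time, and text with lines separated by newlines; the function name and message wording are illustrative. Synchronisation, grammar, and speaker identification still need human review.

```python
MAX_LINES = 2            # maximum lines per caption
MAX_CHARS_PER_LINE = 32  # maximum characters per line
MIN_DURATION_S = 1.0     # minimum display time in seconds


def check_cue(start: float, end: float, text: str) -> list[str]:
    """Return a list of problems found in one caption cue (empty if none)."""
    problems = []
    duration = end - start
    if duration < MIN_DURATION_S:
        problems.append(
            f"shown for {duration:.2f}s, minimum is {MIN_DURATION_S:.0f}s"
        )
    lines = text.split("\n")
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines, maximum is {MAX_LINES}")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(
                f"line {number} has {len(line)} characters, "
                f"maximum is {MAX_CHARS_PER_LINE}"
            )
    return problems


if __name__ == "__main__":
    for problem in check_cue(12.0, 12.4, "This caption line is far too long to read comfortably"):
        print(problem)
```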

Auto-Captions Are Not Enough

Auto-generated captions typically have 80–85% accuracy
That means roughly 1 word in every 5 to 7 is wrong
Always review and correct auto-captions
Names, technical terms, and accented speech fail most often
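
To put a number on caption accuracy during review, one common measure is word error rate (WER): the word-level edit distance between a corrected reference and the auto-generated text, divided by the reference length. The sketch below is a plain implementation of that measure, not any particular tool's method. It assumes both texts are simple strings, compares words case-insensitively, and does not strip punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    corrected = "the quick brown fox jumps"
    automatic = "the quick brown box jumps"
    wer = word_error_rate(corrected, automatic)
    print(f"WER {wer:.0%}, accuracy {1 - wer:.0%}")
```

One wrong word in five gives a WER of 20%, which corresponds to the 80% accuracy figure above.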

Transcripts

When to Provide

Every audio-only file (podcast, voice memo, audio article)
Every video (as a complement to captions)
Transcripts serve people who are deaf, hard of hearing, or who prefer reading to watching/listening

What Transcripts Include

All spoken content with speaker labels
Descriptions of relevant visual content (for video transcripts)
Relevant sound effects and music cues
Timestamps for longer content (helps navigation)
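
A transcript with these elements might be assembled as in the sketch below. It assumes each entry is a tuple of start time in seconds, an optional speaker, and text, with visual descriptions and sound cues written in brackets and given no speaker. The function names and the entry layout are illustrative, not a standard.

```python
def clock(seconds: float) -> str:
    """Format seconds as HH:MM:SS for a transcript timestamp."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(entries) -> str:
    """Render (start_seconds, speaker_or_None, text) entries as plain text."""
    lines = []
    for start, speaker, text in entries:
        label = f"{speaker}: " if speaker else ""
        lines.append(f"[{clock(start)}] {label}{text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_transcript([
        (0, None, "[upbeat music]"),
        (5, "Host", "Welcome back to the show."),
        (75, None, "[Slide: bar chart of monthly listeners]"),
        (78, "Guest", "As you can see, numbers doubled in March."),
    ]))
```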

Transcript Placement

Link directly below or beside the media player
Label clearly: "Read the full transcript"
Don't hide the transcript behind multiple clicks
Make transcripts searchable

Audio Descriptions

When to Provide

When video content conveys important visual information that isn't in the dialogue
Presentations where speakers reference slides or visuals
Tutorials that demonstrate visual actions

What to Describe

On-screen text not read aloud
Actions and gestures that carry meaning
Scene changes and visual context
Charts, graphs, and visual data shown on screen

How to Describe

Fit descriptions into natural pauses in dialogue
Be concise — describe what matters, not everything visible
For dense visual content: provide an extended audio description version where the video pauses for description
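
Finding natural pauses can be partly automated when dialogue timings are already known, for example from the caption file. The sketch below lists gaps between dialogue segments that are long enough to hold a description. The 2-second threshold is an assumption for illustration, not a figure from this guide; if too few gaps turn up, that suggests the extended version is needed.

```python
def find_pauses(dialogue, media_end: float, min_pause: float = 2.0):
    """Return (start, end) gaps with no dialogue lasting at least min_pause.

    dialogue is a list of (start, end) times in seconds; segments may overlap.
    """
    pauses = []
    cursor = 0.0  # end of the latest dialogue seen so far
    for start, end in sorted(dialogue):
        if start - cursor >= min_pause:
            pauses.append((cursor, start))
        cursor = max(cursor, end)
    if media_end - cursor >= min_pause:
        pauses.append((cursor, media_end))
    return pauses


if __name__ == "__main__":
    speech = [(0.0, 4.0), (7.0, 10.0), (10.5, 12.0)]
    for start, end in find_pauses(speech, media_end=15.0):
        print(f"room for a description from {start:.1f}s to {end:.1f}s")
```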

Media Player Requirements

Keyboard-accessible play, pause, stop, volume, and seek controls
Visible captions toggle
Speed controls (0.5x to 2x) — essential for cognitive accessibility
Volume control independent of system volume
No autoplay — let the user choose when to start
Visible progress bar with time display
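
These requirements can be turned into an automated check during review. The sketch below audits a player's settings against the list above; `PlayerConfig` and its field names are hypothetical and would need to be mapped onto whatever the actual player exposes.

```python
from dataclasses import dataclass, field

REQUIRED_KEYBOARD_CONTROLS = {"play", "pause", "stop", "volume", "seek"}


@dataclass
class PlayerConfig:
    # Hypothetical settings object; defaults describe an inaccessible player.
    keyboard_controls: set[str] = field(default_factory=set)
    captions_toggle_visible: bool = False
    min_speed: float = 1.0
    max_speed: float = 1.0
    own_volume_control: bool = False
    autoplay: bool = True
    progress_bar_with_time: bool = False


def audit_player(cfg: PlayerConfig) -> list[str]:
    """Return the requirements this configuration fails (empty if none)."""
    failures = []
    missing = REQUIRED_KEYBOARD_CONTROLS - cfg.keyboard_controls
    if missing:
        failures.append("no keyboard access to: " + ", ".join(sorted(missing)))
    if not cfg.captions_toggle_visible:
        failures.append("captions toggle is not visible")
    if cfg.min_speed > 0.5 or cfg.max_speed < 2.0:
        failures.append("speed range does not cover 0.5x to 2x")
    if not cfg.own_volume_control:
        failures.append("no volume control independent of system volume")
    if cfg.autoplay:
        failures.append("autoplay is enabled")
    if not cfg.progress_bar_with_time:
        failures.append("no visible progress bar with time display")
    return failures


if __name__ == "__main__":
    for failure in audit_player(PlayerConfig(keyboard_controls={"play", "pause"})):
        print(failure)
```

A check like this only confirms that settings are present. Whether the controls work with a keyboard in practice still has to be tested by hand.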

Assessment Questions

Does every video have accurate, synchronised captions?
Does every audio file have a transcript?
Are auto-captions reviewed and corrected?
Is important visual information audio-described?
Is the media player fully keyboard accessible?
Is autoplay disabled?
