skills/yonatangross/orchestkit/audio-mixing-patterns

audio-mixing-patterns

SKILL.md

Audio Mixing Patterns

Comprehensive guide to audio mixing for video production using ffmpeg. Covers narration/music balancing, automatic ducking, timing control, and loudness normalization.

Core Principle

Quality Audio = Clear Narration + Supportive Music + Appropriate Levels

The human voice occupies 85-255 Hz (fundamental) with harmonics up to 8kHz. Music must support, not compete.

Volume Balancing Formula

Standard Video Mix Ratios:
--------------------------
Narration:  100% (reference level)
Music:      15-20% of narration level
SFX:        70-100% of narration level (contextual)

dB Relationships:
-----------------
Narration:  -14 dB LUFS (dialogue standard)
Music bed:  -30 to -35 dB LUFS (under narration)
Music only: -16 dB LUFS (no narration sections)
SFX:        -18 to -20 dB LUFS

Volume Multiplier Quick Reference

Ratio Multiplier Use Case
100% 1.0 Full volume (narration)
70% 0.7 Prominent SFX
50% 0.5 Equal blend
30% 0.3 Noticeable background
20% 0.2 Subtle bed (recommended music)
15% 0.15 Minimal presence
10% 0.1 Barely audible

Basic ffmpeg Mixing Commands

Two-Track Mix (Narration + Music)

# Basic mix: narration at full, music at 15%
ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "[0:a]volume=1.0[narr];[1:a]volume=0.15[music];[narr][music]amix=inputs=2:duration=first" \
  -c:a aac -b:a 192k output.m4a

Three-Track Mix (Narration + Music + SFX)

ffmpeg -i narration.mp3 -i music.mp3 -i sfx.mp3 \
  -filter_complex "\
    [0:a]volume=1.0[narr];\
    [1:a]volume=0.15[music];\
    [2:a]volume=0.7[sfx];\
    [narr][music][sfx]amix=inputs=3:duration=first:weights='3 1 2'" \
  -c:a aac -b:a 192k output.m4a

Timing with adelay Filter

The adelay filter positions audio at precise timestamps.

Syntax

adelay=delays[|delays...][,all=1]
# delays: milliseconds or samples (with 'S' suffix)
# all=1: apply same delay to all channels

Position Music at Specific Time

# Start music at 5 seconds
ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "\
    [0:a]volume=1.0[narr];\
    [1:a]adelay=5000|5000,volume=0.15[music];\
    [narr][music]amix=inputs=2:duration=first" \
  output.m4a

Multiple Timed Audio Cues

# Narration starts at 0, music at 2s, SFX at 5.5s
ffmpeg -i narration.mp3 -i music.mp3 -i sfx.wav \
  -filter_complex "\
    [0:a]volume=1.0[narr];\
    [1:a]adelay=2000|2000,volume=0.15[music];\
    [2:a]adelay=5500|5500,volume=0.7[sfx];\
    [narr][music][sfx]amix=inputs=3:duration=longest" \
  output.m4a

Audio Ducking

Automatically lower music when speech is present.

Simple Sidechain Compression (Ducking)

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "\
    [0:a]asplit=2[narr][sc];\
    [1:a][sc]sidechaincompress=threshold=0.02:ratio=10:attack=50:release=500[ducked];\
    [narr][ducked]amix=inputs=2:duration=first" \
  output.m4a

Parameters Explained

Parameter Value Effect
threshold 0.02 (default 0.125) Lower = more sensitive to speech
ratio 10:1 How much to reduce (10:1 = significant duck)
attack 50ms How fast to duck when speech starts
release 500ms How fast to return after speech ends
knee 2.82843 Softness of compression curve

Advanced Ducking with Precise Control

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "\
    [0:a]asplit=2[narr][sc];\
    [1:a]volume=0.5[music_pre];\
    [music_pre][sc]sidechaincompress=\
      threshold=0.015:\
      ratio=15:\
      attack=30:\
      release=800:\
      makeup=1:\
      knee=6[ducked];\
    [narr][ducked]amix=inputs=2:duration=first:weights='1 0.4'" \
  output.m4a

Mix Ratios by Content Type

Content Type          | Narration | Music | SFX  | Notes
---------------------|-----------|-------|------|------------------
Tutorial/How-to      | 100%      | 10%   | 50%  | Voice clarity critical
Corporate/Business   | 100%      | 15%   | 60%  | Professional presence
Social Media         | 100%      | 20%   | 80%  | Higher energy
Documentary          | 100%      | 25%   | 100% | Cinematic feel
Promo/Advertising    | 100%      | 30%   | 100% | Impactful
Music Video          | 50%       | 100%  | 80%  | Music dominant
Podcast              | 100%      | 5%    | 30%  | Minimal distraction
E-learning           | 100%      | 8%    | 40%  | Focus on retention

Loudness Normalization (LUFS)

LUFS (Loudness Units Full Scale) is the broadcast standard for perceived loudness.

Target Levels by Platform

Platform Target LUFS True Peak Notes
YouTube -14 LUFS -1 dB TP Auto-normalized
Spotify -14 LUFS -1 dB TP Loudness penalty applied
Apple Music -16 LUFS -1 dB TP Sound Check
Broadcast TV -24 LUFS -2 dB TP EBU R128 standard
Podcast -16 to -19 LUFS -1 dB TP Apple spec
TikTok/Reels -14 LUFS -1 dB TP Mobile optimization

Loudness Normalization Command

# Normalize to -14 LUFS (YouTube/Spotify standard)
ffmpeg -i input.mp3 \
  -af loudnorm=I=-14:TP=-1:LRA=11 \
  -c:a aac -b:a 192k output.m4a

Two-Pass Normalization (More Accurate)

# Pass 1: Analyze
ffmpeg -i input.mp3 \
  -af loudnorm=I=-14:TP=-1:LRA=11:print_format=json \
  -f null - 2>&1 | grep -A 12 "output_i"

# Pass 2: Apply measured values
ffmpeg -i input.mp3 \
  -af loudnorm=I=-14:TP=-1:LRA=11:\
measured_I=-18.5:measured_TP=-2.3:measured_LRA=8.2:\
measured_thresh=-28.5:\
linear=true \
  -c:a aac -b:a 192k output.m4a

Multi-Track Production Pipeline

Complete Video Audio Mix

ffmpeg -i video.mp4 -i narration.wav -i music.mp3 -i sfx.wav \
  -filter_complex "\
    [1:a]volume=1.0,aformat=sample_fmts=fltp:sample_rates=48000:channel_layouts=stereo[narr];\
    [2:a]volume=0.15,aformat=sample_fmts=fltp:sample_rates=48000:channel_layouts=stereo[music];\
    [3:a]adelay=3000|3000,volume=0.7,aformat=sample_fmts=fltp:sample_rates=48000:channel_layouts=stereo[sfx];\
    [narr][music][sfx]amix=inputs=3:duration=first:normalize=0[mixed];\
    [mixed]loudnorm=I=-14:TP=-1:LRA=11[final]" \
  -map 0:v -map "[final]" \
  -c:v copy -c:a aac -b:a 192k \
  output.mp4

Audio-Only Master Mix

ffmpeg -i narration.wav -i music.mp3 -i intro_sfx.wav -i outro_sfx.wav \
  -filter_complex "\
    [0:a]volume=1.0[narr];\
    [1:a]volume=0.15[music];\
    [2:a]adelay=0|0,volume=0.8[intro];\
    [3:a]adelay=55000|55000,volume=0.8[outro];\
    [narr][music][intro][outro]amix=inputs=4:duration=longest:weights='3 1 2 2'[mix];\
    [mix]loudnorm=I=-14:TP=-1[final]" \
  -map "[final]" -c:a aac -b:a 256k master_audio.m4a

Quick Reference: Common Patterns

Pattern 1: Narration + Background Music

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "[0:a]volume=1.0[n];[1:a]volume=0.15[m];[n][m]amix=inputs=2:duration=first" \
  output.m4a

Pattern 2: Music with Auto-Duck

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "[0:a]asplit=2[n][sc];[1:a][sc]sidechaincompress=threshold=0.02:ratio=10[d];[n][d]amix=inputs=2" \
  output.m4a

Pattern 3: Timed Intro Music Fade

ffmpeg -i narration.mp3 -i intro_music.mp3 \
  -filter_complex "\
    [1:a]afade=t=out:st=8:d=2,volume=0.3[intro];\
    [0:a]adelay=10000|10000[narr];\
    [intro][narr]amix=inputs=2:duration=longest" \
  output.m4a

Pattern 4: Crossfade Between Segments

ffmpeg -i segment1.mp3 -i segment2.mp3 \
  -filter_complex "\
    [0:a]afade=t=out:st=28:d=2[s1];\
    [1:a]adelay=28000|28000,afade=t=in:d=2[s2];\
    [s1][s2]amix=inputs=2:duration=longest" \
  output.m4a

Troubleshooting

Issue Cause Solution
Clipping/distortion Combined levels too high Reduce individual volumes or add limiter
Narration buried Music too loud Reduce music to 10-15%, add ducking
Hollow/thin sound Phase cancellation Check mono compatibility
Pumping artifacts Aggressive ducking Increase attack/release times
Inconsistent levels No normalization Apply loudnorm filter

Add Limiter to Prevent Clipping

ffmpeg -i input.mp3 \
  -af "alimiter=level_in=1:level_out=0.9:limit=0.95:attack=5:release=50" \
  output.m4a

Related Skills

  • video-pacing: Video rhythm and timing patterns
  • remotion-composer: Programmatic video generation
  • demo-producer: Product demo video production
  • thumbnail-first-frame: Video thumbnail optimization

References

Weekly Installs
7
GitHub Stars
91
First Seen
Feb 2, 2026
Installed on
claude-code5
antigravity4
github-copilot4
gemini-cli4
opencode4
windsurf3