iopho-audio-director

Plan and produce the complete audio layer for product videos: BGM + VO + SFX → master audio.

Modes

Mode 1: `plan` — Generate audio strategy

Read the project's storyboard and context, then output an audio-plan.md with:

1. Architecture Decision

Total duration: {X}s @ {fps}fps = {frames} frames
VO coverage: {Y}% of scenes have voiceover
BGM strategy: {continuous bed | scene-matched segments | minimal/ambient}
SFX density: {minimal | moderate | rich}
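
The header arithmetic is easy to get wrong by a frame, so a minimal sketch of it follows. The function names, the rounding choice, and the scene counts are assumptions for illustration; only the 78s duration and the 30 fps constant come from the Proven Values section.

```python
def total_frames(duration_s: float, fps: int = 30) -> int:
    """Convert a duration in seconds to a whole number of frames."""
    return round(duration_s * fps)


def vo_coverage(scenes_with_vo: int, total_scenes: int) -> float:
    """Percentage of scenes that carry voiceover."""
    if total_scenes == 0:
        return 0.0
    return 100.0 * scenes_with_vo / total_scenes


# Hypothetical project: 78s video, 12 scenes, 10 of them with VO.
print(f"Total duration: 78s @ 30fps = {total_frames(78)} frames")
print(f"VO coverage: {vo_coverage(10, 12):.0f}% of scenes have voiceover")
```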

2. BGM Specification

Genre, BPM, mood, instrument palette
Suno prompt (use template from references/audio-recipes.md §3)

Scene-by-scene music energy map:

Scene 1 (0:00-0:08): Low energy — ambient pad, space for hook
Scene 2 (0:08-0:15): Rising — add percussion, build
Scene 3 (0:15-0:30): Peak — full arrangement for demo
...

3. VO Timing Table

Scene	Start (s)	Start (f)	VO Text	Est. Duration	Engine
hook-1a	5.3	160	"..."	3.5s	ElevenLabs
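
The sample row pairs a start of 5.3 s with frame 160. At 30 fps, frame 160 is 5.33 s, so the seconds column looks like it is rounded from the frame value. A small sketch under that assumption (frames as the source of truth; both helper names are hypothetical):

```python
FPS = 30  # project-wide constant


def frame_to_seconds(frame: int, fps: int = FPS) -> float:
    """Exact timestamp in seconds for a frame index."""
    return frame / fps


def vo_end_frame(start_frame: int, est_duration_s: float, fps: int = FPS) -> int:
    """Frame at which a VO segment is expected to finish."""
    return start_frame + round(est_duration_s * fps)


start = 160
print(round(frame_to_seconds(start), 1))  # seconds value for the table, 1 decimal
print(vo_end_frame(start, 3.5))           # last frame the 3.5s segment should occupy
```

Comparing each segment's end frame with the next segment's start frame is a quick way to catch overlapping VO before any audio is generated.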

4. SFX Placement

Timestamp	SFX	Trigger	Duration
0:05	UI click	Button press in demo	0.3s

5. Duck Schedule

Time Range	BGM Level	Reason
0:00-0:05	0dB	Music-only intro
0:05-0:09	-18dB	VO speaking
0:09-0:12	-12dB	Instrumental visual
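
As a rough illustration, a duck schedule like the one above can be turned into an ffmpeg filter string. This sketch uses chained `volume` filters with timeline `enable` expressions, which produces hard level steps with no fades. It is not the sidechain approach referenced in references/audio-recipes.md §2b, and the function name and tuple layout are assumptions.

```python
def build_duck_filter(schedule):
    """Build an ffmpeg -af string from (start_s, end_s, level_db) rows.

    Rows at 0 dB are skipped because they leave the BGM untouched.
    Returns an empty string if no row changes the level.
    """
    parts = []
    for start, end, level_db in schedule:
        if level_db == 0:
            continue
        # gte/lt keeps adjacent ranges from overlapping at the boundary.
        parts.append(
            f"volume=volume={level_db:g}dB:"
            f"enable='gte(t,{start:g})*lt(t,{end:g})'"
        )
    return ",".join(parts)


# Times in seconds, taken from the example schedule.
schedule = [(0, 5, 0), (5, 9, -18), (9, 12, -12)]
print(build_duck_filter(schedule))
```

The resulting string would be passed as the value of `-af` when processing the BGM file on its own, before mixing it with the VO.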

Mode 2: `assemble` — Execute the audio plan

Takes individual audio files and combines them into the master audio track.

Inputs needed:

bgm.mp3 — background music (trimmed to video length)
master-vo.mp3 — assembled voiceover (from /iopho-voiceover-tts assemble)
SFX files (if any) — with target timestamps

Assembly sequence:

Trim BGM to video length with fade-out:

ffmpeg -i bgm-full.mp3 -t {duration} -af "afade=t=out:st={duration-3}:d=3" bgm.mp3

Duck BGM under VO (sidechain compress): See references/audio-recipes.md §2b for the ffmpeg command.
Layer SFX at timestamps: See references/audio-recipes.md §2d for adelay placement.
Normalize to platform target (see references/audio-recipes.md §2c):
- YouTube: -16 LUFS
- Social media: -14 LUFS
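
The trim command in the sequence above leaves `{duration}` and `{duration-3}` as placeholders. A minimal sketch of filling them in from Python, assuming a 3s fade-out as in the command shown (the function name is hypothetical):

```python
def trim_command(src, dst, duration_s, fade_s=3.0):
    """Return the ffmpeg argument list that trims src and fades out the tail."""
    # Clamp so very short clips do not produce a negative fade start.
    fade_start = max(duration_s - fade_s, 0)
    return [
        "ffmpeg", "-i", src,
        "-t", f"{duration_s:g}",
        "-af", f"afade=t=out:st={fade_start:g}:d={fade_s:g}",
        dst,
    ]


cmd = trim_command("bgm-full.mp3", "bgm.mp3", 78)
print(" ".join(cmd))
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting problems if file names contain spaces.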

Export master audio:

{project}/audio/master-audio.mp3       — final mix
{project}/audio/master-audio-loud.mp3  — social media version (-14 LUFS)
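
One possible way to produce the two exports is ffmpeg's `loudnorm` filter in single-pass mode. The sketch below is an assumption, not the command from references/audio-recipes.md §2c: the `TP` and `LRA` values are common defaults rather than values from this skill, `mix.wav` is a hypothetical pre-normalization mix, and mapping `master-audio.mp3` to the -16 LUFS target is inferred from the platform list.

```python
TARGETS_LUFS = {"youtube": -16, "social": -14}


def normalize_command(src, dst, platform):
    """Return an ffmpeg argument list that normalizes src to the platform target."""
    target = TARGETS_LUFS[platform]
    # TP (true peak) and LRA (loudness range) are assumed values.
    loudnorm = f"loudnorm=I={target}:TP=-1.5:LRA=11"
    return ["ffmpeg", "-i", src, "-af", loudnorm, dst]


for platform, out_name in [("youtube", "master-audio.mp3"),
                           ("social", "master-audio-loud.mp3")]:
    print(" ".join(normalize_command("mix.wav", out_name, platform)))
```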

Mode 3: `analyze` — Inspect existing audio

Run scripts/analyze_audio.py on any audio file to get:

BPM (global + dynamic), beats, onsets
Energy envelope, key moments
Structural sections

python3 scripts/analyze_audio.py path/to/audio.mp3 --output analysis.json
python3 scripts/analyze_audio.py path/to/audio.mp3 --gemini  # optional AI analysis

Use analysis output to:

Match BGM tempo to scene cuts
Verify duck timing aligns with VO segments
Find beat-sync points for MG animations
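
For the beat-sync use case, beat times need to be mapped onto frames. The sketch below assumes the analysis yields beat times as a list of seconds. The actual layout of analysis.json is not documented here, so the input is passed in directly, and the beat values are made up.

```python
def beats_to_frames(beat_times_s, fps=30):
    """Map beat timestamps (seconds) to the nearest frame index."""
    return [round(t * fps) for t in beat_times_s]


def nearest_beat_frame(cut_frame, beat_frames):
    """Pick the beat frame closest to a planned scene cut."""
    return min(beat_frames, key=lambda f: abs(f - cut_frame))


beats = [0.0, 0.5, 1.0, 1.5, 2.0]  # made-up beats at 120 BPM
frames = beats_to_frames(beats)
print(frames)
print(nearest_beat_frame(40, frames))  # cut planned at frame 40
```

Snapping a cut to the nearest beat frame shifts it by at most half a beat, which at 120 BPM and 30 fps is 7 or 8 frames.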

Decision Guide

Question	Answer	Action
Have a storyboard?	Yes	Start with `plan` mode
Have a storyboard?	No	Run `/iopho-video-director` Phase 1 first
Need BGM?	Generate new	Use Suno prompt template from recipes §3
Need BGM?	Have a track	Skip to `assemble` mode
Need VO?	Yes	Run `/iopho-voiceover-tts` first, then `assemble`
Need VO?	No (music video / MG)	Simpler: just trim + normalize BGM
Platform?	YouTube	Target -16 LUFS
Platform?	TikTok/Reels	Target -14 LUFS (louder)

Proven Values

These settings were used in a production 78s explainer with 10 VO segments:

Parameter	Value	Source
VO duck level	-18dB	{project}-audio-plan.md
Instrumental duck	-12dB	{project}-audio-plan.md
Duck fade-down	200ms	tested in ffmpeg
Duck fade-up	500ms	tested in ffmpeg
VO rate-limit (ElevenLabs)	0.5s between calls	API throttle
MP3 bitrate	128-192kbps	balance quality/size
Sample rate	44100Hz	standard (48kHz for App Store)
FPS constant	30	project-wide
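
To show how the duck level and the two fade times interact, here is a sketch of the BGM gain around a single VO segment. The linear-in-dB ramps and the placement of the fades (dip finishing as the VO starts, recovery beginning when it ends) are assumptions; the table does not specify them.

```python
def bgm_gain_db(t, vo_start, vo_end, duck_db=-18.0, fade_down=0.2, fade_up=0.5):
    """BGM gain in dB at time t (seconds) around one VO segment."""
    ramp_start = vo_start - fade_down  # assumption: dip completes as VO begins
    ramp_end = vo_end + fade_up        # assumption: recovery starts when VO ends
    if t <= ramp_start or t >= ramp_end:
        return 0.0
    if t < vo_start:
        # Fade-down: 0 dB -> duck_db over fade_down seconds.
        return duck_db * (t - ramp_start) / fade_down
    if t <= vo_end:
        return duck_db
    # Fade-up: duck_db -> 0 dB over fade_up seconds.
    return duck_db * (ramp_end - t) / fade_up


for t in (4.0, 5.0, 7.0, 9.25, 10.0):
    print(t, bgm_gain_db(t, vo_start=5.0, vo_end=9.0))
```

The slower fade-up (500ms against 200ms down) means the music returns gradually after each line instead of jumping back.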

Audio Recipes Reference

See references/audio-recipes.md for:

§1: Duck levels table (full reference)
§2: FFmpeg commands (duck, sidechain, normalize, SFX, trim, analyze)
§3: Suno music prompt templates with examples (reader app + CLI tool)
§4: SFX catalog
§5: 3-layer architecture diagram
§6: Platform-specific audio specs

Related Skills

/iopho-voiceover-tts — produces master-vo.mp3 that this skill ducks under BGM
/iopho-video-director — calls this at Phase 1 (plan) and Phase 2 (assemble)
/suno-music-creator — generates BGM from Suno prompts (external skill)
/remotion-best-practices → rules/audio.md — integrates master-audio into Remotion
/iopho-product-context — reads context.md for brand tone → influences BGM mood
