Beat-Sync Video Editing
Purpose
Provide domain expertise for creating beat-synced video edits: taking a source video and an audio track, selecting clips from the video that align with the music's rhythm, and rendering the final output with FFmpeg.
Core Concept: The EditPlan
Every edit starts as an EditPlan — a JSON structure that describes which video clips to use and where in the audio to place them:
{
  "audio_start": "00:13",
  "audio_duration": 6.5,
  "clips": [
    { "video_start": "00:08", "duration": 2.0, "description": "Opening shot" },
    { "video_start": "00:45", "duration": 1.5, "description": "Action moment" },
    { "video_start": "01:22", "duration": 3.0, "description": "Build-up" }
  ],
  "reasoning": "Matches rising intensity with beat drops"
}
Timestamp format: audio_start and video_start use MM:SS strings (e.g. "01:15" for 1 minute 15 seconds). audio_duration and clip duration use numbers in seconds.
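As an illustration of the timestamp format, the sketch below converts an MM:SS string to whole seconds, which is the unit the FFmpeg filters use. The helper name mmss_to_seconds is hypothetical and not part of the plugin scripts, and the strict two-digit pattern (minutes 00-99, seconds 00-59) is an assumption; the real validator may accept other forms.

```shell
# Hypothetical helper (not part of the plugin): convert "MM:SS" to whole seconds.
# Assumes exactly two digits for minutes and two digits (00-59) for seconds.
mmss_to_seconds() {
  case "$1" in
    [0-9][0-9]:[0-5][0-9]) ;;
    *) echo "invalid MM:SS timestamp: $1" >&2; return 1 ;;
  esac
  minutes=${1%%:*}
  seconds=${1##*:}
  # Drop one leading zero so values such as "08" are not read as invalid octal.
  minutes=${minutes#0}
  seconds=${seconds#0}
  echo $(( minutes * 60 + seconds ))
}

mmss_to_seconds "01:15"   # prints 75
```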
Critical constraints:
- audio_start must be valid MM:SS format and non-negative
- audio_duration must be positive (seconds)
- Every clip must have a valid MM:SS video_start and a positive duration (seconds)
- The sum of all clip durations must equal audio_duration (within 0.5s tolerance)
- Clip order is intentional and not necessarily chronological. Non-linear ordering creates dynamic edits.
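The duration constraint can be checked by hand before running the validator. The following is a minimal sketch, not the logic of validate-plan.sh: durations_match is a hypothetical helper, the durations are passed as plain arguments rather than parsed from the JSON, and treating the 0.5s tolerance as inclusive is an assumption.

```shell
# Hypothetical helper: durations_match AUDIO_DURATION CLIP_DURATION...
# Exits 0 when the clip durations sum to AUDIO_DURATION within 0.5 seconds.
# The inclusive tolerance is an assumption, not taken from validate-plan.sh.
durations_match() {
  audio_duration=$1
  shift
  awk -v target="$audio_duration" -v tol=0.5 'BEGIN {
    total = 0
    # Every remaining argument is one clip duration in seconds.
    for (i = 1; i < ARGC; i++) total += ARGV[i]
    diff = total - target
    if (diff < 0) diff = -diff
    exit ((diff <= tol) ? 0 : 1)
  }' "$@"
}

# The example plan above: 2.0 + 1.5 + 3.0 = 6.5
if durations_match 6.5 2.0 1.5 3.0; then echo "durations OK"; fi
```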
Workflow: From Files to Final Video
Step 1: Generate EditPlan via Gemini
Run the Gemini script to analyze video + audio and produce a plan:
# Fresh upload (files auto-deleted after):
bash ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh \
--video <video_path> \
--audio <audio_path> \
--prompt "<user's edit description>"
# Keep files for reuse (outputs ECLIPTIC_FILES JSON to stderr):
bash ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh \
--video <video_path> \
--audio <audio_path> \
--prompt "<description>" --no-cleanup
# Reuse previously uploaded files (skips upload, much faster):
bash ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh \
--video-uri <uri> --video-mime <mime> \
--audio-uri <uri> --audio-mime <mime> \
--prompt "<different description>"
- Outputs EditPlan JSON to stdout, progress to stderr
- Requires GEMINI_API_KEY, curl, and jq
- Gemini watches the video and listens to the audio simultaneously
- Use --no-cleanup + reuse mode for fast iteration with different prompts
Step 2: Validate the Plan
echo '<plan_json>' | bash ${CLAUDE_PLUGIN_ROOT}/scripts/validate-plan.sh
- Outputs {"valid": true, "errors": []} or {"valid": false, "errors": [...]}
- Exit code 0 = valid, 1 = invalid
Step 3: Build FFmpeg Filters
echo '<plan_json>' | bash ${CLAUDE_PLUGIN_ROOT}/scripts/build-filter.sh
- Outputs {"videoFilter": "...", "audioFilter": "...", "fullFilter": "..."}
- The fullFilter field is what goes into FFmpeg's -filter_complex argument
Step 4: Render with FFmpeg
ffmpeg -y -i "<video_path>" -i "<audio_path>" \
-filter_complex "<fullFilter>" \
-map "[outv]" -map "[outa]" \
-c:v libx264 -preset fast -crf 23 \
-c:a aac -shortest \
"<output_path>"
FFmpeg Filter Anatomy
For a 3-clip edit, the fullFilter looks like:
[0:v]trim=start=8.000:duration=2.000,setpts=PTS-STARTPTS[v0];
[0:v]trim=start=45.000:duration=1.500,setpts=PTS-STARTPTS[v1];
[0:v]trim=start=82.000:duration=3.000,setpts=PTS-STARTPTS[v2];
[v0][v1][v2]concat=n=3:v=1:a=0[outv];
[1:a]atrim=start=13.000:duration=6.500,asetpts=PTS-STARTPTS[outa]
- [0:v] = first input (video), [1:a] = second input (audio)
- trim extracts a segment, setpts=PTS-STARTPTS resets timestamps
- concat joins all video segments in order
- atrim extracts the audio section
For the full FFmpeg filter reference, see references/ffmpeg-filters.md.
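To make the filter layout concrete, here is a rough sketch that assembles a filter string of the same shape from values already converted to seconds. It is not the implementation of build-filter.sh: build_full_filter is a hypothetical name, and the "start,duration" argument format is an assumption chosen to keep the example free of JSON parsing.

```shell
# Hypothetical helper: build_full_filter AUDIO_START AUDIO_DURATION START,DURATION...
# All values are in seconds; each clip is one "start,duration" argument.
# The body runs in a subshell so its variables do not leak into the caller.
build_full_filter() (
  LC_ALL=C   # force "." as the decimal separator in printf output
  audio_start=$1
  audio_duration=$2
  shift 2
  n=0
  labels=""
  for clip in "$@"; do
    start=${clip%%,*}
    duration=${clip##*,}
    # One trim + setpts chain per clip, labelled [v0], [v1], ...
    printf '[0:v]trim=start=%.3f:duration=%.3f,setpts=PTS-STARTPTS[v%d];\n' \
      "$start" "$duration" "$n"
    labels="${labels}[v${n}]"
    n=$((n + 1))
  done
  # Join the clips in the order given, then cut the matching audio section.
  printf '%sconcat=n=%d:v=1:a=0[outv];\n' "$labels" "$n"
  printf '[1:a]atrim=start=%.3f:duration=%.3f,asetpts=PTS-STARTPTS[outa]\n' \
    "$audio_start" "$audio_duration"
)

# Clips at 00:08, 00:45 and 01:22 (8, 45 and 82 seconds), audio from 00:13 for 6.5s
build_full_filter 13 6.5 8,2 45,1.5 82,3
```

The clips are emitted in argument order, so a non-chronological EditPlan keeps its intended sequence.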
Troubleshooting
Duration mismatch error: Clip durations don't sum to audio_duration. Fix by adjusting the last clip's duration to absorb the difference, or re-run Gemini with a stricter prompt.
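The "absorb the difference" fix is simple arithmetic: the last clip's new duration is audio_duration minus the sum of all earlier clips. A minimal sketch, assuming the durations are passed as plain arguments (fix_last_duration is a hypothetical helper, not a plugin script):

```shell
# Hypothetical helper: fix_last_duration AUDIO_DURATION CLIP_DURATION...
# Prints the duration the last clip needs so that all clips sum to AUDIO_DURATION.
# Fails when the earlier clips alone already fill or exceed the audio window.
fix_last_duration() {
  if [ "$#" -lt 2 ]; then
    echo "usage: fix_last_duration AUDIO_DURATION CLIP_DURATION..." >&2
    return 1
  fi
  audio_duration=$1
  shift
  awk -v target="$audio_duration" 'BEGIN {
    others = 0
    # Sum every clip except the last one (the final argument).
    for (i = 1; i < ARGC - 1; i++) others += ARGV[i]
    fixed = target - others
    if (fixed <= 0) exit 1
    printf "%.3f\n", fixed
  }' "$@" || {
    echo "earlier clips already fill the audio window; re-plan instead" >&2
    return 1
  }
}

# Clips of 2.0 + 1.5 + 2.0 = 5.5 against a 6.5s audio window
fix_last_duration 6.5 2.0 1.5 2.0   # prints 3.000
```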
FFmpeg "Error" in stderr: FFmpeg writes progress and warnings to stderr. Only treat it as a real error if the output file wasn't created. Check for actual error patterns like No such file, Invalid data, or Conversion failed.
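One way to apply this rule in a script is to judge success by the output file and only then look at the log. The sketch below assumes FFmpeg's stderr was saved to a log file; render_succeeded is a hypothetical helper, and treating an empty output file as a failure is an added assumption.

```shell
# Hypothetical helper: render_succeeded OUTPUT_FILE FFMPEG_LOG
# Returns 0 when the output file exists and is non-empty. Otherwise prints any
# log lines matching the error patterns named above and returns 1.
render_succeeded() {
  output_file=$1
  log_file=$2
  if [ -s "$output_file" ]; then
    return 0
  fi
  echo "render failed: $output_file is missing or empty" >&2
  # Show the likely cause; "|| true" keeps a no-match from aborting under set -e.
  grep -E 'No such file|Invalid data|Conversion failed' "$log_file" >&2 || true
  return 1
}

# Usage sketch (paths are placeholders):
#   ffmpeg -y ... "<output_path>" 2> render.log
#   render_succeeded "<output_path>" render.log || echo "inspect render.log"
```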
Gemini returns poor clips: Add specificity to the prompt. Instead of "make an edit", say "make a fast 30-second action edit, cut every 1-2 seconds on the beat drops, start from the chorus".
Additional Resources
Reference Files
- references/ffmpeg-filters.md: Detailed FFmpeg filter_complex syntax, encoding options, common flags
- references/edit-plan-schema.md: Full EditPlan JSON schema, validation rules, edge cases
Scripts
- ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh: Upload to Gemini via REST API, get EditPlan (supports --no-cleanup and file reuse)
- ${CLAUDE_PLUGIN_ROOT}/scripts/validate-plan.sh: Validate EditPlan JSON
- ${CLAUDE_PLUGIN_ROOT}/scripts/build-filter.sh: Convert EditPlan to FFmpeg filters
- ${CLAUDE_PLUGIN_ROOT}/scripts/cleanup-gemini-files.sh: Delete uploaded files from Gemini when done iterating