Edit Video

Conversational video editing in three phases: transcribe, plan the edit, render.

Prerequisites

  • bun — TypeScript runtime
  • ffmpeg / ffprobe — video processing (must be on PATH)

Environment variables (optional, have defaults):

| Variable | Purpose | Default |
| --- | --- | --- |
| AUDETIC_API_URL | Audetic transcription service URL | https://audio.audetic.link |
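
As a rough illustration of how an optional variable with a default might be resolved, here is a minimal sketch. The function name `resolveAudeticUrl` is hypothetical, and treating a blank value as unset is an assumption; the skill's own scripts may handle this differently.

```typescript
// Default value taken from the table above.
const DEFAULT_AUDETIC_API_URL = "https://audio.audetic.link";

// Hypothetical helper: returns the configured URL, or the default when the
// variable is unset or blank (blank handling is an assumption).
function resolveAudeticUrl(env: Record<string, string | undefined>): string {
  const value = env["AUDETIC_API_URL"]?.trim();
  return value ? value : DEFAULT_AUDETIC_API_URL;
}

// Typical call site in a script: resolveAudeticUrl(process.env)
```

To point a single run at another endpoint, the variable can be set inline in the shell before the command, for example AUDETIC_API_URL=http://localhost:8080 bun run scripts/transcribe.ts <video>.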

Commands

All paths are relative to this skill's directory. Run with bun run.

| Command | Purpose |
| --- | --- |
| scripts/transcribe.ts <video> | Transcribe a single video file; produces JSON + markdown transcript + analysis alongside the file |
| scripts/transcribe.ts <directory> | Transcribe all video files in a directory; produces merged JSON + markdown transcript + analysis in the directory |
| scripts/preview.ts <edl.json> | Validate and preview an EDL before rendering |
| scripts/render.ts <edl.json> | Render final video from EDL using ffmpeg stream copy |
| scripts/adjust.ts <edl.json> --segment N --start/--end <value> | Adjust timestamps or remove segments in an existing EDL |
| scripts/caption.ts <video> <transcript.json> | Burn Shorts-style captions into video (optional --edl, --output) |

Workflow — Single File

1. Transcribe

bun run scripts/transcribe.ts <video-file>

Compresses audio to MP3, uploads to the audetic transcription service, and produces three files alongside the video:

  • .json — transcription result (used by tools)
  • -transcript.md — readable transcript table
  • -analysis.md — transcript with signal flags (gaps, speech rate)

2. Plan the Edit

Read both the transcript and the analysis. The analysis provides mechanical signals:

| Flag | Meaning |
| --- | --- |
| gap:Xs | Silence of X seconds before this segment |
| slow | Fewer than 0.5 words/second (typing, long pauses, dead air) |
| silence | Zero words detected in segment |
| WPS column | Words per second (higher = denser speech) |

See references/signal-interpretation.md for detailed signal guidance.
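
To make the flags concrete, the sketch below derives them for one segment. It is not the analysis script itself: the `TranscriptSegment` shape, the `analyzeSegment` name, and the 1-second minimum gap are all assumptions (the source does not state how long a silence must be to earn a gap flag). Only the 0.5 words/second threshold comes from the table.

```typescript
// Assumed segment shape: times in seconds plus the spoken text.
interface TranscriptSegment {
  start: number;
  end: number;
  text: string;
}

const SLOW_WPS = 0.5; // from the flag table
const MIN_GAP_SECONDS = 1.0; // assumption: not specified in the source

// Hypothetical helper: computes words/second and the flags for one segment.
// prevEnd is the end time of the previous segment, or null for the first one.
function analyzeSegment(
  seg: TranscriptSegment,
  prevEnd: number | null,
): { wps: number; flags: string[] } {
  const words = seg.text.trim().split(/\s+/).filter((w) => w.length > 0);
  const duration = seg.end - seg.start;
  const wps = duration > 0 ? words.length / duration : 0;

  const flags: string[] = [];
  if (prevEnd !== null) {
    const gap = seg.start - prevEnd;
    if (gap >= MIN_GAP_SECONDS) flags.push(`gap:${gap.toFixed(1)}s`);
  }
  // Assumption: a segment with no words is reported as silence, not slow.
  if (words.length === 0) flags.push("silence");
  else if (wps < SLOW_WPS) flags.push("slow");

  return { wps, flags };
}
```

A 10-second segment containing three words that starts 3 seconds after the previous one ends would come out as gap:3.0s plus slow under these assumptions.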

Use these signals combined with your understanding of the content to decide what to keep/cut. The tool detects silence and pacing; you judge what's filler vs. substance.

When the edit plan is decided, write an EDL (Edit Decision List) JSON file:

{
  "output": "/absolute/path/to/output.mp4",
  "segments": [
    { "source": "/absolute/path/to/video.mp4", "start": "00:00:00.000", "end": "00:02:15.500", "label": "Introduction" },
    { "source": "/absolute/path/to/video.mp4", "start": "00:05:30.000", "end": "00:12:45.200", "label": "Main discussion" }
  ]
}
  • Each segment requires a source field with the absolute path to its source video
  • Timestamps must be HH:MM:SS.mmm format (ffmpeg-native)
  • Segments are in playback order — rearranging segments reorders the output
  • label is optional, helps communicate what each segment is
  • Only included segments appear in the output; everything else is cut

See references/edl-schema.md for the full schema reference.
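
The rules above can be sketched as a type plus a small validator. This is an illustration under assumptions, not the validation that preview.ts performs: the `Edl`/`EdlSegment` names, the `validateEdl` function, the POSIX-style absolute-path check, and the "end must be after start" rule are all hypothetical.

```typescript
interface EdlSegment {
  source: string; // absolute path to the source video
  start: string; // HH:MM:SS.mmm
  end: string; // HH:MM:SS.mmm
  label?: string; // optional
}

interface Edl {
  output: string;
  narrative_notes?: string;
  segments: EdlSegment[];
}

const TIMESTAMP = /^(\d{2}):([0-5]\d):([0-5]\d)\.(\d{3})$/;

// Converts HH:MM:SS.mmm to milliseconds, or null if the format is wrong.
function toMs(ts: string): number | null {
  const m = TIMESTAMP.exec(ts);
  if (!m) return null;
  return ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 1000 + Number(m[4]);
}

// Hypothetical validator: returns a list of human-readable problems.
function validateEdl(edl: Edl): string[] {
  const errors: string[] = [];
  if (edl.segments.length === 0) errors.push("EDL has no segments");
  edl.segments.forEach((seg, i) => {
    const n = i + 1; // 1-based, to match how segments are numbered elsewhere
    // Assumption: POSIX paths, so "absolute" means a leading slash.
    if (!seg.source.startsWith("/")) errors.push(`segment ${n}: source must be an absolute path`);
    const start = toMs(seg.start);
    const end = toMs(seg.end);
    if (start === null) errors.push(`segment ${n}: start is not HH:MM:SS.mmm`);
    if (end === null) errors.push(`segment ${n}: end is not HH:MM:SS.mmm`);
    if (start !== null && end !== null && end <= start) {
      errors.push(`segment ${n}: end must be after start`);
    }
  });
  return errors;
}
```

An empty array means the EDL passed these checks; it does not replace running preview.ts, which remains the tool's own validation step.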

3. Preview

bun run scripts/preview.ts <edl.json>

Always preview before rendering. Shows segment breakdown, kept/cut percentages, and validates the EDL.

3a. Adjust (Iterative Editing)

After previewing, use adjust.ts for targeted tweaks instead of rewriting the EDL:

# Trim 0.5s off the end of segment 1
bun run scripts/adjust.ts <edl.json> --segment 1 --end -0.5s

# Move the start of segment 2 earlier by 1 second
bun run scripts/adjust.ts <edl.json> --segment 2 --start -1s

# Extend the end of segment 3 by 2 seconds
bun run scripts/adjust.ts <edl.json> --segment 3 --end +2s

# Set end of segment 1 to an exact timestamp
bun run scripts/adjust.ts <edl.json> --segment 1 --end 00:05:30.000

# Remove segment 2 entirely
bun run scripts/adjust.ts <edl.json> --remove 2

Segment numbers are 1-based, matching the preview output. Relative deltas use +Xs/-Xs (supports decimals). Absolute values use HH:MM:SS.mmm format. The tool validates the result and prints an updated summary.
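
For intuition about the timestamp arithmetic that adjust.ts takes off your hands, here is a minimal sketch of applying a relative delta or an absolute value to one timestamp. The function names are hypothetical and the real script's parsing and error messages may differ.

```typescript
// Parses HH:MM:SS.mmm into milliseconds; throws on any other format.
function parseTimestamp(ts: string): number {
  const m = /^(\d{2}):([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(ts);
  if (!m) throw new Error(`not HH:MM:SS.mmm: ${ts}`);
  return ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 1000 + Number(m[4]);
}

// Formats milliseconds back into HH:MM:SS.mmm.
function formatTimestamp(ms: number): string {
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  const h = Math.floor(ms / 3_600_000);
  const min = Math.floor((ms % 3_600_000) / 60_000);
  const s = Math.floor((ms % 60_000) / 1000);
  return `${pad(h, 2)}:${pad(min, 2)}:${pad(s, 2)}.${pad(ms % 1000, 3)}`;
}

// Hypothetical helper: value is either a delta like "+2s" / "-0.5s"
// or an absolute HH:MM:SS.mmm timestamp.
function applyAdjustment(current: string, value: string): string {
  const delta = /^([+-])(\d+(?:\.\d+)?)s$/.exec(value);
  if (delta) {
    const sign = delta[1] === "-" ? -1 : 1;
    const shifted = parseTimestamp(current) + sign * Math.round(Number(delta[2]) * 1000);
    if (shifted < 0) throw new Error("adjustment moves the timestamp before 00:00:00.000");
    return formatTimestamp(shifted);
  }
  // Absolute value: validate and normalize it.
  return formatTimestamp(parseTimestamp(value));
}
```

Working in integer milliseconds avoids floating-point drift when several small deltas are applied in a row.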

Repeat preview + adjust as needed until the edit is right, then render.

4. Render

bun run scripts/render.ts <edl.json>

Uses ffmpeg stream copy (fast, cuts at nearest keyframe). Produces the final video.
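
The source does not show how render.ts invokes ffmpeg. One common way to do a stream-copy edit is to cut each segment with -ss/-to and -c copy, then join the parts with ffmpeg's concat demuxer; the sketch below only builds the argument lists for that approach. The function names and temp-file paths are hypothetical, and the actual script may use a different strategy.

```typescript
interface RenderSegment {
  source: string;
  start: string; // HH:MM:SS.mmm
  end: string; // HH:MM:SS.mmm
}

// Arguments for cutting one segment without re-encoding.
// -ss/-to placed before -i seek on the input; -c copy copies all streams.
function cutArgs(seg: RenderSegment, partPath: string): string[] {
  return ["-ss", seg.start, "-to", seg.end, "-i", seg.source, "-c", "copy", partPath];
}

// Contents of the list file read by the concat demuxer, one "file" line per part.
// Single quotes inside a path are escaped as '\'' per the demuxer's quoting rules.
function concatList(partPaths: string[]): string {
  return partPaths.map((p) => `file '${p.replace(/'/g, "'\\''")}'`).join("\n") + "\n";
}

// Arguments for joining the parts, again without re-encoding.
function concatArgs(listPath: string, output: string): string[] {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", output];
}
```

Each array would be passed to an ffmpeg process as its argument vector. Because nothing is re-encoded, the parts must share compatible codecs, which is why mixed sources deserve a second look in the preview.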

5. Caption (Optional)

bun run scripts/caption.ts <edited-video.mp4> <transcript.json> --edl <edl.json>

Burns bold, centered captions (Hormozi style) into the video. Use --edl to remap transcript times to the edited video's timeline. Requires a full re-encode.

See references/caption-style.md for style defaults and customization.
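
The remapping that --edl implies can be pictured as follows: a transcript time in the source video maps to the sum of the kept segments that play before it, plus its offset into its own segment, and times that fall in cut material have no position at all. This is a hypothetical sketch (the `KeptRange` shape and `remapToEditedTimeline` name are assumptions), not caption.ts itself.

```typescript
// One kept EDL segment, with times already converted to seconds.
interface KeptRange {
  source: string;
  startSec: number;
  endSec: number;
}

// Hypothetical helper: maps a time in a source file to the edited timeline.
// Returns null when that moment was cut. Ranges must be in playback order.
function remapToEditedTimeline(
  source: string,
  sourceSec: number,
  kept: KeptRange[],
): number | null {
  let offset = 0; // running duration of everything already played
  for (const range of kept) {
    const inside =
      range.source === source && sourceSec >= range.startSec && sourceSec < range.endSec;
    if (inside) return offset + (sourceSec - range.startSec);
    offset += range.endSec - range.startSec;
  }
  return null;
}
```

With kept ranges 0 to 135.5s and 330 to 765.2s from the same file, a word spoken at 340s in the source lands at 145.5s in the edit, while a word at 200s is dropped.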

Workflow — Multiple Clips (Directory)

Use this when the user has multiple short clips that should be edited into a single video.

1. Transcribe All Clips

bun run scripts/transcribe.ts <directory>

Finds all video files (*.mp4, *.mkv, *.mov, *.webm, *.ts), transcribes each one, and produces merged output in the directory:

  • transcript.json — merged transcript with source field on each segment
  • transcript.md — merged readable table with Source column
  • analysis.md — merged analysis with Source column (gap detection resets at clip boundaries)
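
As an illustration of what "merged, with a source field, and gaps reset at clip boundaries" could look like, here is a minimal sketch. The type names, the `mergeTranscripts` function, and the `gapBefore` field are hypothetical; the real transcript.json layout is documented by the tool, not here.

```typescript
interface ClipSegment {
  start: number; // seconds, relative to its own clip
  end: number;
  text: string;
}

interface MergedSegment extends ClipSegment {
  source: string; // which clip the segment came from
  gapBefore: number | null; // null for the first segment of each clip
}

// Hypothetical merge: concatenates per-clip segments in clip order and
// computes the silence before each segment without crossing clip boundaries.
function mergeTranscripts(
  clips: { source: string; segments: ClipSegment[] }[],
): MergedSegment[] {
  const merged: MergedSegment[] = [];
  for (const clip of clips) {
    let prevEnd: number | null = null; // reset for every clip
    for (const seg of clip.segments) {
      const gapBefore = prevEnd === null ? null : Math.max(0, seg.start - prevEnd);
      merged.push({ ...seg, source: clip.source, gapBefore });
      prevEnd = seg.end;
    }
  }
  return merged;
}
```

Resetting the previous end time per clip matters because timestamps restart at zero in each file, so a gap computed across two clips would be meaningless.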

2. Plan the Edit

Same process as single-file, but the transcript and analysis include a Source column showing which clip each segment came from. Write an EDL with per-segment source paths:

{
  "output": "/absolute/path/to/combined.mp4",
  "segments": [
    { "source": "/absolute/path/to/clip001.mp4", "start": "00:00:02.000", "end": "00:00:12.000", "label": "Opening" },
    { "source": "/absolute/path/to/clip003.mp4", "start": "00:00:00.000", "end": "00:00:08.500", "label": "Key moment" },
    { "source": "/absolute/path/to/clip007.mp4", "start": "00:00:01.000", "end": "00:00:14.000", "label": "Closing" }
  ]
}

3. Preview + Render

Same as single-file workflow. Preview lists all sources with durations and shows source filename per segment.

4. Caption (Optional)

Same as single-file — run caption.ts with --edl on the rendered output.

Narrative Editing

Use narrative editing when the user provides a goal beyond "trim the filler" — a theme, tone, target duration, or audience.

See references/narrative-patterns.md for the full pattern catalog.

Thinking process:

  1. Read transcript + analysis. Identify distinct moments / content beats.
  2. Decide which moments serve the stated narrative goal.
  3. Determine the best order — chronological is one option, but also consider: Hook-first, Escalation, Question-answer, Bookend.
  4. Label each segment with its narrative role.
  5. Sum segment durations to verify against target.

Labels as narrative roles: Use the label field to document function (e.g., "HOOK: the punchline", "SETUP: context", "PAYOFF: resolution").

Narrative notes: Use the optional narrative_notes field in the EDL to document editorial reasoning.

Example:

{
  "output": "/path/to/output.mp4",
  "narrative_notes": "Goal: 60s punchy clip. Led with the reaction for hook, then backed into the setup.",
  "segments": [
    { "source": "/path/to/video.mp4", "start": "00:05:30.000", "end": "00:05:55.000", "label": "HOOK: surprised reaction" },
    { "source": "/path/to/video.mp4", "start": "00:01:00.000", "end": "00:01:05.000", "label": "SETUP: reading the tweet" },
    { "source": "/path/to/video.mp4", "start": "00:06:00.000", "end": "00:06:30.000", "label": "PAYOFF: final take" }
  ]
}
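
Step 5 of the thinking process, summing segment durations against the target, is easy to get wrong by hand. A minimal sketch of the check follows; the function names and the 1-second default tolerance are assumptions.

```typescript
// Converts HH:MM:SS.mmm to integer milliseconds; throws on other formats.
function timestampToMs(ts: string): number {
  const m = /^(\d{2}):([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(ts);
  if (!m) throw new Error(`not HH:MM:SS.mmm: ${ts}`);
  return ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 1000 + Number(m[4]);
}

// Total playback time of the kept segments, in seconds.
function totalDurationSec(segments: { start: string; end: string }[]): number {
  const ms = segments.reduce(
    (sum, seg) => sum + (timestampToMs(seg.end) - timestampToMs(seg.start)),
    0,
  );
  return ms / 1000;
}

// Hypothetical check: how far the edit is from the target duration.
// A positive overBySec means the edit runs long.
function checkTarget(
  segments: { start: string; end: string }[],
  targetSec: number,
  toleranceSec = 1, // assumption: acceptable slack is not defined in the source
): { totalSec: number; overBySec: number; fits: boolean } {
  const totalSec = totalDurationSec(segments);
  const overBySec = totalSec - targetSec;
  return { totalSec, overBySec, fits: Math.abs(overBySec) <= toleranceSec };
}
```

For instance, segments of 25s, 20s, and 30s total 75s, which is 15s over a 60s target, so one of them needs trimming before rendering.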

Output File Naming

Single-file mode

The outputPaths function in scripts/lib/config.ts generates standard paths relative to the video:

| Output | Pattern |
| --- | --- |
| Transcript JSON | <name>.json |
| Transcript MD | <name>-transcript.md |
| Analysis MD | <name>-analysis.md |
| EDL | <name>-edl.json |
| Edited video | <name>-edited.mp4 |
| Captioned video | <name>-captioned.mp4 |
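
The naming patterns can be illustrated with a small stand-in. This is not the outputPaths function from scripts/lib/config.ts, whose signature and return shape are not shown in the source; `sketchOutputPaths` and its property names are hypothetical, and the path handling assumes POSIX-style separators.

```typescript
// Hypothetical stand-in that derives the documented file names from a video path.
function sketchOutputPaths(videoPath: string) {
  const slash = videoPath.lastIndexOf("/");
  const dir = slash >= 0 ? videoPath.slice(0, slash) : ".";
  const file = videoPath.slice(slash + 1);
  const dot = file.lastIndexOf(".");
  const name = dot > 0 ? file.slice(0, dot) : file; // strip the extension
  const at = (suffix: string) => `${dir}/${name}${suffix}`;
  return {
    transcriptJson: at(".json"),
    transcriptMd: at("-transcript.md"),
    analysisMd: at("-analysis.md"),
    edl: at("-edl.json"),
    editedVideo: at("-edited.mp4"),
    captionedVideo: at("-captioned.mp4"),
  };
}
```

For /videos/demo.mp4 this yields /videos/demo.json, /videos/demo-edl.json, /videos/demo-edited.mp4, and so on, all next to the original file.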

Directory mode

The directoryOutputPaths function generates paths inside the directory:

| Output | Pattern |
| --- | --- |
| Transcript JSON | transcript.json |
| Transcript MD | transcript.md |
| Analysis MD | analysis.md |

Session Flow

Single file

  1. User provides a video file path
  2. Run transcribe.ts on it
  3. Read the -analysis.md and -transcript.md files
  4. Discuss with user what to keep/cut (or accept a narrative goal)
  5. Write the EDL JSON file (each segment has source pointing to the video)
  6. Run preview.ts to validate; review with user
     6a. If adjustments needed, run adjust.ts for targeted tweaks (repeat preview + adjust as needed)
  7. Run render.ts to produce the final video
  8. (Optional) Run caption.ts with --edl if user wants Shorts-style captions
  9. Report output path and final duration

Multiple clips

  1. User provides a directory of clips
  2. Run transcribe.ts on the directory
  3. Read the merged analysis.md and transcript.md in the directory
  4. Discuss with user which clips/segments to include
  5. Write the EDL JSON file (each segment has source pointing to its clip)
  6. Run preview.ts to validate; review with user
     6a. If adjustments needed, run adjust.ts for targeted tweaks (repeat preview + adjust as needed)
  7. Run render.ts to produce the combined video
  8. (Optional) Run caption.ts with --edl if user wants Shorts-style captions
  9. Report output path and final duration

Tips

  • Keyframe imprecision: Stream copy can only cut at keyframes, so each cut may land up to ~0.5s away from the requested timestamp, and further on sources with sparse keyframes. This is the tradeoff for fast rendering without re-encoding.
  • Large videos: Transcription time scales with video length. For videos over 30 minutes, warn the user it may take a while.
  • Always preview first: Never render without previewing. The preview catches validation errors and lets the user confirm before committing.
  • Duration targeting: When given a target duration, sum the Dur column values from the analysis for selected segments. Iterate until the EDL fits.
  • Reinterpret signals: Gaps aren't just cut candidates — they mark topic boundaries. Slow segments aren't always boring — a pause before a realization can be dramatic.
  • Mixed codecs: When combining clips from different sources, the preview tool warns about mixed file extensions. Clips from the same device/app are usually safe.
  • Caption re-encoding: Burning captions requires a full video encode (not stream copy), so it takes longer than rendering. Mention this to the user before starting.
  • Iterative adjustments: After the initial EDL is created, use adjust.ts for minor tweaks like trimming or extending segments. It handles timestamp arithmetic so you don't need to rewrite the EDL for small changes.
  • Offer captions: When the user mentions Shorts, Reels, TikTok, or short-form content, offer to add Shorts-style captions after rendering.
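
The mixed-codecs tip relies on a file-extension heuristic. A minimal sketch of such a check is below; the function name and message wording are hypothetical, and the preview tool's real warning may look different. Note that matching extensions do not guarantee matching codecs, so this is only a first-pass signal.

```typescript
// Hypothetical check: returns a warning when EDL sources use more than one
// file extension, or null when they all share the same one.
function mixedExtensionWarning(sources: string[]): string | null {
  const extensions = new Set<string>();
  for (const source of sources) {
    const file = source.slice(source.lastIndexOf("/") + 1);
    const dot = file.lastIndexOf(".");
    extensions.add(dot > 0 ? file.slice(dot).toLowerCase() : "(none)");
  }
  if (extensions.size <= 1) return null;
  return `Mixed file extensions: ${Array.from(extensions).sort().join(", ")}`;
}
```

Extensions are compared case-insensitively so that clip.MP4 and clip.mp4 are not reported as a mix.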