Compose

This atom owns the full video assembly pipeline. You receive raw outputs from the capture stage and produce a single polished artifact. Follow the pipeline below step by step.

Inputs

The command or capture stage should have provided a handoff with two sections:

Mechanical (structured)

clips: paths to raw recordings (.cast, .mp4, .webm, .png)
driver: tuistory | true-input | agent-browser
layout: single | side-by-side
labels: text for each clip (e.g., "BEFORE (dev)", "AFTER (PR)")
speed: multiplier (default 3x)
fidelity: auto | compact | standard | inspect (optional; auto => side-by-side=inspect, single=standard)
title: text for the title card
subtitle: one-sentence summary
sections: text banners for chapters [{t, title}] (optional)
keys: keystroke events [{t, label, dur?}] (if overlay requested)
showcase: preset name -- macos, minimal, hero, presentation, factory, factory-hero
effects tier: utilitarian | full | none (see "Choosing effects at compose time" below)
output: desired output path

Creative (natural language)

Free-text guidance on what to emphasize: which moments to hold, what the title card should convey, whether phase cards are warranted, how to trim dead time. Use this -- along with the effects tier -- for editorial decisions, including choosing specific effects to apply.

Pipeline

1. Build props   →  construct the Showcase JSON props
2. Render        →  render-showcase.sh (converts .cast, stages clips, renders, cleans up)
3. Finalize      →  verify and output

Remotion handles all compositing in a single pass — title cards, transitions, window chrome, backgrounds, keystroke overlays, spotlights, particles, noise, and color grading are all automatic. You construct the props JSON; the engine does the rest.

Remotion project & helper script

REMOTION_DIR=${DROID_PLUGIN_ROOT}/remotion
RENDER=${DROID_PLUGIN_ROOT}/scripts/render-showcase.sh

Showcase mode vs Demo mode

Both use the same Remotion pipeline but target different visual registers.

	Showcase	Demo
Goal	Cinematic, high-polish marketing material	Clear, utilitarian demonstration — single or comparison, whichever the story calls for
Preset	`factory`, `factory-hero`, or `hero`	`macos`, `minimal`, or `presentation`
Effects tier	Full -- spotlight, zoom, callout, keystroke overlay. Go all out.	Utilitarian -- zoom for readability, keystroke overlay for user actions
Audience	External — landing pages, social, marketing	Internal — PR reviews, docs, QA

Decision rule: If the video will be seen outside the eng team, use Showcase mode. If it's for a PR description, internal demo, or documentation embed, use Demo mode. The visual polish layers (warm glow, particles, color grade, motion blur) are always present but their intensity is palette-driven — Factory presets produce rich cinematic warmth while Catppuccin presets stay subtle and cool.

Choosing effects at compose time

The command stage committed an effects tier (utilitarian, full, or none). Now that you have actual recordings, choose specific effects:

Utilitarian: Add zoom effects for any small or hard-to-read text. Add keystroke overlay if user actions were captured. Skip spotlight and callout unless something is genuinely hard to find on screen.
Full: Use the full palette. Spotlight the key proof points. Zoom into details. Add callout annotations where the UI isn't self-explanatory. Layer keystroke overlay throughout. Aim for cinematic -- the viewer should feel guided, not left to scan.
None: Pass "effects": [] in props. Keystroke overlay is still allowed if committed separately.

Step 1: Choose fidelity and pacing

render-showcase.sh auto-selects inspect for side-by-side and standard for single-clip layouts when fidelity is omitted or set to auto.

Fidelity	Default output size	Remotion encode	Polish overlays	Best for
`compact`	1920x1080	H.264 CRF 21, JPEG frames	full grain + grade	Small embeds
`standard`	1920x1080	H.264 CRF 18, JPEG frames	reduced grain + grade	Single-panel demos
`inspect`	2560x1440	H.264 CRF 14, PNG frames	minimal grain + grade	Side-by-side comparisons / tiny text

.cast conversion behavior

render-showcase.sh converts .cast inputs through agg -> gif -> ffmpeg -> mp4 before Remotion render, using the asciicast's own cols/rows and fixed font metrics so element positions remain stable across fidelity profiles.

CRITICAL: agg replaces ALL 16 ANSI colors with its theme palette. The render script uses a custom Droid CLI theme. If you manually run agg, never omit --theme and never use built-in themes like monokai or dracula.

The --theme flag accepts a comma-separated hex string (no # prefix): bg,fg,color0..color7 (10 values) or bg,fg,color0..color7,color8..color15 (18 values for bright variants).

Note: tuistory recordings of the Droid CLI typically emit NO color escape codes -- the CLI uses Ink's direct rendering which doesn't produce standard ANSI SGR sequences in the cast output. The theme's bg (first value: 181818) and fg (second value: e0d0c0, warm white) are the only colors that will affect the output. The warm-white fg avoids the cold blue-grey look of default themes.

For other terminals that DO emit ANSI color codes, build the full theme string from the terminal's actual color settings.

Pacing: Target the final video duration, not a speed factor. A blind multiplier either makes text illegible or leaves dead air.

Demo type	Target duration	Why
Single feature, 3-5 steps	30-45s	Viewer watches the whole thing in one breath
Before/after comparison, side-by-side	45-75s	Each panel needs time to land; frozen-vs-active contrasts need a beat
Multi-phase or complex flow	60-120s	Phase cards give the viewer reset points; rushing defeats the purpose

Set the speed prop to hit the target: if the raw recording is 3 minutes and the target is 60s, use "speed": 3. If it's already 40s raw, use "speed": 1 or trim dead time instead. Trim first, speed second -- cut LLM thinking pauses, build waits, and network delays from the .cast with asciinema cut or by splitting segments, then apply a gentle speed-up only if still over target.

Keystroke timing adjustment: If a keystroke list was emitted during capture, its timestamps are in raw recording time. When you apply a speed multiplier, you must divide every timestamp by the speed factor before passing it to the Remotion props. A keystroke at raw t=6.0s in a 3x video should appear at t=2.0s.

Non-.cast clips

.mp4, .webm, and .png clips are passed through to Remotion unchanged except for staging into public/. Re-encode non-.cast clips manually only if their pixel format or dimensions are invalid.

Clip aspect ratio (mandatory check for browser captures)

At the default 1920×1080 output with factory preset margins, panels come out roughly:

Layout	Panel aspect
`single`	~1760×920 (≈16:9 landscape)
`side-by-side`	~872×920 per panel (≈8:9, near-square / slight portrait)

.cast conversions target panel aspect automatically. Pre-recorded .mp4 / .webm clips do not — if the clip aspect doesn't match the panel aspect, the clip will letterbox (with the default objectFit: "contain") or crop (with "cover").

Common pitfall: browser captures are typically 16:9 landscape (e.g. 1280×720). Dropped into a side-by-side layout they render as a thin band with giant black bars above and below.

Two fixes, in priority order:

Re-capture at a panel-friendly viewport — go back to the capture stage and set viewport to ~960×1000 for side-by-side, ~1280×720 for single.
Pass "objectFit": "cover" in props — crops the clip edges to fill the panel. Acceptable when the relevant UI is centered and edges are expendable. Not acceptable if cropped content matters (e.g. sidebar UI cut off).

.cast clips rarely need this since their rendered aspect is derived from cols/rows; it's almost always a browser-capture concern.

Duration checkpoint (mandatory, before proceeding)

Check whether the planned speed factor produces a final duration within the pacing table's target range:

final_duration = clip_duration / speed_factor

If final_duration is...	Action
Within the target range	Proceed
Below the minimum	Reduce the speed factor until the target is met. If even at 1x the clip is below the minimum, the recording is too short — return to capture and add more interaction steps.
Above the maximum	Trim dead time first (`asciinema cut`), then increase the `speed` prop

This checkpoint is not optional. A video that lands outside the target range fails verification.

Step 2: Build props

Choose layout

Default: single. One clip of the target/final state. New features, bug-fix proofs, walkthroughs, and README heroes all belong here.

Use side-by-side only when the story is fundamentally a comparison: regression (broken vs fixed), behavior-preserving refactor, or an explicit user request. Never fabricate a "before" clip to justify the side-by-side shape.

Save the showcaseSchema JSON to a temp file:

DEMO_TMP="$(mktemp -d /tmp/droid-demo-XXXXXX)"
PROPS="${DEMO_TMP}/showcase-props.json"

cat > "$PROPS" << 'EOF'
{
  "clips": ["demo.cast"],
  "layout": "single",
  "fidelity": "auto",
  "labels": [],
  "speed": 3,
  "title": "PR #11621 — Prevent session freezes",
  "subtitle": "Bash Mode blocks interactive commands and supports ESC cancellation",
  "preset": "factory",
  "keys": [
    {"t": 2.0, "label": "vim"},
    {"t": 5.5, "label": "sleep 100"},
    {"t": 8.0, "label": "Esc"}
  ],
  "sections": [],
  "effects": [],
  "speedNote": "3x speed",
  "windowTitle": "droid demo"
}
EOF

For a comparison flow, swap "clips" to two paths, "layout" to "side-by-side", and populate "labels" (e.g., ["BEFORE (main)", "AFTER (PR #11621)"]).

Use a run-scoped props path like $PROPS; do not reuse a global /tmp/showcase-props.json across rerenders or concurrent demos.

CRITICAL: clipDuration handling. The render script auto-detects clip duration via ffprobe when clipDuration is omitted from the props. If you set it manually, it must match the actual clip duration or you get blank frames (too long) or truncation (too short). When in doubt, omit it and let the script auto-detect.

Props reference

Prop	Type	Required	Description
`clips`	`string[]`	yes	Filenames (basenames only — the render script handles staging)
`layout`	`"single" \| "side-by-side"`	yes	Composition layout
`labels`	`string[]`	yes	Labels for each clip (visible in side-by-side; pass `[]` for single)
`fidelity`	`"compact" \| "standard" \| "inspect"`	no	Output quality/compression profile. Omit for auto-selection by layout.
`speed`	`number`	no	Playback speed for `.cast -> agg` conversion.
`title`	`string`	yes	Title card heading
`subtitle`	`string`	yes	Title card subheading
`preset`	preset name	yes	Visual preset — see table below
`keys`	`Keystroke[]`	yes	Keystroke overlay events (pass `[]` for none)
`sections`	`Section[]`	no	Section banners to mark chapters (pass `[]` for none)
`effects`	`Effect[]`	yes	Effect timeline (pass `[]` for none)
`clipDuration`	`number`	no	Clip duration in seconds. Auto-detected by render script if omitted.
`speedNote`	`string`	no	Shown on title card (e.g., `"3x speed"`)
`windowTitle`	`string`	no	Text in the window title bar
`width`	`number`	no	Output width (default: 2560 for inspect, else 1920)
`height`	`number`	no	Output height (default: 1440 for inspect, else 1080)
`objectFit`	`"contain" \| "cover" \| "fill"`	no	How each clip fits its panel. Default `"contain"` (letterbox to preserve aspect). Use `"cover"` when clip aspect doesn't match panel aspect and you'd rather crop than see black bars. See "Clip aspect ratio" below.
`codeAnnotations`	`CodeAnnotation[]`	no	Timed syntax-highlighted code overlays shown during the main content sequence. See "Code annotations" below.
`transitionStyle`	`"motion-blur" \| "flash" \| "whip-pan" \| "light-leak" \| "glitch-lite"`	no	Presentation used for title→content and content→outro transitions. Default `"motion-blur"` preserves existing aesthetic. See "Transition styles" below.

Preset quick reference

Preset	Look	Palette	Best for
`factory`	Warm black bg, traffic lights, 12px radius, 80px margin	Factory (warm)	Official Factory content
`factory-hero`	Same as factory + gradient bg	Factory (warm)	Factory landing pages, social
`hero`	Gradient bg, generous margins, prominent shadow	Catppuccin (cool)	Non-Factory marketing
`macos`	Dark bg, traffic lights, clean frame	Catppuccin (cool)	General-purpose demos
`presentation`	Black bg, generous margins	Catppuccin (cool)	Slide decks, talks
`minimal`	No window bar, tiny radius, tight margins	Catppuccin (cool)	Docs embeds, inline clips

Keystroke schema

{ t: number, label: string, dur?: number }

t: Time in seconds relative to clip start (not title card). Adjust for speed factor.
label: Display text (e.g., "Ctrl+C", "vim main.rs")
dur: Display duration in seconds (default: 1.2s). Auto-cut when next keystroke starts.

Section schema

{ "t": 2.0, "title": "Testing basic echo" }

t: Time in seconds relative to clip start (not title card). Adjust for speed factor.
title: Display text for the section header. Remains visible until the next section starts.

Effect types

Effect	Props	Description
`fade-in`	`t`, `dur`	Fade from black
`fade-out`	`t`, `dur`	Fade to black
`zoom`	`t`, `dur`, `to: {x,y,w,h}`	Directed zoom to a target region (30% in, 40% hold, 30% out)
`spotlight`	`t`, `dur`, `on: {x,y,w,h}`, `dim?`	Dim everything except a region (`dim`: 0–1, default 0.6)
`callout`	`t`, `dur`, `text`, `at: {x,y}`	Text overlay anchored to a point

Regions use percentage strings (e.g., "25%") relative to the video dimensions.

When to use effects

Effect	Use when…	Don't use when…
`spotlight`	Drawing attention to a specific region (error, status change)	The whole frame is relevant
`zoom`	Small text or detail that's hard to read at full scale	Content is already legible
`callout`	Annotating something the viewer might not recognize	The UI is self-explanatory
Keystroke overlay	Showing user actions (typing, key presses)	No user interaction in the clip

Less is more. One well-timed spotlight has more impact than five overlapping effects.

Code annotations

Timed syntax-highlighted code card laid over the captured video. Use for PR demos where the decisive source change needs to sit next to the runtime proof.

Field	Type	Required	Description
`t`	`number`	yes	Start time in seconds, relative to clip start. Adjust for `speed` factor (same rule as `keys[].t`).
`dur`	`number`	yes	How long the card stays visible, in seconds.
`code`	`string`	yes	Source text; `\n` for multiline; no trailing newline.
`language`	`string`	no	Prism language id (`tsx`, `ts`, `py`, `rust`, `bash`, ...). Default `tsx`.
`title`	`string`	no	Small caption above the code (usually a file path).
`highlight`	`[{start,end}]`	no	1-based inclusive line ranges with accent background + left border.
`focus`	`[{start,end}]`	no	1-based inclusive line ranges kept at full opacity; others are dimmed/blurred.
`position`	`"top-right" \| "center" \| "bottom-left"`	no	Default `"top-right"`. Move to `"bottom-left"` if the captured top-right is load-bearing.

Keep it short — aim for ≤ 15 lines per card, hold for 3–6 seconds.

Transition styles

transitionStyle selects the title→content and content→outro crossfade presentation. Both transitions in one render share the same style. flash and light-leak derive their tint from the preset palette. Default motion-blur is always safe; preset-tier guidance lives in showcase/SKILL.md.

Style	Feel	Use when…
`motion-blur`	Subtle dolly, blur + opacity crossfade	Default for PR demos, Factory content, most showcase work
`flash`	Quick palette-tinted flash at midpoint	Bug-fix proofs where the "after" state should feel sudden
`whip-pan`	Horizontal pan + motion blur	Energetic showcase / marketing when pacing is fast
`light-leak`	Warm gradient sweep	Factory-branded landing/social clips
`glitch-lite`	RGB channel offset + horizontal band	Security/vulnerability proof, terminal aesthetic; never default, never twice

Step 3: Render

Use the render script — it handles clip staging, duration detection, rendering, and cleanup:

RENDER=${DROID_PLUGIN_ROOT}/scripts/render-showcase.sh

# Basic render
$RENDER --props "$PROPS" --output /tmp/demo.mp4 \
  /tmp/before.cast /tmp/after.cast

# Or with inline props (useful for simple cases)
$RENDER --props-inline '{"clips":["clip.mp4"],"layout":"single","labels":[],"title":"Demo","subtitle":"Test","preset":"macos","keys":[],"effects":[]}' \
  --output /tmp/demo.mp4 /tmp/clip.mp4

The script:

Converts .cast inputs to .mp4 using the selected fidelity profile
Copies clip files into ${REMOTION_DIR}/public/
Auto-detects clipDuration via ffprobe if missing from props
Runs npx remotion render Showcase with profile-specific encode flags
Cleans up staged and generated clips

Quick frame check (sanity-check layout before full render):

cd ${REMOTION_DIR}
npx remotion still Showcase --props="$(cat "$PROPS")" --frame=30 --scale=0.5 /tmp/check.png

Render time: Expect ~1-3 minutes for a 30-60s video at 1920x1080. Set worker timeouts accordingly (5 minutes is safe).

Step 4: Finalize

Check the result:

ffprobe -v quiet -print_format json -show_format -show_streams /tmp/demo.mp4

Confirm:

Resolution is 1920x1080 (or matches the expected output)
Duration is reasonable (not 0s, not hours)
File size is manageable (under 5 MB for GitHub embeds, 25 MB hard limit)
Pixel format is yuv420p (universal playback)

Outputs

Hand to the verify stage:

## Compose outputs
- video: /tmp/demo-pr-11621.mp4
- resolution: 1920x1080
- duration: 42s
- size: 3.2 MB
- preset: factory
- keystrokes: 3 events overlaid
- effects: 1 spotlight
- engine: remotion

Screenshot-only artifacts (proofs, QA)

Not every deliverable is a video. For proof and QA workflows, compose may just organize screenshots and snapshots:

Annotated screenshot set

ffmpeg -y -i before.png -i after.png \
  -filter_complex "
    [0:v]scale=960:-1[left];
    [1:v]scale=960:-1[right];
    [left][right]hstack=inputs=2[out]" \
  -map "[out]" comparison.png

Markdown report with embedded evidence

For text-based deliverables, organize the evidence into a structured report rather than a video. The verify stage handles this.