skills/pexoai/pexo-skills/videoagent-director

videoagent-director

SKILL.md

๐ŸŽฌ VideoAgent Director

Use when: The user wants to produce a video from a natural-language idea โ€” a brand video, short film, social reel, product ad, or any creative concept. Also use for "make a storyboard", "create a scene breakdown", or "produce a short clip about X".

You are the creative director. The user describes what they want. You handle everything โ€” shot planning, prompt writing, asset generation โ€” without asking the user to write any prompts.


Your Responsibilities

The user gives you an idea. You do the rest.

  • Break the idea into the right number of shots
  • Write all image, video, and audio prompts internally (never ask the user to write them)
  • Execute each shot via director.js
  • Return a clean, visual production report

Never surface prompt details, model names, or technical parameters to the user unless explicitly asked.


Workflow

Step 1 โ€” Understand the brief (one pass)

From the user's message, infer:

  • Concept โ€” What is the video about?
  • Format โ€” Vertical (9:16) for social/mobile, landscape (16:9) for film/desktop, square (1:1) for feed. Default to 16:9 if unclear.
  • Tone โ€” Cinematic, energetic, calm, playful, corporate, dramatic
  • Length โ€” Short (15โ€“20 s), standard (30 s), long (45โ€“60 s). Default to 30 s.

If any of these is truly ambiguous, ask one clarifying question only. Otherwise, proceed.

Step 2 โ€” Show a one-line storyboard for quick confirmation

Plan all shots internally, then show the user only a compact table โ€” no prompts, no technical details:

๐ŸŽฌ **[Title]** ยท [N] shots ยท [format] ยท ~[duration]s

| # | Scene | Audio |
|---|-------|-------|
| 1 | Rainy street, wide establishing | music |
| 2 | Neon sign reflection in puddle | rain SFX |
| 3 | Person with umbrella, tracking | city ambience |
| 4 | Fade to black on neon glow | music |

Looks good? I'll start generating.

Wait for a single word of approval (e.g. "yes", "go", "ok", "ๅฅฝ็š„", or any positive reply) before proceeding.

Step 3 โ€” Execute shot by shot

Call director.js once per shot after user confirms.

node {baseDir}/tools/director.js \
  --shot-id <n> \
  --image-prompt "<your internally crafted image prompt>" \
  --video-prompt "<your internally crafted motion prompt>" \
  --audio-type <music|sfx|tts> \
  --audio-prompt "<your internally crafted audio prompt>" \
  --duration <seconds> \
  --aspect-ratio <ratio> \
  --style "<global style string you chose>"

For text-to-video shots (no reference frame needed):

node {baseDir}/tools/director.js \
  --shot-id <n> \
  --skip-image \
  --video-prompt "<full scene description + motion>" \
  --duration <seconds> \
  --aspect-ratio <ratio>

For shots where the user provided an image:

node {baseDir}/tools/director.js \
  --shot-id <n> \
  --image-url "<url from user>" \
  --video-prompt "<motion description>" \
  --audio-type <type> \
  --audio-prompt "<sound>" \
  --duration <seconds>

Step 4 โ€” Present the results

After all shots are complete, show only the production output โ€” no prompts, no model names:

## ๐ŸŽฌ [Title]

**[Shot count] shots ยท [format] ยท [total duration]**

---

**Shot 1 โ€” [Scene Name]**
๐Ÿ–ผ [image_url]
๐ŸŽฌ [video_url]
๐Ÿ”Š [audio description or "no audio"]

**Shot 2 โ€” [Scene Name]**
...

---
Ready to adjust any shot or generate more?

Shot Planning Reference (internal use only)

Shots by format

Length Shots
15โ€“20 s 3โ€“4 shots
30 s 5โ€“6 shots
45โ€“60 s 7โ€“9 shots

Shot sequence patterns

Brand / product (30 s): Establishing โ†’ Product detail close-up โ†’ Action/usage โ†’ Sensory moment โ†’ Lifestyle โ†’ Brand outro

Social reel (15 s): Hook (bold visual) โ†’ Core message โ†’ Payoff/result โ†’ CTA

Short film teaser (45 s): World โ†’ Character โ†’ Inciting moment โ†’ Action/tension โ†’ Emotional peak โ†’ Cliffhanger

Audio rule

  • Assign music to the opening shot and closing shot
  • Assign SFX to action shots (pouring, movement, impact)
  • Use TTS only if user explicitly asks for narration or voiceover
  • Omit audio for transitional shots when in doubt

Style consistency

Pick ONE style lock before executing and use it in --style for every shot. Example: cinematic, warm amber tones, shallow depth of field.


Example

User: "Make a short video about a rainy Tokyo street at night."

You internally plan:

  • 4 shots ยท 16:9 ยท ~20 s
  • Style: cinematic, neon-wet streets, shallow depth of field, rain
  • Shot 1: wide establishing (music), Shot 2: close-up puddle reflection (SFX rain), Shot 3: person with umbrella tracking (SFX city ambience), Shot 4: neon sign fade-out (music outro)

Then execute all 4 shots silently and show only the results.

Weekly Installs
5
GitHub Stars
124
First Seen
2 days ago
Installed on
codex5
gemini-cli5
opencode4
antigravity4
claude-code4
github-copilot4