avatar-video
Avatar Video
Create AI avatar videos with full control over avatars, voices, scripts, scenes, and backgrounds. Build single or multi-scene videos with exact configuration using HeyGen's /v2/video/generate API.
Authentication
All requests require the X-Api-Key header. Set the HEYGEN_API_KEY environment variable.
curl -X GET "https://api.heygen.com/v2/avatars" \
-H "X-Api-Key: $HEYGEN_API_KEY"
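The same authenticated call can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: `build_request` and `API_BASE` are hypothetical names introduced here, and the only facts taken from the text above are the `X-Api-Key` header, the `HEYGEN_API_KEY` variable, and the avatars URL.

```python
import os
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://api.heygen.com"


def build_request(path, api_key=None, method="GET"):
    """Return a urllib Request that carries the X-Api-Key header.

    build_request is a hypothetical helper, not part of any HeyGen SDK.
    The key is read from HEYGEN_API_KEY unless passed explicitly.
    """
    key = api_key or os.environ.get("HEYGEN_API_KEY")
    if not key:
        raise RuntimeError("HEYGEN_API_KEY is not set")
    return urllib.request.Request(
        API_BASE + path,
        headers={"X-Api-Key": key},
        method=method,
    )


# Building the request does not touch the network.
request = build_request("/v2/avatars", api_key="example-key")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes the header logic easy to check without spending a real API call.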
Tool Selection
If HeyGen MCP tools are available (mcp__heygen__*), prefer them over direct HTTP API calls — they handle authentication and request formatting automatically.
| Task | MCP Tool | Fallback (Direct API) |
|---|---|---|
| Check video status / get URL | mcp__heygen__get_video | GET /v2/videos/{video_id} |
| List account videos | mcp__heygen__list_videos | GET /v2/videos |
| Delete a video | mcp__heygen__delete_video | DELETE /v2/videos/{video_id} |
Video generation (POST /v2/video/generate) and avatar/voice listing are done via direct API calls — see reference files below.
Default Workflow
- List avatars - GET /v2/avatars → pick an avatar, preview it, note avatar_id and default_voice_id. See avatars.md
- List voices (if needed) - GET /v2/voices → pick a voice matching the avatar's gender/language. See voices.md
- Write the script - Structure scenes with one concept each. See scripts.md
- Generate the video - POST /v2/video/generate with avatar, voice, script, and background per scene. See video-generation.md
- Poll for completion - GET /v2/videos/{video_id} until status is completed. See video-status.md
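The voice-selection and generation steps of this workflow can be sketched as plain data handling. Everything below is illustrative: `pick_voice_id`, `build_scene`, and `build_generate_payload` are hypothetical helpers, and the field names (`video_inputs`, `character`, `voice`, `background`, `dimension`, `gender`) are assumptions about the request and response shapes that should be checked against video-generation.md, avatars.md, and voices.md. Only `avatar_id`, `default_voice_id`, and `test` are named in this document.

```python
import json


def pick_voice_id(avatar, voices):
    """Prefer the avatar's default voice; otherwise match on gender.

    Assumes avatar and voice records are dicts with a 'gender' key.
    """
    default = avatar.get("default_voice_id")
    if default:
        return default
    for voice in voices:
        if voice.get("gender") == avatar.get("gender"):
            return voice["voice_id"]
    raise LookupError("no voice matches the avatar's gender")


def build_scene(avatar_id, voice_id, text, background_color="#FFFFFF"):
    """One scene: avatar, voice, script text, and a solid background.

    The nested field names are assumed, not confirmed by this document.
    """
    return {
        "character": {"type": "avatar", "avatar_id": avatar_id},
        "voice": {"type": "text", "voice_id": voice_id, "input_text": text},
        "background": {"type": "color", "value": background_color},
    }


def build_generate_payload(scenes, width=1280, height=720, test=True):
    """Assemble the body for POST /v2/video/generate (shape assumed)."""
    if not scenes:
        raise ValueError("at least one scene is required")
    return {
        "video_inputs": list(scenes),
        "dimension": {"width": width, "height": height},
        "test": test,  # test mode, as described under Best Practices
    }


# Placeholder records; real ones come from GET /v2/avatars and GET /v2/voices.
avatar = {"avatar_id": "avatar_a", "default_voice_id": None, "gender": "female"}
voices = [
    {"voice_id": "voice_m", "gender": "male"},
    {"voice_id": "voice_f", "gender": "female"},
]
voice_id = pick_voice_id(avatar, voices)
scenes = [
    build_scene(avatar["avatar_id"], voice_id, "Welcome to the demo."),
    build_scene(avatar["avatar_id"], voice_id, "Thanks for watching.", "#000000"),
]
print(json.dumps(build_generate_payload(scenes), indent=2))
```

One scene per concept maps onto one entry in the scene list, so a multi-scene video is just a longer list with per-scene backgrounds.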
Quick Reference
| Task | Read |
|---|---|
| List and preview avatars | avatars.md |
| List and select voices | voices.md |
| Write and structure scripts | scripts.md |
| Generate video (single or multi-scene) | video-generation.md |
| Add custom backgrounds | backgrounds.md |
| Add captions / subtitles | captions.md |
| Add text overlays | text-overlays.md |
| Create transparent WebM video | video-generation.md (WebM section) |
| Use templates | templates.md |
| Create avatar from photo | photo-avatars.md |
| Check video status / download | video-status.md |
| Upload assets (images, audio) | assets.md |
| Use with Remotion | remotion-integration.md |
| Set up webhooks | webhooks.md |
When to Use This Skill vs Create Video
This skill is for precise control — you choose the avatar, write the exact script, configure each scene.
If the user just wants to describe a video idea and let AI handle the rest (script, avatar, visuals), use the create-video skill instead.
| User Says | Create Video Skill | This Skill |
|---|---|---|
| "Make me a video about X" | ✓ | |
| "Create a product demo" | ✓ | |
| "I want avatar Y to say exactly Z" | | ✓ |
| "Multi-scene video with different backgrounds" | | ✓ |
| "Transparent WebM for compositing" | | ✓ |
| "Use this specific voice for my script" | | ✓ |
| "Batch generate videos with exact specs" | | ✓ |
Reference Files
Core Video Creation
- references/avatars.md - Listing avatars, styles, avatar_id selection
- references/voices.md - Listing voices, locales, speed/pitch
- references/scripts.md - Writing scripts, pauses, pacing
- references/video-generation.md - POST /v2/video/generate and multi-scene videos
Video Customization
- references/backgrounds.md - Solid colors, images, video backgrounds
- references/text-overlays.md - Adding text with fonts and positioning
- references/captions.md - Auto-generated captions and subtitles
Advanced Features
- references/templates.md - Template listing and variable replacement
- references/photo-avatars.md - Creating avatars from photos
- references/webhooks.md - Webhook endpoints and events
Integration
- references/remotion-integration.md - Using HeyGen in Remotion compositions
Foundation
- references/video-status.md - Polling patterns and download URLs
- references/assets.md - Uploading images, videos, audio
- references/dimensions.md - Resolution and aspect ratios
- references/quota.md - Credit system and usage limits
Best Practices
- Preview avatars before generating - Download preview_image_url so the user can see the avatar before committing
- Use avatar's default voice - Most avatars have a default_voice_id pre-matched for natural results
- Fallback: match gender manually - If no default voice, ensure avatar and voice genders match
- Use test mode for development - Set test: true to avoid consuming credits (output will be watermarked)
- Set generous timeouts - Video generation often takes 5-15 minutes, sometimes longer
- Validate inputs - Check avatar and voice IDs exist before generating
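The timeout advice can be made concrete with a small polling loop. This is a sketch under stated assumptions: `wait_for_video` is a hypothetical helper, `fetch_status` stands in for whatever function wraps GET /v2/videos/{video_id} and returns the status string, and the `failed` status is assumed (only `completed` is named in this document). The 30-minute default is an illustrative choice that leaves headroom over the 5-15 minute range mentioned above.

```python
import time


def wait_for_video(fetch_status, video_id, timeout_s=1800, interval_s=15,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the video is completed, fails, or the timeout expires.

    fetch_status is a caller-supplied callable (hypothetical here) that
    takes a video_id and returns its current status string.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(video_id)
        if status == "completed":
            return status
        if status == "failed":  # assumed terminal status, see video-status.md
            raise RuntimeError(f"video {video_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"video {video_id} not ready after {timeout_s}s")
        sleep(interval_s)


# Demonstration with a canned status sequence instead of real API calls.
canned = iter(["processing", "processing", "completed"])
result = wait_for_video(lambda _id: next(canned), "example-id",
                        sleep=lambda seconds: None)
print(result)
```

Injecting `sleep` and `clock` keeps the loop testable without waiting in real time; in normal use the defaults apply and only `fetch_status` and `video_id` need to be supplied.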