video-agent
Video Agent - AI Content Generation Suite
A comprehensive AI content generation package providing a unified interface across 35+ models for image, video, and audio creation.
When to Use This Skill
- Text-to-image generation
- Image-to-image transformations
- Text-to-video creation
- Image-to-video animation
- Professional text-to-speech
- Multi-step content pipelines
- Batch content generation
Supported Providers
FAL AI
- FLUX models (text-to-image)
- Image transformations
- Fast inference
Google Vertex AI
- Imagen 4 (text-to-image)
- Veo (text-to-video)
- High quality outputs
ElevenLabs
- 20+ voice options
- Professional TTS
- Multiple languages
OpenRouter
- Access to various LLMs
- Text generation
- Content writing
Core Capabilities
Image Generation
Generate image:
Prompt: "A serene Japanese garden at sunset"
Model: flux-pro
Size: 1024x1024
Style: photorealistic
Available Models:
- FLUX Pro/Dev (FAL)
- Imagen 4 (Google)
- Stable Diffusion variants
Video Creation
Generate video:
Prompt: "Ocean waves crashing on rocky shore"
Model: veo
Duration: 5 seconds
Resolution: 1080p
Available Models:
- Google Veo
- MiniMax Hailuo
- Kling
Image-to-Video
Animate image:
Source: /path/to/image.png
Motion: "gentle zoom out with particle effects"
Duration: 4 seconds
Text-to-Speech
Generate audio:
Text: "Welcome to our product demo..."
Voice: professional-female-1
Speed: 1.0
Output: welcome.mp3
Voice Options:
- Professional male/female
- Casual conversational
- Narrator styles
- Multiple accents
Pipeline Orchestration
YAML Configuration
pipeline: product-demo
steps:
  - name: generate-logo
    type: image
    model: flux-pro
    prompt: "Modern tech logo for AI startup"
  - name: create-intro
    type: video
    model: veo
    prompt: "Logo animation reveal"
  - name: add-voiceover
    type: audio
    model: elevenlabs
    text: "Introducing the future of AI..."
    voice: professional-male
  - name: combine
    type: merge
    inputs: [create-intro, add-voiceover]
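The `inputs` field on the `combine` step implies an ordering constraint: a step can only run once the steps it lists have finished. How the package schedules steps internally is not documented here, so the following is only a minimal sketch of one way to derive a valid order, using the standard-library `graphlib` module (Python 3.9+). The `execution_order` helper and the dict layout are assumptions made for illustration, not part of the package's API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The YAML pipeline above, written as the plain dict a YAML loader would return.
# Only the fields needed for ordering are kept.
pipeline = {
    "pipeline": "product-demo",
    "steps": [
        {"name": "generate-logo", "type": "image"},
        {"name": "create-intro", "type": "video"},
        {"name": "add-voiceover", "type": "audio"},
        {"name": "combine", "type": "merge",
         "inputs": ["create-intro", "add-voiceover"]},
    ],
}


def execution_order(config):
    """Hypothetical helper: return step names so each step follows its inputs."""
    # Map every step to the set of steps it depends on (empty if no inputs).
    graph = {step["name"]: set(step.get("inputs", [])) for step in config["steps"]}

    # Reject references to steps that were never defined.
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"steps reference undefined inputs: {sorted(unknown)}")

    # static_order() yields dependencies before the steps that need them
    # and raises graphlib.CycleError if the inputs form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
print(order)
```

Steps with no dependency between them (here `generate-logo`, `create-intro`, and `add-voiceover`) may come out in any relative order; only the position of `combine` after its two inputs is guaranteed.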
JSON Configuration
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {
      "type": "image",
      "variants": 4,
      "prompt": "Product hero shot"
    }
  ]
}
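To make the `variants` field concrete, here is a small sketch that loads the JSON configuration above and expands each step into one job per variant. The `expand_jobs` helper and the shape of the job dicts are assumptions for illustration; the package may represent jobs differently.

```python
import json

# The JSON configuration from above, embedded as a string for the example.
config_text = """
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {"type": "image", "variants": 4, "prompt": "Product hero shot"}
  ]
}
"""


def expand_jobs(config):
    """Hypothetical helper: turn each step into one job per requested variant."""
    jobs = []
    for step in config["steps"]:
        # A step without "variants" is treated as a single job (an assumption).
        for number in range(1, step.get("variants", 1) + 1):
            jobs.append({
                "type": step["type"],
                "prompt": step["prompt"],
                "variant": number,
            })
    return jobs


config = json.loads(config_text)
jobs = expand_jobs(config)
print(f"{len(jobs)} jobs, parallel={config['parallel']}")
```

With `"parallel": true`, the four resulting jobs are independent of each other, which is what makes them candidates for concurrent execution.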
Cost Management
Real-time Estimation
Estimate cost for:
- 10 images (1024x1024)
- 2 videos (5 seconds)
- 1 audio (60 seconds)
Estimated: $2.45
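The arithmetic behind an estimate like this is a sum of quantity times unit price per content type. The sketch below shows that calculation only. The rates in it are hypothetical placeholders, chosen so the example reproduces the sample total above; they are not actual provider pricing, and the `estimate` function is not part of the package.

```python
from decimal import Decimal

# HYPOTHETICAL unit prices, picked only so the sample numbers add up.
# Real provider pricing will differ.
RATES = {
    "image": Decimal("0.05"),           # per 1024x1024 image
    "video_second": Decimal("0.15"),    # per second of generated video
    "audio_second": Decimal("0.0075"),  # per second of generated audio
}


def estimate(images, video_seconds, audio_seconds):
    """Hypothetical helper: sum quantity x rate and round to whole cents."""
    total = (
        images * RATES["image"]
        + video_seconds * RATES["video_second"]
        + audio_seconds * RATES["audio_second"]
    )
    return total.quantize(Decimal("0.01"))


# 10 images, 2 videos of 5 seconds each, 1 audio clip of 60 seconds.
total = estimate(images=10, video_seconds=2 * 5, audio_seconds=60)
print(f"Estimated: ${total}")  # prints "Estimated: $2.45"
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding errors.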
Budget Limits
budget:
  max_per_job: $5.00
  max_daily: $50.00
  alert_threshold: 80%
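One plausible reading of these three settings is sketched below: a job is rejected if it alone exceeds `max_per_job` or would push the day's spend past `max_daily`, and an alert is raised once projected daily spend reaches `alert_threshold` of the daily cap. Whether the threshold applies to the daily limit, the per-job limit, or both is not stated in this document, so treat that rule and the `check_budget` function as assumptions.

```python
from decimal import Decimal

# Values from the budget block above.
MAX_PER_JOB = Decimal("5.00")
MAX_DAILY = Decimal("50.00")
ALERT_THRESHOLD = Decimal("0.80")  # 80%, assumed to apply to the daily cap


def check_budget(job_cost, spent_today):
    """Hypothetical helper: decide whether a job fits within the budget."""
    if job_cost > MAX_PER_JOB:
        return "reject: exceeds max_per_job"

    projected = spent_today + job_cost
    if projected > MAX_DAILY:
        return "reject: exceeds max_daily"

    # Warn once projected spend reaches 80% of the daily cap ($40.00 here).
    if projected >= MAX_DAILY * ALERT_THRESHOLD:
        return "allow with alert"
    return "allow"


print(check_budget(Decimal("2.45"), Decimal("0.00")))   # prints "allow"
print(check_budget(Decimal("4.00"), Decimal("37.00")))  # prints "allow with alert"
print(check_budget(Decimal("6.00"), Decimal("0.00")))   # prints "reject: exceeds max_per_job"
```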
Performance Features
Parallel Execution
Generate 10 image variants in parallel
Threads: 4
Expected speedup: 2-3x
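The pattern described here can be sketched with the standard-library `ThreadPoolExecutor`. The `generate_variant` function is a stand-in that only sleeps to mimic a network-bound API call; it is not the package's generator, and the package's own concurrency mechanism is not documented here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def generate_variant(index):
    """Stand-in for a real generation call; sleeps to mimic API latency."""
    time.sleep(0.05)
    return f"variant-{index:02d}.png"


def generate_all(count, threads=4):
    """Run `count` generations on a pool of `threads` worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, even if calls finish out of order.
        return list(pool.map(generate_variant, range(1, count + 1)))


results = generate_all(10)
print(results[0], "...", results[-1])  # prints "variant-01.png ... variant-10.png"
```

With 10 jobs on 4 threads, the work completes in three rounds (4 + 4 + 2), so the ideal speedup over sequential execution is about 10/3, or roughly 3.3x. Scheduling overhead and provider rate limits pull the observed figure below that, which is consistent with the 2-3x estimate above.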
Caching
- Automatic prompt caching
- Reuse similar generations
- Reduce redundant API calls
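As an illustration of how prompt caching can avoid redundant API calls, the sketch below keys each generation on a hash of the model, the normalized prompt, and the parameters. It only treats prompts as "similar" when they match after lowercasing and whitespace cleanup; the package may use a broader notion of similarity. The `GenerationCache` class is hypothetical and not part of the package's API.

```python
import hashlib
import json


class GenerationCache:
    """Hypothetical in-memory cache for generation results."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, prompt, params):
        # Normalize case and whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.lower().split())
        payload = json.dumps(
            {"model": model, "prompt": normalized, "params": params},
            sort_keys=True,  # stable key regardless of parameter order
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_generate(self, model, prompt, params, generate):
        """Return a cached result, or call `generate` once and store its result."""
        key = self._key(model, prompt, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = generate(model, prompt, params)
        self._store[key] = result
        return result


def fake_generate(model, prompt, params):
    """Stand-in for a paid API call."""
    return f"{model}-output.png"


cache = GenerationCache()
cache.get_or_generate("flux-pro", "Sunset over mountains", {"size": 1024}, fake_generate)
cache.get_or_generate("flux-pro", "  sunset over  MOUNTAINS ", {"size": 1024}, fake_generate)
print(cache.hits, cache.misses)  # prints "1 1"
```

Including the model and parameters in the key matters: the same prompt at a different size or on a different model is a different generation and should not be served from the cache.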
CLI Commands
# Image generation
video-agent image "prompt" --model flux-pro --size 1024
# Video generation
video-agent video "prompt" --model veo --duration 5
# Audio generation
video-agent audio "text" --voice professional-female
# Pipeline execution
video-agent pipeline config.yaml
# Cost check
video-agent cost --estimate
Python API
from video_agent import ImageGenerator, VideoGenerator
# Generate image
img = ImageGenerator(model="flux-pro")
image_result = img.generate("sunset over mountains")
# Generate video
vid = VideoGenerator(model="veo")
video_result = vid.generate("timelapse of clouds")
Setup
1. Install Package
pip install video-agent-claude-skill
2. Configure API Keys
export FAL_API_KEY="your-key"
export GOOGLE_VERTEX_KEY="your-key"
export ELEVENLABS_API_KEY="your-key"
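Before running any generation, it can help to confirm that the three variables above are visible to the Python process. The following is a minimal sketch using only `os.environ`; the `missing_keys` helper is hypothetical and independent of the package's own status check.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_KEYS = ("FAL_API_KEY", "GOOGLE_VERTEX_KEY", "ELEVENLABS_API_KEY")


def missing_keys(environ=None):
    """Hypothetical helper: list required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All API keys are set.")
```

Passing a dict as `environ` makes the check easy to test without touching the real environment.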
3. Verify Setup
video-agent status
Use Cases
- Marketing: Product images, promo videos
- Social Media: Content at scale
- Education: Explainer videos, voiceovers
- Prototyping: Visual concepts, mockups
- Automation: Batch content pipelines
Credits
Created by donghaozhang. Licensed under MIT.