aliyun-videoretalk

Category: provider

Model Studio VideoRetalk

Validation

mkdir -p output/aliyun-videoretalk
python -m py_compile skills/ai/video/aliyun-videoretalk/scripts/prepare_retalk_request.py && echo "py_compile_ok" > output/aliyun-videoretalk/validate.txt

Pass criteria: command exits 0 and output/aliyun-videoretalk/validate.txt is generated.

Output And Evidence

Save normalized request payloads, target face selection settings, and task polling snapshots under output/aliyun-videoretalk/.
Record the exact video/audio input URLs and whether video_extension was enabled.

Use VideoRetalk when the input is already a video of a person and the task is to re-sync the lip movements to a new speech track.

Critical model names

Use this exact model string:

videoretalk

Prerequisites

This model is currently available only in the China mainland (Beijing) region.
The API is HTTP and asynchronous only; there is no online console experience.
Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
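
The key lookup described above could be sketched as follows. This is a minimal illustration, not the skill's own loader: the function names are hypothetical, and the credentials file is assumed to be INI-style (a file without section headers is treated as a single implicit section), which this document does not confirm.

```python
import configparser
import os
from pathlib import Path


def key_from_credentials_text(text):
    """Extract dashscope_api_key from credentials text (assumed INI-style)."""
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
    except configparser.MissingSectionHeaderError:
        # Assumption: a header-less file is one implicit [default] section.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string("[default]\n" + text)
    # Check the special DEFAULT section first, then every named section.
    candidates = [parser.defaults()] + [parser[name] for name in parser.sections()]
    for section in candidates:
        value = section.get("dashscope_api_key", "").strip()
        if value:
            return value
    return None


def resolve_api_key(env=None, credentials_path=None):
    """Prefer DASHSCOPE_API_KEY from the environment, then the credentials file."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key
    path = Path(credentials_path or Path.home() / ".alibabacloud" / "credentials")
    if not path.is_file():
        return None
    return key_from_credentials_text(path.read_text(encoding="utf-8"))
```

Checking the environment first keeps per-shell overrides simple; the file is only read when the variable is unset or empty.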

Normalized interface (video.retalk)

Request

model (string, optional): default videoretalk
video_url (string, required)
audio_url (string, required)
ref_image_url (string, optional): reference image of the target face, for input videos that contain multiple faces
video_extension (bool, optional): extend the video to match audio that is longer than the video
query_face_threshold (int, optional): valid range 120 to 200
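
As an illustration of the request fields above, the sketch below models them as a dataclass and enforces only the constraints listed here (required URLs, the 120 to 200 range). The class name RetalkRequest and its methods are hypothetical and are not taken from prepare_retalk_request.py.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RetalkRequest:
    """Normalized video.retalk request; optional fields default to None."""
    video_url: str
    audio_url: str
    model: str = "videoretalk"
    ref_image_url: Optional[str] = None
    video_extension: Optional[bool] = None
    query_face_threshold: Optional[int] = None

    def validate(self) -> None:
        if not self.video_url or not self.audio_url:
            raise ValueError("video_url and audio_url are required")
        threshold = self.query_face_threshold
        if threshold is not None and not 120 <= threshold <= 200:
            raise ValueError("query_face_threshold must be between 120 and 200")

    def to_dict(self) -> dict:
        """Validate, then drop unset optional fields."""
        self.validate()
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Dropping unset fields keeps the saved payload limited to what the caller actually chose, which makes the stored request easier to audit later.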

Response

task_id (string)
task_status (string)
video_url (string, when finished)
usage (object, optional)
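
A sketch of mapping a raw task response onto the normalized fields above. The raw layout used here (task fields nested under "output", with "usage" at the top level) is an assumption about the provider's JSON, and normalize_response is a hypothetical helper name.

```python
def normalize_response(raw: dict) -> dict:
    """Flatten an assumed raw task response into the normalized shape."""
    output = raw.get("output") or {}
    result = {
        "task_id": output.get("task_id"),
        "task_status": output.get("task_status"),
    }
    # video_url is only expected once the task has finished.
    if output.get("video_url"):
        result["video_url"] = output["video_url"]
    # usage is optional and passed through unchanged when present.
    if raw.get("usage"):
        result["usage"] = raw["usage"]
    return result
```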

Endpoint and execution model

Submit task: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/video-synthesis/
Poll task: GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}
HTTP calls are async only and must set header X-DashScope-Async: enable.
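
The submit-then-poll flow might look like the sketch below, using only the standard library. The two URLs and the X-DashScope-Async header come from this section; everything else is an assumption to check against the official API reference: the Bearer authorization scheme, the input/parameters split of the wire body, the location of task_id in the response, and the set of terminal status names. All function names are hypothetical.

```python
import json
import time
import urllib.request

SUBMIT_URL = "https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/video-synthesis/"
TASK_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
# Assumed terminal task_status values; not confirmed by this document.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def build_body(video_url, audio_url, ref_image_url=None, **parameters):
    """Assumed wire format: media URLs under "input", options under "parameters"."""
    media = {"video_url": video_url, "audio_url": audio_url}
    if ref_image_url:
        media["ref_image_url"] = ref_image_url
    return {"model": "videoretalk", "input": media, "parameters": parameters}


def build_submit_request(body, api_key):
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
            "X-DashScope-Async": "enable",
        },
    )


def build_poll_request(task_id, api_key):
    return urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def run_task(body, api_key, interval=5.0, timeout=600.0):
    """Submit the task, then poll until a terminal status or the timeout."""
    with urllib.request.urlopen(build_submit_request(body, api_key)) as resp:
        task_id = json.load(resp)["output"]["task_id"]  # assumed response layout
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(build_poll_request(task_id, api_key)) as resp:
            snapshot = json.load(resp)
        if snapshot.get("output", {}).get("task_status") in TERMINAL_STATUSES:
            return snapshot
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout} seconds")
```

Separating request construction from the network call makes the headers and body easy to inspect or save as evidence before anything is sent.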

Quick start

python skills/ai/video/aliyun-videoretalk/scripts/prepare_retalk_request.py \
  --video-url "https://example.com/talking-head.mp4" \
  --audio-url "https://example.com/new-voice.wav" \
  --video-extension

Operational guidance

Keep input videos front-facing and close enough for stable face tracking.
If the video contains multiple faces, provide ref_image_url to anchor the intended target.
If the new audio is longer than the input video, decide explicitly whether to extend the picture track or truncate the audio.
URLs must be public HTTP/HTTPS links; local file paths are not accepted by the API.
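
A small pre-flight check for the URL rule above could look like this. It only inspects the URL's form (scheme and host); it cannot tell whether a link is actually reachable from the public internet. The helper names are hypothetical.

```python
from urllib.parse import urlparse


def is_remote_http_url(url: str) -> bool:
    """True when the value is an http(s) URL with a host, not a local path."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_inputs(video_url, audio_url, ref_image_url=None):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for name, value in (("video_url", video_url), ("audio_url", audio_url)):
        if not is_remote_http_url(value or ""):
            problems.append(f"{name} must be an HTTP/HTTPS URL, got {value!r}")
    if ref_image_url is not None and not is_remote_http_url(ref_image_url):
        problems.append(f"ref_image_url must be an HTTP/HTTPS URL, got {ref_image_url!r}")
    return problems
```

Running such a check before submission catches local paths and file:// links early, instead of waiting for the asynchronous task to fail.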

Output location

Default output: output/aliyun-videoretalk/request.json
Set OUTPUT_DIR to override the base directory.
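
One way to resolve the output path is sketched below. It assumes OUTPUT_DIR is an environment variable that replaces the leading output directory, so the skill subfolder and file name stay the same; the actual script may resolve it differently. The helper names are hypothetical.

```python
import json
import os
from pathlib import Path

SKILL_NAME = "aliyun-videoretalk"


def request_path(env=None):
    """Resolve <base>/aliyun-videoretalk/request.json; base defaults to "output"."""
    env = os.environ if env is None else env
    base = Path(env.get("OUTPUT_DIR") or "output")  # assumed override semantics
    return base / SKILL_NAME / "request.json"


def save_request(payload, env=None):
    """Write the normalized payload as pretty-printed JSON and return its path."""
    path = request_path(env)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(payload, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return path
```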

References

references/sources.md

Repository

cinience/alicloud-skills
