alicloud-ai-audio-cosyvoice-voice-design
SKILL.md
Category: provider
Model Studio CosyVoice Voice Design
Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.
Critical model names
Use model="voice-enrollment" and one of these target_model values:
- cosyvoice-v3.5-plus
- cosyvoice-v3.5-flash
- cosyvoice-v3-plus
- cosyvoice-v3-flash
Recommended default in this repo:
target_model="cosyvoice-v3.5-plus"
Region and compatibility
- cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice clone/design.
- The target_model must match the model used later for speech synthesis.
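The region rules above can be checked before any request is sent. The following is a minimal sketch, not part of the skill's scripts: the region labels "cn" and "intl" and both function names are hypothetical, and it assumes that all four models support voice design on the Beijing endpoint, which the rules imply but do not state outright.

```python
# Hypothetical pre-flight check for the region rules listed above.
# "cn" and "intl" are illustrative labels, not values defined by the API.
MAINLAND_ONLY_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
V3_MODELS = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def check_voice_design_support(target_model: str, region: str) -> None:
    """Raise ValueError if target_model cannot be used for voice design in region."""
    if region not in ("cn", "intl"):
        raise ValueError(f"unknown region label: {region!r}")
    if target_model not in MAINLAND_ONLY_MODELS | V3_MODELS:
        raise ValueError(f"unknown target_model: {target_model!r}")
    if region == "intl":
        if target_model in MAINLAND_ONLY_MODELS:
            raise ValueError(
                f"{target_model} is available only in China mainland deployment mode"
            )
        raise ValueError(
            f"{target_model} does not support voice clone/design in "
            "international deployment mode"
        )


def check_synthesis_model(target_model: str, synthesis_model: str) -> None:
    """Raise ValueError if the later synthesis model differs from target_model."""
    if target_model != synthesis_model:
        raise ValueError(
            f"voice was designed for {target_model}, "
            f"but synthesis uses {synthesis_model}"
        )
```

Under these assumptions, every listed model fails the check for "intl", so voice design effectively requires the Beijing endpoint.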
Endpoint
- Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
- International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
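As an illustration of how the two endpoints might be selected, here is a sketch that builds (but does not send) a POST request with the Python standard library. The function name and the "cn"/"intl" labels are hypothetical. The wire-format body is deliberately left to the caller, because this page only describes the normalized interface; consult references/api_reference.md for the actual payload shape.

```python
import json
import urllib.request

# URLs are copied from the endpoint list above; the dict keys are illustrative.
CUSTOMIZATION_URLS = {
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def build_customization_request(
    region: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Build an unsent POST request for the customization endpoint.

    The payload is passed through unchanged; its exact schema is not
    defined here and must come from the API reference.
    """
    url = CUSTOMIZATION_URLS[region]  # KeyError for an unknown region label
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the request easy to inspect or log first.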
Prerequisites
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
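A possible lookup order for the key is sketched below: the environment variable first, then the credentials file. The function name is hypothetical, and the file format is an assumption: the sketch scans for a `dashscope_api_key = value` line and ignores section headers and comments, which may not match how the skill's own scripts read the file.

```python
import os
from pathlib import Path
from typing import Optional


def load_dashscope_api_key(credentials_path: Optional[Path] = None) -> str:
    """Return the API key from the environment or, failing that, the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blanks, comments, and INI-style section headers (assumed format).
            if not line or line.startswith(("#", ";", "[")):
                continue
            name, sep, value = line.partition("=")
            if sep and name.strip() == "dashscope_api_key":
                value = value.strip().strip("\"'")
                if value:
                    return value

    raise RuntimeError(
        "No API key found: set DASHSCOPE_API_KEY or add dashscope_api_key "
        "to ~/.alibabacloud/credentials"
    )
```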
Normalized interface (cosyvoice.voice_design)
Request
- model (string, optional): fixed to voice-enrollment
- target_model (string, optional): default cosyvoice-v3.5-plus
- prefix (string, required): letters/digits only, max 10 chars
- voice_prompt (string, required): max 500 chars, Chinese or English only
- preview_text (string, required): max 200 chars, Chinese or English
- language_hints (array[string], optional): zh or en, and should match preview_text
- sample_rate (int, optional): e.g. 24000
- response_format (string, optional): e.g. wav
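The field constraints above lend themselves to a small validator. This is a sketch of the normalized request only, not the provider's wire format, and the function name is hypothetical. It assumes "letters/digits" means ASCII letters and digits and that length limits count Unicode code points; neither is confirmed by this page.

```python
import re
from typing import Any, Dict, List, Optional

ALLOWED_TARGET_MODELS = (
    "cosyvoice-v3.5-plus",
    "cosyvoice-v3.5-flash",
    "cosyvoice-v3-plus",
    "cosyvoice-v3-flash",
)


def build_design_request(
    prefix: str,
    voice_prompt: str,
    preview_text: str,
    target_model: str = "cosyvoice-v3.5-plus",
    language_hints: Optional[List[str]] = None,
    sample_rate: Optional[int] = None,
    response_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the fields and return a normalized request dict."""
    if target_model not in ALLOWED_TARGET_MODELS:
        raise ValueError(f"unsupported target_model: {target_model!r}")
    # Assumption: ASCII letters and digits only, 1 to 10 characters.
    if not re.fullmatch(r"[A-Za-z0-9]{1,10}", prefix):
        raise ValueError("prefix must be 1-10 letters/digits")
    if not voice_prompt or len(voice_prompt) > 500:
        raise ValueError("voice_prompt is required, max 500 chars")
    if not preview_text or len(preview_text) > 200:
        raise ValueError("preview_text is required, max 200 chars")
    if language_hints is not None:
        bad = [h for h in language_hints if h not in ("zh", "en")]
        if bad:
            raise ValueError(f"unsupported language_hints: {bad}")

    request: Dict[str, Any] = {
        "model": "voice-enrollment",
        "target_model": target_model,
        "prefix": prefix,
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
    }
    # Optional fields are included only when the caller sets them.
    if language_hints is not None:
        request["language_hints"] = list(language_hints)
    if sample_rate is not None:
        request["sample_rate"] = sample_rate
    if response_format is not None:
        request["response_format"] = response_format
    return request
```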
Response
- voice_id (string)
- request_id (string)
- status (string, optional)
Operational guidance
- Keep voice_prompt concrete: timbre, age range, pace, emotion, articulation, and scenario.
- If language_hints is used, it should match the language of preview_text.
- Designed voice names include a -vd- marker in the generated backend naming convention.
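Two of the points above can be approximated in code. The sketch below uses a deliberately crude heuristic (any CJK Unified Ideograph means "zh", otherwise "en") to compare language_hints against preview_text, plus a substring test for the -vd- marker. All function names are hypothetical, and the exact position of the marker inside a voice ID is not specified here, so the check only looks for its presence.

```python
from typing import List


def guess_language(text: str) -> str:
    """Return "zh" if the text contains a CJK Unified Ideograph, else "en".

    A rough heuristic for Chinese/English input only; it is not a
    general-purpose language detector.
    """
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":
            return "zh"
    return "en"


def hints_match_preview(language_hints: List[str], preview_text: str) -> bool:
    """Check that the guessed language of preview_text is among the hints."""
    return guess_language(preview_text) in language_hints


def looks_like_designed_voice(voice_id: str) -> bool:
    """Check for the -vd- marker mentioned in the guidance above."""
    return "-vd-" in voice_id
```

For instance, `hints_match_preview(["en"], "各位听众朋友,大家好")` is false under this heuristic, which would flag the mismatch before a request is made.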
Local helper script
Prepare a normalized request JSON:
python skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
--target-model cosyvoice-v3.5-plus \
--prefix announcer \
--voice-prompt "沉稳的中年男性播音员,低沉有磁性,语速平稳,吐字清晰。" \
--preview-text "各位听众朋友,大家好,欢迎收听晚间新闻。" \
--language-hint zh
Validation
mkdir -p output/alicloud-ai-audio-cosyvoice-voice-design
for f in skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/*.py; do
python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt is generated.
Output And Evidence
- Save artifacts, command outputs, and API response summaries under output/alicloud-ai-audio-cosyvoice-voice-design/.
- Include target_model, prefix, voice_prompt, and preview_text in the evidence file.
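One way to produce such an evidence file is sketched below. The file name evidence.json, the function name, and the JSON layout are all assumptions; the source only fixes the output directory and the four request fields that must be recorded.

```python
import json
from pathlib import Path
from typing import Any, Dict

OUTPUT_DIR = Path("output/alicloud-ai-audio-cosyvoice-voice-design")
REQUIRED_FIELDS = ("target_model", "prefix", "voice_prompt", "preview_text")


def write_evidence(
    request: Dict[str, Any],
    response: Dict[str, Any],
    output_dir: Path = OUTPUT_DIR,
    filename: str = "evidence.json",  # hypothetical name
) -> Path:
    """Write the required request fields and a response summary as JSON."""
    missing = [k for k in REQUIRED_FIELDS if k not in request]
    if missing:
        raise ValueError(f"request is missing evidence fields: {missing}")

    output_dir.mkdir(parents=True, exist_ok=True)
    evidence: Dict[str, Any] = {k: request[k] for k in REQUIRED_FIELDS}
    # Summarize only the normalized response fields; absent ones become null.
    evidence["response_summary"] = {
        k: response.get(k) for k in ("voice_id", "request_id", "status")
    }
    path = output_dir / filename
    # ensure_ascii=False keeps Chinese prompts readable in the file.
    path.write_text(
        json.dumps(evidence, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```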
References
- references/api_reference.md
- references/sources.md
Repository
cinience/alicloud-skills