skills/cinience/alicloud-skills/alicloud-ai-audio-asr-realtime

alicloud-ai-audio-asr-realtime

SKILL.md

Category: provider

Model Studio Qwen ASR Realtime

Validation

mkdir -p output/alicloud-ai-audio-asr-realtime
python -m py_compile skills/ai/audio/alicloud-ai-audio-asr-realtime/scripts/prepare_realtime_asr_request.py && echo "py_compile_ok" > output/alicloud-ai-audio-asr-realtime/validate.txt

Pass criteria: command exits 0 and output/alicloud-ai-audio-asr-realtime/validate.txt is generated.

Output And Evidence

  • Save session payloads and response samples under output/alicloud-ai-audio-asr-realtime/.

Critical model names

Use one of these exact model strings:

  • qwen3-asr-flash-realtime
  • qwen3-asr-flash-realtime-2026-02-10

Use cases

  • Realtime subtitles and captions
  • Voice-agent duplex input
  • Streaming speech-to-text in browser or terminal clients

Prerequisites

  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
  • Realtime sessions generally require WebSocket or streaming session handling in the client.

Normalized interface (asr.realtime)

Request

  • model (string, optional): default qwen3-asr-flash-realtime
  • language_hints (array, optional)
  • format (string, optional): e.g. pcm, wav
  • sample_rate (int, optional): e.g. 16000
  • chunk_ms (int, optional): frame size in milliseconds

Response

  • text (string): recognized transcript fragment
  • is_final (bool): finalization marker
  • usage (object, optional)

Quick start

Generate a request template:

python skills/ai/audio/alicloud-ai-audio-asr-realtime/scripts/prepare_realtime_asr_request.py \
  --output output/alicloud-ai-audio-asr-realtime/request.json

Operational guidance

  • Prefer 16kHz mono PCM unless your client stack requires another format.
  • Keep chunks small enough for responsive partial results.
  • If you only have recorded files, use skills/ai/audio/alicloud-ai-audio-asr/ instead.

References

  • references/sources.md
Weekly Installs
26
GitHub Stars
354
First Seen
4 days ago
Installed on
gemini-cli25
github-copilot25
codex25
kimi-cli25
amp25
cline25