audio-transcription
Audio Transcription
Follow the shared public skill rules in:
postplus-shared public skill rules
Use this skill when the input is audio and the main job is:
- transcript generation
- subtitle-ready timing
- rough speech search
- multilingual audio transcription
This skill is not for video semantic understanding.
Hosted Endpoint
First-version hosted transcription endpoints:
- hosted transcription capability
  - transcription-whisper
  - transcription-whisper-turbo
Use transcription-whisper by default when subtitle quality matters.
Use transcription-whisper-turbo when:
- the user wants a cheaper rough pass
- timestamps are not the primary requirement
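The selection rule above can be sketched as a small helper. This is only an illustration: the endpoint names come from the text, but the function name and option names are hypothetical, and the text does not say whether one condition alone is enough to pick turbo. The sketch assumes, conservatively, that both must hold.

```javascript
// Endpoint names are taken from the text above.
const DEFAULT_ENDPOINT = "transcription-whisper";
const TURBO_ENDPOINT = "transcription-whisper-turbo";

// Hypothetical helper: `cheapRoughPass` and `timestampsPrimary` are
// illustrative option names, not part of any documented request schema.
function chooseTranscriptionEndpoint({
  cheapRoughPass = false,
  timestampsPrimary = true,
} = {}) {
  // Assumption: downgrade to turbo only when the user wants a cheaper rough
  // pass AND timestamps are not the primary requirement.
  if (cheapRoughPass && !timestampsPrimary) {
    return TURBO_ENDPOINT;
  }
  // Otherwise stay on the default, which the text prefers for subtitle quality.
  return DEFAULT_ENDPOINT;
}
```

Defaulting to transcription-whisper when the options are missing keeps subtitle-quality work on the safer endpoint.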
Output Contract
Persist:
- request.json
- response.json
- manifest.json
- downloaded provider outputs under outputs/
Do not rely on the provider dashboard as the durable record.
Poll Behavior
Hosted transcription is asynchronous. The script polls the prediction result URL until
status is completed or failed. Default poll window: 150 attempts × 2 s = 5 minutes.
Short audio clips typically complete in under 30 s. If a job exceeds 5 minutes, retry rather than increasing the timeout further.
Before submission, the script logs a polling preflight line from
durationSeconds. Audio at or above 300 seconds is marked as possibly exceeding
the current polling window. That warning is informational, not a hidden fallback:
the script still uses the same 5-minute poll contract and fails on timeout.
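The preflight and poll contract described above could look roughly like the sketch below. It is not the code in scripts/poll_transcription.mjs. The function names, the injected `fetchStatus` callback, and the shape of the returned objects are assumptions made for illustration. Only the 150 × 2 s window, the 300-second warning threshold, and the completed/failed statuses come from the text.

```javascript
// Values stated in the text: 150 attempts x 2 s = 5 minutes.
const MAX_ATTEMPTS = 150;
const INTERVAL_MS = 2000;
const POLL_WINDOW_SECONDS = (MAX_ATTEMPTS * INTERVAL_MS) / 1000; // 300
const PREFLIGHT_WARN_SECONDS = 300;

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Informational only: the result never changes how polling behaves.
function pollingPreflight(durationSeconds) {
  return {
    durationSeconds,
    pollWindowSeconds: POLL_WINDOW_SECONDS,
    mayExceedWindow: durationSeconds >= PREFLIGHT_WARN_SECONDS,
  };
}

// `fetchStatus` is a hypothetical async callback that reads the prediction
// result URL and resolves to an object with a `status` field.
async function pollUntilDone(
  fetchStatus,
  { maxAttempts = MAX_ATTEMPTS, intervalMs = INTERVAL_MS, sleep = defaultSleep } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    if (result.status === "completed") return result;
    if (result.status === "failed") {
      throw new Error("Transcription job failed");
    }
    // Wait between attempts, but not after the final one.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  // No hidden fallback: a timeout is a hard failure.
  throw new Error(`Polling timed out after ${maxAttempts} attempts`);
}
```

Injecting `sleep` and `fetchStatus` keeps the loop testable without real network calls or real waiting.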
Default Workflow
- Normalize the transcription request.
- Log the 5-minute polling preflight from durationSeconds.
- Submit to hosted Whisper capability.
- Save raw request and response locally.
- Poll if the job is asynchronous.
- Save downloaded transcript artifacts locally.
- Hand off to subtitle-packager if SRT/VTT is needed.
Scripts
- scripts/transcribe_audio.mjs
- scripts/poll_transcription.mjs
Read These References
references/tool-contracts.md
Public Skill Execution Contract
- keep transcription requests, provider responses, manifests, and downloaded transcript artifacts under <work-folder>/.postplus/audio-transcription/
- keep only final user-facing transcript exports outside .postplus/
- start with a bounded first pass, usually one source file before larger batches
- if hosted transcription capability is unavailable, unauthorized, or returns a stable network error, stop immediately instead of switching to ad hoc shell glue
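As a rough illustration of the storage rules above, the sketch below derives artifact paths for one work folder. The helper name and the returned field names are hypothetical. It also assumes the persisted files sit directly under the skill folder, which the text does not state (a real run may use per-job subfolders), and it assumes POSIX-style separators.

```javascript
// Hypothetical helper: builds the paths where run artifacts would be kept.
function planArtifactPaths(workFolder) {
  // Drop trailing slashes so the joined path has no doubled separators.
  const base = workFolder.replace(/\/+$/, "");
  const root = `${base}/.postplus/audio-transcription`;
  return {
    root,
    request: `${root}/request.json`,
    response: `${root}/response.json`,
    manifest: `${root}/manifest.json`,
    // Downloaded provider outputs go under outputs/.
    outputsDir: `${root}/outputs`,
  };
}
```

For example, `planArtifactPaths("/tmp/job").manifest` evaluates to `/tmp/job/.postplus/audio-transcription/manifest.json`.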