# deepgram-js-audio-intelligence

Using Deepgram Audio Intelligence with the JavaScript / TypeScript SDK.

Analytics overlays applied to `/v1/listen`: summaries, topics, intents, sentiment, language detection, diarization, redaction, entities. Same client surface as STT; turn features on with request parameters.
## When to use this product
- You have audio and want analytics returned alongside the transcript.
- REST is the primary path; the WebSocket path supports only a subset of intelligence features.
Use a different skill when:

- You just want transcript output → `deepgram-js-speech-to-text`.
- You already have text and want analytics on that text → `deepgram-js-text-intelligence`.
- You need Flux turn-taking → `deepgram-js-conversational-stt`.
- You need a full interactive voice agent → `deepgram-js-voice-agent`.
## Feature availability: REST vs WSS

| Feature | REST | WSS |
|---|---|---|
| `diarize` | yes | yes |
| `redact` | yes | yes |
| `detect_entities` | yes | yes |
| `punctuate`, `smart_format` | yes | yes |
| `summarize` | yes | no (not in current WSS connect args) |
| `topics` | yes | no |
| `intents` | yes | no |
| `sentiment` | yes | no |
| `detect_language` | yes | no |
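When reusing one options object across both transports, it helps to strip the REST-only flags before opening a socket. The helper below is a hypothetical sketch (not part of `@deepgram/sdk`) that filters a flat options record down to the WSS-supported subset from the table above:

```typescript
// Hypothetical helper, not part of @deepgram/sdk: drop intelligence
// flags that the current WSS connect args do not expose, so a REST
// options object can be reused for a live connection.
const WSS_SUPPORTED = new Set([
  "diarize", "redact", "detect_entities", "punctuate", "smart_format",
]);

// Standard STT flags that make sense on both transports.
const PASSTHROUGH = new Set(["model", "language", "encoding", "sample_rate"]);

function toWssOptions(rest: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(rest).filter(
      ([key]) => WSS_SUPPORTED.has(key) || PASSTHROUGH.has(key),
    ),
  );
}

// topics is REST-only, so it is filtered out here.
console.log(toWssOptions({ model: "nova-3", diarize: true, topics: true }));
```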
## Authentication

```js
require("dotenv").config();
const { DeepgramClient } = require("@deepgram/sdk");

const deepgramClient = new DeepgramClient({
  apiKey: process.env.DEEPGRAM_API_KEY,
});
```
## Quick start — REST with analytics

From `examples/22-transcription-advanced-options.ts`:

```ts
const data = await deepgramClient.listen.v1.media.transcribeUrl({
  url: "https://dpgr.am/spacewalk.wav",
  model: "nova-3",
  language: "en",
  punctuate: true,
  paragraphs: true,
  utterances: true,
  smart_format: true,
  sentiment: true,
  topics: true,
  custom_topic: "custom_topic",
  custom_topic_mode: "extended",
  intents: true,
  custom_intent: "custom_intent",
  custom_intent_mode: "extended",
  detect_entities: true,
  detect_language: true,
  diarize: true,
  keyterm: ["keyword1", "keyword2"],
  redact: ["pci", "ssn"],
});
```
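The analytics land in the response alongside the transcript. Below is a defensively typed sketch of pulling a few of them out; the field paths are assumptions based on Deepgram's documented response shapes, so check the generated types in `reference.md` for the authoritative structure:

```typescript
// Sketch: extract a few analytics fields from a /v1/listen response.
// The field paths here are assumptions; consult reference.md for the
// generated, authoritative types.
type ListenResponse = {
  results?: {
    channels?: { alternatives?: { transcript?: string }[] }[];
    summary?: { short?: string };
    topics?: { segments?: { topics?: { topic?: string }[] }[] };
    sentiments?: { average?: { sentiment?: string } };
  };
};

function extractAnalytics(data: ListenResponse) {
  const transcript =
    data.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
  const summary = data.results?.summary?.short ?? "";
  const topics =
    data.results?.topics?.segments?.flatMap(
      (s) => s.topics?.map((t) => t.topic ?? "") ?? [],
    ) ?? [];
  const sentiment = data.results?.sentiments?.average?.sentiment ?? "unknown";
  return { transcript, summary, topics, sentiment };
}
```

Optional chaining with fallbacks keeps the extractor safe when a given overlay was not requested and its block is simply absent from the response.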
## Quick start — WSS subset

Start from `examples/07-transcription-live-websocket.ts` and keep the same socket flow, but pass only WSS-supported intelligence flags such as `diarize`, `redact`, and `detect_entities` in the connection args.

```ts
const deepgramConnection = await deepgramClient.listen.v1.createConnection({
  model: "nova-3",
  diarize: true,
  redact: "pci",
  detect_entities: true,
});
```
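With `diarize: true`, words in live results carry a speaker index. A small sketch of turning a diarized word list into per-speaker runs for display; the word shape is an assumption, so check the generated Socket types for the authoritative message format:

```typescript
// Sketch: collapse a diarized word list into consecutive per-speaker
// runs. The DiarizedWord shape is an assumption about the live
// results payload; verify against the generated Socket types.
type DiarizedWord = { word: string; punctuated_word?: string; speaker?: number };

function groupBySpeaker(
  words: DiarizedWord[],
): { speaker: number; text: string }[] {
  const runs: { speaker: number; text: string }[] = [];
  for (const w of words) {
    const speaker = w.speaker ?? -1; // -1 when diarization gave no label
    const token = w.punctuated_word ?? w.word;
    const last = runs[runs.length - 1];
    if (last && last.speaker === speaker) {
      last.text += " " + token; // extend the current speaker's run
    } else {
      runs.push({ speaker, text: token }); // speaker changed: new run
    }
  }
  return runs;
}
```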
## Key parameters / API surface

- Analytics flags: `summarize`, `topics`, `intents`, `sentiment`, `detect_language`, `detect_entities`, `diarize`, `redact`, `custom_topic`, `custom_topic_mode`, `custom_intent`, `custom_intent_mode`.
- Standard STT flags still apply: `model`, `language`, `encoding`, `sample_rate`, `punctuate`, `smart_format`, `utterances`, `paragraphs`, `multichannel`.
- Nova-3-specific biasing in repo examples uses `keyterm`, not `keywords`.
## API reference (layered)

- In-repo reference: `reference.md` → Listen V1 Media; WSS subset behavior lives in `src/CustomClient.ts` and `src/api/resources/listen/resources/v1/client/{Client,Socket}.ts`.
- Canonical OpenAPI (REST): https://developers.deepgram.com/openapi.yaml
- Canonical AsyncAPI (WSS): https://developers.deepgram.com/asyncapi.yaml
- Context7: library ID `/llmstxt/developers_deepgram_llms_txt`
- Product docs:
  - https://developers.deepgram.com/docs/stt-intelligence-feature-overview
  - https://developers.deepgram.com/docs/summarization
  - https://developers.deepgram.com/docs/topic-detection
  - https://developers.deepgram.com/docs/intent-recognition
  - https://developers.deepgram.com/docs/sentiment-analysis
  - https://developers.deepgram.com/docs/language-detection
  - https://developers.deepgram.com/docs/redaction
  - https://developers.deepgram.com/docs/diarization
## Gotchas

- `summarize` on `/v1/listen` is versioned, not a plain boolean. The generated REST surface and examples point at `"v2"`.
- Most intelligence flags are REST-only. Current WSS connect args do not expose `topics`, `intents`, `sentiment`, `summarize`, or `detect_language`.
- `redact` typing is looser in practice than the generated alias suggests. Examples pass arrays like `["pci", "ssn"]`, even though `ListenV1Redact` itself is just a string alias.
- Use `keyterm` for Nova-3 biasing. `examples/22-transcription-advanced-options.ts` explicitly notes that `keywords` is not supported for Nova-3.
- Model/feature support is product-side. `nova-3` is the safest choice when mixing many overlays.
- Diarization quality depends on audio quality and duration. Short or noisy clips churn speakers.
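Putting those gotchas together, a hedged sketch of a REST options object (values are illustrative, and the shape follows the quick start above): `summarize` gets a version string rather than `true`, `keyterm` rather than `keywords` biases Nova-3, and `redact` is passed as an array despite the string-alias typing.

```typescript
// Sketch of REST options that respect the gotchas above; values are
// illustrative, not canonical.
const options = {
  model: "nova-3",
  summarize: "v2",              // versioned, not a plain boolean
  keyterm: ["Deepgram", "PCI"], // keyterm, not keywords, for Nova-3
  redact: ["pci", "ssn"],       // arrays work in practice
  topics: true,
  sentiment: true,
} as const;
```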
## Example files in this repo

- `examples/22-transcription-advanced-options.ts`
- `examples/04-transcription-prerecorded-url.ts`
- `examples/05-transcription-prerecorded-file.ts`
- `examples/07-transcription-live-websocket.ts`
## Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

```sh
npx skills add deepgram/skills
```

This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).