Knowledge · May 11, 2026 · 4 min read

Deepgram Nova-3 — Production STT with 60ms Partial Latency

Deepgram Nova-3 streams partials in 60ms, finals <300ms. 36 languages, smart formatting, multilingual single-pass. Default for call centers.

Universal CLI install command
npx tokrepo install 17f11669-d83b-43d2-9581-3589403ec53c
Intro

Nova-3 is Deepgram's latest production STT model: 60ms partial-result latency, sub-300ms final results, 36 languages, automatic punctuation, smart formatting, profanity filter, and custom vocabulary. It is the de facto default for English call-center transcription and voice agents.

Best for: phone agents, meeting recorders, live captioning, and voice-controlled apps where latency dominates UX.
Works with: Deepgram Python/JS/Go/Rust SDKs, REST, WebSocket streaming, OpenAI-compatible audio endpoint.
Setup time: 5 minutes.


Streaming STT (Python)

import asyncio
import os

from deepgram import DeepgramClient, LiveTranscriptionEvents, LiveOptions

dg = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])

async def transcribe_mic():
    connection = dg.listen.asyncwebsocket.v("1")

    async def on_message(_, result, **kwargs):
        sentence = result.channel.alternatives[0].transcript
        if not sentence:
            return
        if result.is_final:
            print(f"FINAL: {sentence}")
        else:
            print(f"interim: {sentence}", end="\r")

    connection.on(LiveTranscriptionEvents.Transcript, on_message)

    options = LiveOptions(
        model="nova-3",
        language="multi",   # or "en", "es", "fr", etc.
        smart_format=True,
        interim_results=True,
        utterance_end_ms="1000",
        vad_events=True,
    )
    await connection.start(options)

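    # Feed raw audio bytes to the socket. mic_audio_iterator() is a placeholder
    # for your own async generator of audio chunks; it is not part of the SDK.
    async for audio_chunk in mic_audio_iterator():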
        await connection.send(audio_chunk)

    await connection.finish()

asyncio.run(transcribe_mic())
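
The streaming example calls `mic_audio_iterator()` without defining it. As a minimal sketch of what such a source could look like, the generator below slices any binary stream of raw PCM audio into fixed-duration chunks. The names `audio_chunks` and `chunk_size_bytes` are hypothetical, and the defaults (16 kHz, 16-bit, mono, 20 ms chunks) are assumptions rather than Deepgram requirements; a real microphone source would replace the stream with a capture library of your choice.

```python
import asyncio
import io
from typing import AsyncIterator, BinaryIO


def chunk_size_bytes(sample_rate: int, sample_width: int, channels: int, chunk_ms: int) -> int:
    """Number of bytes that hold chunk_ms milliseconds of raw PCM audio."""
    return sample_rate * sample_width * channels * chunk_ms // 1000


async def audio_chunks(
    stream: BinaryIO,
    sample_rate: int = 16000,   # assumed capture rate
    sample_width: int = 2,      # bytes per sample (16-bit PCM)
    channels: int = 1,
    chunk_ms: int = 20,
    realtime: bool = False,
) -> AsyncIterator[bytes]:
    """Yield fixed-duration chunks from a binary stream until it is exhausted."""
    size = chunk_size_bytes(sample_rate, sample_width, channels, chunk_ms)
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk
        # Pace the chunks like a live source when realtime=True;
        # otherwise just hand control back to the event loop.
        await asyncio.sleep(chunk_ms / 1000 if realtime else 0)
```

With a raw PCM file opened in binary mode, `async for chunk in audio_chunks(f, realtime=True)` can stand in for the placeholder loop while testing, as long as the `encoding` and `sample_rate` you send to Deepgram match the audio.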

Batch transcription (file)

from deepgram import PrerecordedOptions

# Reuses the `dg` client created in the streaming example above.

with open("call.mp3", "rb") as f:
    response = dg.listen.prerecorded.v("1").transcribe_file(
        {"buffer": f.read()},
        PrerecordedOptions(
            model="nova-3",
            smart_format=True,
            diarize=True,
            punctuate=True,
            paragraphs=True,
            summarize="v2",
            detect_topics=True,
        ),
    )

print(response.results.channels[0].alternatives[0].transcript)
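
With `diarize=True`, each entry in the JSON response's word list (under `results.channels[0].alternatives[0].words`) carries a `speaker` index alongside its timings. The sketch below groups consecutive words by speaker into turns. It assumes the words are available as plain dicts, for example from the parsed JSON body; `speaker_turns` is a hypothetical helper, not an SDK function.

```python
from typing import Dict, List


def speaker_turns(words: List[dict]) -> List[Dict[str, object]]:
    """Merge consecutive words from the same speaker into one turn."""
    turns: List[Dict[str, object]] = []
    for w in words:
        # Prefer the formatted token when smart_format/punctuate produced one.
        text = w.get("punctuated_word", w["word"])
        speaker = w.get("speaker", 0)
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + text
            turns[-1]["end"] = w["end"]
        else:
            turns.append(
                {"speaker": speaker, "start": w["start"], "end": w["end"], "text": text}
            )
    return turns


if __name__ == "__main__":
    sample = [
        {"word": "hello", "punctuated_word": "Hello.", "start": 0.0, "end": 0.4, "speaker": 0},
        {"word": "hi", "punctuated_word": "Hi,", "start": 0.6, "end": 0.8, "speaker": 1},
        {"word": "there", "punctuated_word": "there.", "start": 0.8, "end": 1.1, "speaker": 1},
    ]
    for turn in speaker_turns(sample):
        print(f"[speaker {turn['speaker']}] {turn['text']}")
```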

OpenAI-compatible endpoint

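import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepgram.com/v1",
    api_key=os.environ["DEEPGRAM_API_KEY"],
)
# Use a context manager so the audio file handle is closed after upload.
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="nova-3",
        file=audio_file,
    )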

Latency vs others (p50, streaming partial)

Provider                  Partial latency
Deepgram Nova-3           ~60ms
AssemblyAI Universal-2    ~150-300ms
Groq Whisper Turbo        ~200ms
OpenAI Whisper-1          ~600ms (batch only)

Pricing (May 2026)

  • Streaming Nova-3: $0.0058/min
  • Batch Nova-3: $0.0043/min
  • $200 free credit on signup
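
To make the per-minute rates concrete, here is a small cost estimate using only the prices listed above. The function names are hypothetical, and the figures should be rechecked against Deepgram's current pricing page before budgeting.

```python
from decimal import Decimal

# Per-minute rates as listed in this article (May 2026).
RATES_PER_MIN = {
    "streaming": Decimal("0.0058"),
    "batch": Decimal("0.0043"),
}


def estimate_cost(minutes: float, mode: str) -> Decimal:
    """Cost in USD for the given audio minutes, rounded to cents."""
    return (Decimal(str(minutes)) * RATES_PER_MIN[mode]).quantize(Decimal("0.01"))


def free_credit_minutes(mode: str, credit: Decimal = Decimal("200")) -> int:
    """Whole minutes of audio covered by the signup credit."""
    return int(credit / RATES_PER_MIN[mode])


if __name__ == "__main__":
    print(estimate_cost(10_000, "streaming"))  # 10,000 streaming minutes
    print(free_credit_minutes("batch"))
```

At these rates, 10,000 minutes of streaming audio costs $58.00, and the $200 credit covers roughly 34,000 streaming minutes or 46,000 batch minutes.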

FAQ

Q: Deepgram Nova-3 vs Whisper on Groq vs AssemblyAI? A: Deepgram has the lowest partial latency, at least 90ms ahead of the next-fastest provider in the table above, which makes it the pick for English voice agents and call centers. Whisper-on-Groq has broader low-resource language coverage. AssemblyAI has better diarization and built-in LeMUR for running LLMs over transcripts. Pick by primary task.

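Q: Custom vocabulary for product names? A: Yes. On Nova-3, pass keyterm=['TokRepo', 'GEOScore', 'KeepRule'] in LiveOptions (keyterm prompting); the older keywords parameter applies to Nova-2 and earlier models. Deepgram boosts these terms during decoding so brand names transcribe correctly. Keep the list to roughly 100 terms or fewer for best results.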
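
If you connect to the WebSocket endpoint directly instead of going through the SDK, custom vocabulary travels as repeated query parameters. The sketch below builds such a URL. It assumes the `keyterm` parameter that Deepgram documents for Nova-3 keyterm prompting (older models use `keywords`), and `listen_url` is a hypothetical helper; confirm parameter names against the current API reference.

```python
from typing import Iterable
from urllib.parse import urlencode


def listen_url(
    terms: Iterable[str],
    model: str = "nova-3",
    language: str = "en",
    vocab_param: str = "keyterm",  # assumption: Nova-3 keyterm prompting
) -> str:
    """Build a streaming URL with one vocabulary parameter per term."""
    params = [("model", model), ("language", language), ("smart_format", "true")]
    params += [(vocab_param, term) for term in terms]
    return "wss://api.deepgram.com/v1/listen?" + urlencode(params)


if __name__ == "__main__":
    print(listen_url(["TokRepo", "GEOScore", "KeepRule"]))
```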

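Q: Phone call accuracy on 8kHz audio? A: Excellent. Nova-3 was trained heavily on telephony audio. Set encoding='mulaw', sample_rate=8000 for Twilio Media Streams. When caller and callee arrive on separate stereo channels, transcribing each channel independently (multichannel=True) hits ~99% diarization accuracy, because speaker separation comes from the channel layout rather than from the model.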
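
For the Twilio case, audio arrives as JSON WebSocket messages whose `media.payload` field holds base64-encoded 8 kHz mu-law bytes. A minimal sketch of unpacking those messages before forwarding the bytes to Deepgram follows; `twilio_media_to_audio` and `TELEPHONY_OPTIONS` are hypothetical names, and the option values simply restate the settings from the answer above.

```python
import base64
import json
from typing import Optional

# Settings from the FAQ answer, to pass along when opening the Deepgram stream.
TELEPHONY_OPTIONS = {"model": "nova-3", "encoding": "mulaw", "sample_rate": 8000}


def twilio_media_to_audio(message: str) -> Optional[bytes]:
    """Return raw mu-law bytes from a Twilio 'media' message, else None."""
    event = json.loads(message)
    if event.get("event") != "media":
        # 'connected', 'start', 'mark', and 'stop' messages carry no audio.
        return None
    return base64.b64decode(event["media"]["payload"])


if __name__ == "__main__":
    demo = json.dumps(
        {"event": "media", "media": {"track": "inbound", "payload": base64.b64encode(b"\xff\x7f").decode()}}
    )
    print(twilio_media_to_audio(demo))
```

Each non-`None` result can be sent to the streaming connection unchanged, since the encoding and sample rate already match.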


Quick Use

  1. pip install deepgram-sdk and get DEEPGRAM_API_KEY at console.deepgram.com
  2. Streaming: dg.listen.asyncwebsocket.v('1') + LiveOptions(model='nova-3')
  3. Batch: dg.listen.prerecorded.v('1').transcribe_file({'buffer':...}, PrerecordedOptions(model='nova-3'))

Source & Thanks

Built by Deepgram. API docs at developers.deepgram.com.

deepgram/deepgram-python-sdk — official SDK

🙏
