Knowledge · May 11, 2026 · 4 min read

Deepgram Nova-3 — Production STT with 60ms Partial Latency

Deepgram Nova-3 streams partial results in about 60ms and final results in under 300ms. It covers 36 languages with smart formatting and single-pass multilingual transcription, and is the default choice for call centers.

Intro

Nova-3 is Deepgram's latest production speech-to-text (STT) model: about 60ms partial-result latency, sub-300ms final results, 36 languages, automatic punctuation, smart formatting, a profanity filter, and custom vocabulary. It is the de facto default for English call-center transcription and voice agents.

  • Best for: phone agents, meeting recorders, live captioning, and voice-controlled apps where latency dominates UX
  • Works with: Deepgram Python/JS/Go/Rust SDKs, REST, WebSocket streaming, OpenAI-compatible audio endpoint
  • Setup time: 5 minutes


Streaming STT (Python)

import asyncio
import os

from deepgram import DeepgramClient, LiveTranscriptionEvents, LiveOptions

# Read the API key from the environment rather than hard-coding it.
dg = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])

async def transcribe_mic():
    connection = dg.listen.asyncwebsocket.v("1")

    async def on_message(_, result, **kwargs):
        sentence = result.channel.alternatives[0].transcript
        if not sentence:
            return
        if result.is_final:
            print(f"FINAL: {sentence}")
        else:
            print(f"interim: {sentence}", end="\r")

    connection.on(LiveTranscriptionEvents.Transcript, on_message)

    options = LiveOptions(
        model="nova-3",
        language="multi",   # or "en", "es", "fr", etc.
        smart_format=True,
        interim_results=True,
        utterance_end_ms="1000",
        vad_events=True,
    )
    await connection.start(options)

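    # Feed audio bytes to the socket. mic_audio_iterator() is a placeholder:
    # supply your own async iterator that yields raw audio chunks.
    async for audio_chunk in mic_audio_iterator():
        await connection.send(audio_chunk)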

    await connection.finish()

asyncio.run(transcribe_mic())
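The streaming example leaves mic_audio_iterator() undefined. As a stand-in for testing without a microphone, the sketch below yields fixed-duration chunks from a buffer of raw PCM audio and can pace them in real time. The name pcm_chunk_iterator and its parameters are hypothetical, not part of the Deepgram SDK. It assumes 16-bit mono PCM; for raw audio like this, the connection options would normally also need a matching encoding and sample rate.

```python
import asyncio
from typing import AsyncIterator


async def pcm_chunk_iterator(
    pcm: bytes,
    sample_rate: int = 16000,
    bytes_per_sample: int = 2,
    chunk_ms: int = 20,
    realtime: bool = True,
) -> AsyncIterator[bytes]:
    """Yield fixed-duration chunks of mono PCM audio (hypothetical helper)."""
    # Bytes in one chunk: samples per second * bytes per sample * chunk duration.
    chunk_size = sample_rate * bytes_per_sample * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]
        if realtime:
            # Pace the stream so chunks arrive at roughly the speed of live audio.
            await asyncio.sleep(chunk_ms / 1000)


async def _demo() -> list:
    # 1600 bytes of silence at 16 kHz, 16-bit mono is 50 ms of audio.
    return [len(c) async for c in pcm_chunk_iterator(bytes(1600), realtime=False)]


chunk_lengths = asyncio.run(_demo())
print(chunk_lengths)
```

Swapping this in for mic_audio_iterator() lets the rest of the streaming flow be exercised with a recorded buffer before wiring up a real capture device.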

Batch transcription (file)

import os

from deepgram import DeepgramClient, PrerecordedOptions

dg = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])

with open("call.mp3", "rb") as f:
    response = dg.listen.prerecorded.v("1").transcribe_file(
        {"buffer": f.read()},
        PrerecordedOptions(
            model="nova-3",
            smart_format=True,
            diarize=True,
            punctuate=True,
            paragraphs=True,
            summarize="v2",
            detect_topics=True,
        ),
    )

print(response.results.channels[0].alternatives[0].transcript)
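With diarize=True, each word in the response carries a speaker index, so the flat transcript can be regrouped into speaker turns. The sketch below works on a plain dict shaped like the response JSON. It assumes the response has already been converted to a dict (for example via the SDK's to_dict(), which is an assumption here), and the helper name speaker_turns and the sample data are illustrative only.

```python
from itertools import groupby


def speaker_turns(response: dict) -> list:
    """Group consecutive words by speaker into (speaker, text) turns."""
    words = response["results"]["channels"][0]["alternatives"][0]["words"]
    turns = []
    # groupby only merges adjacent words, which is what a "turn" means here.
    for speaker, group in groupby(words, key=lambda w: w.get("speaker", 0)):
        text = " ".join(w.get("punctuated_word", w["word"]) for w in group)
        turns.append((speaker, text))
    return turns


# Hand-written sample in the shape of a diarized response (not real output).
sample = {
    "results": {"channels": [{"alternatives": [{"words": [
        {"word": "hello", "punctuated_word": "Hello.", "speaker": 0},
        {"word": "hi", "punctuated_word": "Hi,", "speaker": 1},
        {"word": "there", "punctuated_word": "there.", "speaker": 1},
        {"word": "thanks", "punctuated_word": "Thanks.", "speaker": 0},
    ]}]}]}
}

turns = speaker_turns(sample)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```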

OpenAI-compatible endpoint

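import os

from openai import OpenAI

# Assumes an OpenAI-compatible transcription route at this base URL;
# confirm against the current Deepgram API docs before relying on it.
client = OpenAI(
    base_url="https://api.deepgram.com/v1",
    api_key=os.environ["DEEPGRAM_API_KEY"],
)

# Open the file in a context manager so the handle is closed after upload.
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="nova-3",
        file=audio_file,
    )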

Latency vs others (p50, streaming partial)

Provider                  Partial latency
Deepgram Nova-3           ~60ms
AssemblyAI Universal-2    ~150-300ms
Groq Whisper Turbo        ~200ms
OpenAI Whisper-1          ~600ms (batch only)
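These figures are medians, and they vary with network path and audio chunk size, so it can be worth measuring them in your own environment. The sketch below computes a p50 from paired timestamps. It assumes one plausible definition of partial latency (time from sending an audio chunk to receiving the next interim result), which the source does not spell out. The function name and sample values are illustrative.

```python
from statistics import median


def p50_latency_ms(samples: list) -> float:
    """Median of (received_ms - sent_ms) over a list of timestamp pairs."""
    latencies = [received - sent for sent, received in samples]
    return median(latencies)


# Made-up timestamps in milliseconds: (chunk sent, interim result received).
samples = [(0, 58), (1000, 1061), (2000, 2075)]
p50 = p50_latency_ms(samples)
print(p50)
```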

Pricing (May 2026)

  • Streaming Nova-3: $0.0058/min
  • Batch Nova-3: $0.0043/min
  • $200 free credit on signup
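To turn the per-minute rates above into a budget, multiply by expected minutes. A small sketch using the listed prices, with Decimal to avoid floating-point drift on currency; the function names are illustrative.

```python
from decimal import Decimal

# Rates and credit as listed above (May 2026).
RATES_PER_MIN = {"streaming": Decimal("0.0058"), "batch": Decimal("0.0043")}
FREE_CREDIT = Decimal("200")


def transcription_cost(minutes: int, mode: str) -> Decimal:
    """Cost in USD for the given number of audio minutes, rounded to cents."""
    return (RATES_PER_MIN[mode] * minutes).quantize(Decimal("0.01"))


def free_minutes(mode: str) -> int:
    """Whole minutes of audio covered by the signup credit."""
    return int(FREE_CREDIT / RATES_PER_MIN[mode])


print(transcription_cost(10_000, "streaming"))
print(free_minutes("streaming"), free_minutes("batch"))
```

At these rates, 10,000 streaming minutes cost $58.00, and the signup credit covers roughly 34,000 streaming minutes or 46,000 batch minutes.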

FAQ

Q: Deepgram Nova-3 vs Whisper on Groq vs AssemblyAI? A: Deepgram has the lowest partial latency in the comparison above, at least 90ms ahead of the next provider, which makes it the pick for English voice agents and call centers. Whisper on Groq covers more low-resource languages. AssemblyAI has stronger diarization and built-in LeMUR for running LLMs over transcripts. Choose by primary task.

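Q: Custom vocabulary for product names? A: Yes. With Nova-3, pass the terms as keyterm=['TokRepo', 'GEOScore', 'KeepRule'] in LiveOptions (keyterm prompting); the older keywords parameter applies to Nova-2 and earlier models. Deepgram boosts these terms during decoding so brand names transcribe correctly. Limit the list to about 100 terms for best results.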
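On the wire, boosted terms are sent as repeated query parameters on the streaming URL. The sketch below builds such a URL with the standard library. It assumes the keyterm parameter name used for Nova-3 keyterm prompting (older models take keywords instead, hence the boost_param argument); the helper name listen_url is hypothetical.

```python
from urllib.parse import urlencode


def listen_url(terms: list, model: str = "nova-3", boost_param: str = "keyterm") -> str:
    """Build a streaming URL with one boost parameter per term."""
    params = [("model", model)] + [(boost_param, term) for term in terms]
    # urlencode percent-escapes each value and joins the pairs with "&".
    return "wss://api.deepgram.com/v1/listen?" + urlencode(params)


url = listen_url(["TokRepo", "GEOScore", "KeepRule"])
print(url)
```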

Q: Phone call accuracy on 8kHz audio? A: Excellent. Nova-3 was trained heavily on telephony audio. For Twilio Media Streams, set encoding='mulaw' and sample_rate=8000. With caller and callee on separate stereo channels, per-channel transcription attributes speech to the correct speaker about 99% of the time.
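Twilio Media Streams deliver audio as JSON WebSocket messages whose media.payload field holds base64-encoded mu-law bytes, so the payload has to be decoded before it is forwarded to the transcription socket. A minimal sketch of that decoding step; the function name is hypothetical and the sample message is hand-built.

```python
import base64
import json
from typing import Optional


def twilio_media_to_mulaw(message: str) -> Optional[bytes]:
    """Return raw mu-law bytes from a Twilio 'media' message, else None."""
    event = json.loads(message)
    # Other event types ("connected", "start", "stop") carry no audio.
    if event.get("event") != "media":
        return None
    return base64.b64decode(event["media"]["payload"])


# 160 bytes is 20 ms of 8 kHz mu-law audio (one byte per sample).
frame = b"\xff" * 160
sample_message = json.dumps({
    "event": "media",
    "media": {"track": "inbound", "payload": base64.b64encode(frame).decode("ascii")},
})

audio = twilio_media_to_mulaw(sample_message)
print(len(audio))
```

The returned bytes can be passed straight to the streaming connection's send call when the options use encoding='mulaw' and sample_rate=8000.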


Quick Use

  1. pip install deepgram-sdk and get DEEPGRAM_API_KEY at console.deepgram.com
  2. Streaming: dg.listen.asyncwebsocket.v('1') + LiveOptions(model='nova-3')
  3. Batch: dg.listen.prerecorded.v('1').transcribe_file({'buffer':...}, PrerecordedOptions(model='nova-3'))

Source & Thanks

Built by Deepgram. API docs at developers.deepgram.com.

deepgram/deepgram-python-sdk — official SDK
