STT Providers Compared: Deepgram / AssemblyAI / Whisper

Deepgram Nova-3 (60 ms partials) / AssemblyAI Universal-2 (99 languages) / Groq Whisper (166× realtime): the STT engines that voice agents and call centers actually run.

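5 assets

About this pack

STT Providers Compared: Deepgram / AssemblyAI / Whisper is one of TokRepo's curated packs: 5 AI assets bundled into a complete setup that installs with a single command. Each asset is verified (publicly visible, latest version downloadable), grouped by primary type, and checked to work together with the others.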

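Run `tokrepo install pack/stt-providers-compared` in your terminal, or hand the command to your AI agent; Claude Code / Cursor / Codex CLI / Gemini CLI / GitHub Copilot / Cline / Roo Code / Windsurf are all supported. The TokRepo Node CLI fetches each asset from the registry, places skill files in `.claude/skills/`, puts prompts in your prompt directory, writes MCP servers into the client config, and finishes by giving you an install manifest.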

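Without the pack, you would have to find, verify, and install these 5 assets one by one, which typically takes 30 to 60 minutes each. The pack collapses that work into a single command and stays current as the underlying assets are updated. If any asset is made private or taken down, the pack automatically skips it on the next install, so you never end up with a half-finished install.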

Install · one command

$ tokrepo install pack/stt-providers-compared

Hand it to your agent, or paste it into a terminal

What's in the pack

5 assets, bundled and ready

Skill #01

Deepgram Nova-3 — Production STT with 60ms Partial Latency

Deepgram Nova-3 streams partials in 60ms, finals <300ms. 36 languages, smart formatting, multilingual single-pass. Default for call centers.

by Deepgram·237 views

$ tokrepo install deepgram-nova-3-production-stt-with-60ms-partial-latency
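As a rough illustration of the streaming setup this card describes, the sketch below builds a Deepgram live-transcription WebSocket URL with interim (partial) results switched on. The endpoint and the `model`, `interim_results`, and `smart_format` query parameters are taken from Deepgram's public API and should be checked against the current documentation; the helper name `build_listen_url` is hypothetical and is not part of the packaged skill. The code only assembles a URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed endpoint for Deepgram live transcription; verify against current docs.
DEEPGRAM_LIVE_ENDPOINT = "wss://api.deepgram.com/v1/listen"


def build_listen_url(model: str = "nova-3",
                     interim_results: bool = True,
                     smart_format: bool = True) -> str:
    """Return a streaming URL (hypothetical helper; no network call is made)."""
    params = {
        "model": model,
        # Query-string booleans are written as lowercase "true"/"false".
        "interim_results": str(interim_results).lower(),
        "smart_format": str(smart_format).lower(),
    }
    return f"{DEEPGRAM_LIVE_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_listen_url())
```

A WebSocket client would then connect to this URL with an `Authorization: Token <your key>` header and stream raw audio frames; partial transcripts arrive as interim messages before each final one.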

Skill #02

AssemblyAI Universal-2 — Streaming STT for Voice Agents

AssemblyAI Universal-2 is production STT with <500ms streaming latency, 99 languages, diarization, smart formatting. OpenAI-compat audio.

by AssemblyAI·245 views

$ tokrepo install assemblyai-universal-2-streaming-stt-for-voice-agents

Script #03

AssemblyAI Diarization — Auto-Identify 2-10 Speakers

AssemblyAI speaker_labels separates 2-10 speakers without enrollment. Per-utterance speaker tags. For meetings, interviews, multi-party calls.

by AssemblyAI·229 views

$ tokrepo install assemblyai-diarization-auto-identify-2-10-speakers
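To make "per-utterance speaker tags" concrete, here is a minimal sketch that post-processes a list shaped like the `utterances` array AssemblyAI returns when `speaker_labels` is enabled. It assumes each utterance carries `speaker`, `text`, and millisecond `start` / `end` fields; the sample data and the helper names `format_turns` and `talk_time_ms` are made up for illustration and do not come from the packaged script.

```python
from collections import defaultdict

# Hypothetical sample shaped like a diarized transcript's `utterances` list.
# The text and timings are invented; start/end are assumed to be milliseconds.
SAMPLE_UTTERANCES = [
    {"speaker": "A", "start": 0, "end": 2400, "text": "Thanks for calling."},
    {"speaker": "B", "start": 2600, "end": 5100, "text": "Hi, I have a billing question."},
    {"speaker": "A", "start": 5300, "end": 7000, "text": "Sure, go ahead."},
]


def format_turns(utterances):
    """Render one 'Speaker X: text' line per utterance, in order."""
    return [f"Speaker {u['speaker']}: {u['text']}" for u in utterances]


def talk_time_ms(utterances):
    """Sum the spoken duration per speaker label."""
    totals = defaultdict(int)
    for u in utterances:
        totals[u["speaker"]] += u["end"] - u["start"]
    return dict(totals)


if __name__ == "__main__":
    for line in format_turns(SAMPLE_UTTERANCES):
        print(line)
    print(talk_time_ms(SAMPLE_UTTERANCES))
```

Because no enrollment is involved, the labels are anonymous ("A", "B", and so on); mapping them to real names is left to the caller.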

Skill #04

LeMUR — Run LLMs Over AssemblyAI Transcripts

LeMUR runs Claude / GPT prompts over AssemblyAI transcripts already in context. Summaries, Q&A, action items, custom JSON extraction.

by AssemblyAI·222 views

$ tokrepo install lemur-run-llms-over-assemblyai-transcripts

Skill #05

Groq Whisper — Sub-Second Speech-to-Text for Voice Agents

Whisper-large-v3 on Groq runs 166× realtime — 60-sec clip in <400ms. OpenAI-compat audio.transcriptions endpoint for voice agents.

by Groq·360 views

$ tokrepo install groq-whisper-sub-second-speech-to-text-for-voice-agents
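The "166× realtime" and "60-sec clip in <400ms" figures on this card are consistent with each other, which a quick calculation shows. The sketch below simply divides audio duration by the realtime factor; the function name `processing_time_s` is hypothetical, and the result covers compute time only, not upload or network latency.

```python
def processing_time_s(audio_seconds: float, realtime_factor: float) -> float:
    """Estimate transcription compute time from a realtime speed factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


if __name__ == "__main__":
    t = processing_time_s(60, 166)
    print(f"{t * 1000:.0f} ms")  # prints "361 ms", under the quoted 400 ms
```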

More packs

12 packs · 80+ curated assets
