TOKREPO · Theme Pack

STT Providers Compared: Deepgram / AssemblyAI / Whisper

Deepgram Nova-3 (60ms partials) / AssemblyAI Universal-2 (99 languages) / Groq Whisper (166× realtime): the STT engines that voice agents and call centers actually run.

5 assets

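STT Providers Compared: Deepgram / AssemblyAI / Whisper is one of TokRepo's curated theme packs: 5 AI assets bundled into a complete setup that installs with a single command. Each asset is verified (publicly available, latest version downloadable), grouped by primary type, and checked to work together with the others.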

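Run `tokrepo install pack/stt-providers-compared` in a terminal, or hand the command to your AI agent: Claude Code, Cursor, Codex CLI, Gemini CLI, GitHub Copilot, Cline, Roo Code, and Windsurf are all supported. The TokRepo Node CLI fetches each asset from the registry, places skill files in `.claude/skills/`, puts prompts in your prompt directory, writes MCP servers into the client config, and finishes by giving you an install manifest.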

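Without the pack, you would have to find, verify, and install these 5 assets one by one, typically 30 to 60 minutes each. The pack collapses that work into a single command and stays current as the underlying assets are updated. If any asset is made private or taken down, the pack automatically skips it on the next install, so you never end up with a half-finished install.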
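The skip behavior described above can be pictured as a filter that runs before anything is written to disk. The following is a minimal sketch, not TokRepo's actual implementation: the `Asset` record, its `status` field and values, the `plan_install` helper, and the example slugs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


# Hypothetical asset record; the real registry schema is not shown on this page.
@dataclass(frozen=True)
class Asset:
    slug: str
    status: str  # assumed values: "public", "private", "removed"


def plan_install(assets):
    """Split a pack into assets to install and assets to skip.

    Deciding up front means an unavailable asset is skipped as a whole
    rather than being discovered partway through the install.
    """
    to_install = [a for a in assets if a.status == "public"]
    skipped = [a for a in assets if a.status != "public"]
    return to_install, skipped


# Placeholder pack contents, for illustration only.
pack = [
    Asset("example-public-asset", "public"),
    Asset("example-private-asset", "private"),
    Asset("example-removed-asset", "removed"),
]
to_install, skipped = plan_install(pack)
print("install:", [a.slug for a in to_install])
print("skip:", [a.slug for a in skipped])
```

Under these assumptions, only the public asset reaches the install step, and the other two are reported as skipped.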

Install · one command
$ tokrepo install pack/stt-providers-compared
Hand it to an agent, or paste it into a terminal
What's in the pack

5 assets, packaged and ready

Skill#01
Deepgram Nova-3 — Production STT with 60ms Partial Latency

Deepgram Nova-3 streams partials in 60ms, finals <300ms. 36 languages, smart formatting, multilingual single-pass. Default for call centers.

by Deepgram·95 views
$ tokrepo install deepgram-nova-3-production-stt-with-60ms-partial-latency
Skill#02
AssemblyAI Universal-2 — Streaming STT for Voice Agents

AssemblyAI Universal-2 is production STT with <500ms streaming latency, 99 languages, diarization, smart formatting. OpenAI-compat audio.

by AssemblyAI·109 views
$ tokrepo install assemblyai-universal-2-streaming-stt-for-voice-agents
Script#03
AssemblyAI Diarization — Auto-Identify 2-10 Speakers

AssemblyAI speaker_labels separates 2-10 speakers without enrollment. Per-utterance speaker tags. For meetings, interviews, multi-party calls.

by AssemblyAI·103 views
$ tokrepo install assemblyai-diarization-auto-identify-2-10-speakers
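Per-utterance speaker tags are most useful once aggregated per speaker. The sketch below assumes a diarized transcript payload with an `utterances` list whose items carry `speaker`, `text`, `start`, and `end` (milliseconds) fields. The sample data is made up, `talk_time_ms` is a helper introduced here, and the field names should be checked against the AssemblyAI API reference before relying on them.

```python
from collections import defaultdict

# Made-up sample shaped like a diarized transcript (field names are assumptions).
transcript = {
    "utterances": [
        {"speaker": "A", "text": "Thanks for calling.", "start": 0, "end": 1200},
        {"speaker": "B", "text": "Hi, I have a billing question.", "start": 1300, "end": 3400},
        {"speaker": "A", "text": "Sure, go ahead.", "start": 3500, "end": 4300},
    ]
}


def talk_time_ms(utterances):
    """Sum speaking time per speaker from per-utterance speaker tags."""
    totals = defaultdict(int)
    for u in utterances:
        totals[u["speaker"]] += u["end"] - u["start"]
    return dict(totals)


totals = talk_time_ms(transcript["utterances"])
print(totals)  # {'A': 2000, 'B': 2100}
```

The same loop extends naturally to turn counts or per-speaker text, which is typically what meeting and call-center summaries need.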
Skill#04
LeMUR — Run LLMs Over AssemblyAI Transcripts

LeMUR runs Claude / GPT prompts over AssemblyAI transcripts already in context. Summaries, Q&A, action items, custom JSON extraction.

by AssemblyAI·104 views
$ tokrepo install lemur-run-llms-over-assemblyai-transcripts
Skill#05
Groq Whisper — Sub-Second Speech-to-Text for Voice Agents

Whisper-large-v3 on Groq runs 166× realtime — 60-sec clip in <400ms. OpenAI-compat audio.transcriptions endpoint for voice agents.

by Groq·127 views
$ tokrepo install groq-whisper-sub-second-speech-to-text-for-voice-agents
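The two figures in the Groq Whisper description are consistent with each other, which a quick calculation confirms. This is plain arithmetic on the quoted speed factor and covers compute time only; upload and network time are not included, and `processing_seconds` is a name introduced here for illustration.

```python
def processing_seconds(audio_seconds, realtime_factor):
    """Compute time for a clip when the engine runs `realtime_factor` times faster than real time."""
    return audio_seconds / realtime_factor


t = processing_seconds(60, 166)
print(f"{t * 1000:.0f} ms")  # 361 ms, under the quoted 400 ms
```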