Skills · May 11, 2026 · 4 min read

LeMUR — Run LLMs Over AssemblyAI Transcripts

LeMUR runs Claude / GPT prompts over AssemblyAI transcripts that are already in context: summaries, Q&A, action items, and custom JSON extraction.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and the raw content so agents can assess compatibility, risk, and next steps.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: any MCP/CLI agent
Type: Skill
Installation: Single
Trust: New
Input: Asset

Universal CLI command
npx tokrepo install bf97b4c4-021f-4912-afc9-fbba48bc48b2
Introducción

LeMUR (Leveraging Large Language Models to Understand Recognized Speech) is AssemblyAI's transcript-LLM bridge — once a transcript exists in your account, you can run Claude or GPT prompts against it without re-uploading or chunking. Endpoints: summary, Q&A, action items, custom prompt. Best for: meeting recap automation, call center QA, podcast show notes, any post-transcription analysis. Works with: assemblyai Python/Node SDK + LeMUR HTTP endpoints. Setup time: 5 minutes after a transcript exists.


Summary endpoint

import os

import assemblyai as aai

aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

transcript = aai.Transcriber().transcribe("call.mp3")

summary = transcript.lemur.summarize(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="This is a customer support call about a missed refund.",
    answer_format="3 bullet points",
)
print(summary.response)

Custom prompt (most flexible)

prompt = '''
You are a call center QA analyst. Score this support call on:
- Empathy (0-10)
- Resolution clarity (0-10)
- Compliance: was the agent's name stated, was a case number provided?

Return strict JSON with these fields plus a 'notes' string under 200 words.
'''

import json

result = transcript.lemur.task(
    prompt=prompt,
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.0,
    max_output_size=600,
)
print(json.loads(result.response))
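Models sometimes wrap JSON output in markdown fences even when asked for strict JSON, which makes a bare json.loads crash. A small defensive parser (parse_lemur_json is an illustrative helper, not part of the SDK) handles both cases:

```python
import json

def parse_lemur_json(raw: str) -> dict:
    """Strip an optional ```json fence before parsing the model's response."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line, then the trailing fence.
        text = text.split("\n", 1)[1]
        text = text.rsplit("```", 1)[0]
    return json.loads(text)

print(parse_lemur_json('```json\n{"empathy": 8}\n```'))  # → {'empathy': 8}
```

Pass result.response through this instead of json.loads directly when you need the pipeline to survive formatting drift.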

Q&A endpoint (multi-question)

qa = transcript.lemur.question(
    questions=[
        aai.LemurQuestion(question="What was the customer's main complaint?"),
        aai.LemurQuestion(question="Did the agent offer a refund? If yes, how much?"),
        aai.LemurQuestion(question="What's the recommended next action?", answer_format="one sentence"),
    ],
    final_model=aai.LemurModel.claude3_5_sonnet,
)
for r in qa.response:
    print(r.question, "→", r.answer)

Action items

action_items = transcript.lemur.action_items(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="Internal product planning meeting.",
)
print(action_items.response)

Available models

Model: best for
claude3_5_sonnet: default; best quality, balanced cost
claude3_haiku: cheap and fast for short summaries
claude3_opus: top quality, slowest, highest cost
default: AssemblyAI-tuned fast model
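Choosing final_model per call is really a config decision; a small lookup keyed to the table above keeps the tradeoff in one place (MODEL_FOR and pick_model are illustrative names, and the values mirror the aai.LemurModel attribute names):

```python
# Map workload type to a LeMUR model attribute name, per the table above.
MODEL_FOR = {
    "short_summary": "claude3_haiku",       # cheap, fast
    "balanced": "claude3_5_sonnet",         # default quality/cost
    "max_quality": "claude3_opus",          # slowest, highest cost
    "fast_tuned": "default",                # AssemblyAI-tuned fast model
}

def pick_model(workload: str) -> str:
    """Fall back to the balanced default for unknown workloads."""
    return MODEL_FOR.get(workload, "claude3_5_sonnet")

print(pick_model("short_summary"))  # → claude3_haiku
```

At call time, resolve the string with getattr(aai.LemurModel, pick_model(workload)).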

FAQ

Q: Why use LeMUR instead of feeding the transcript to Claude myself? A: Three reasons: (1) the transcript stays in AssemblyAI's secure data plane, with no re-upload of potentially-PII content; (2) you skip the chunking and context-management plumbing; (3) it's one billing invoice. For one-off scripts, calling Claude directly is fine; for production analyze-every-call flows, LeMUR is simpler.

Q: Can I run LeMUR on multiple transcripts at once? A: Yes — aai.Lemur().task(transcript_ids=[id1, id2, id3], prompt=...). Useful for weekly call-portfolio analysis. 100 transcripts max per call.
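Given the 100-transcript cap, a weekly portfolio job just needs to chunk its ID list before calling aai.Lemur().task on each batch. A minimal sketch of the chunking (batch_ids is an illustrative helper, not an SDK function, and the IDs are hypothetical):

```python
def batch_ids(transcript_ids, cap=100):
    """Split transcript IDs into chunks that respect LeMUR's per-call cap."""
    return [transcript_ids[i:i + cap] for i in range(0, len(transcript_ids), cap)]

ids = [f"txn_{n}" for n in range(250)]  # hypothetical transcript IDs
print([len(b) for b in batch_ids(ids)])  # → [100, 100, 50]
```

Each batch then becomes one aai.Lemur().task(transcript_ids=batch, prompt=...) call.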

Q: Does LeMUR support tool calls? A: Not yet — LeMUR is text-in/text-out. For tool use, fetch the transcript, then pass it to your own Claude/OpenAI call with tools enabled.
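For that do-it-yourself path, assembling the prompt around the transcript text is plain string work; one common shape wraps the transcript in delimiter tags so the model can tell instructions from content (build_analysis_prompt is an illustrative helper; the actual Claude/OpenAI call and its tool schema are up to you):

```python
def build_analysis_prompt(transcript_text: str, instructions: str) -> str:
    """Compose a prompt for a direct LLM call outside LeMUR."""
    return f"{instructions}\n\n<transcript>\n{transcript_text}\n</transcript>"

print(build_analysis_prompt("Hello, thanks for calling.", "Summarize this call."))
```

With the assemblyai SDK, the raw text comes from transcript.text.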


Quick Use

  1. Transcribe with aai.Transcriber().transcribe(...)
  2. Call transcript.lemur.summarize / question / task / action_items
  3. Pick final_model per cost/quality tradeoff



Source & Thanks

Built by AssemblyAI. LeMUR docs at assemblyai.com/docs/lemur.

AssemblyAI/assemblyai-python-sdk

🙏
