Skills · May 11, 2026 · 4 min read

LeMUR — Run LLMs Over AssemblyAI Transcripts

LeMUR runs Claude or GPT prompts over AssemblyAI transcripts that are already in your account: summaries, Q&A, action items, and custom JSON extraction.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: New
Entrypoint: Asset
Universal CLI install command:
npx tokrepo install bf97b4c4-021f-4912-afc9-fbba48bc48b2
Intro

LeMUR (Leveraging Large Language Models to Understand Recognized Speech) is AssemblyAI's transcript-to-LLM bridge: once a transcript exists in your account, you can run Claude or GPT prompts against it without re-uploading or chunking.

Endpoints: summary, Q&A, action items, custom prompt.
Best for: meeting recap automation, call center QA, podcast show notes, any post-transcription analysis.
Works with: the assemblyai Python/Node SDK and the LeMUR HTTP endpoints.
Setup time: 5 minutes once a transcript exists.


Summary endpoint

import os

import assemblyai as aai

# Authenticate once per process (key read from the environment)
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

# Transcribe the audio file (blocks until the transcript is ready)
transcript = aai.Transcriber().transcribe("call.mp3")

# Summarize the finished transcript with LeMUR
summary = transcript.lemur.summarize(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="This is a customer support call about a missed refund.",
    answer_format="3 bullet points",
)
print(summary.response)

Custom prompt (most flexible)

prompt = '''
You are a call center QA analyst. Score this support call on:
- Empathy (0-10)
- Resolution clarity (0-10)
- Compliance: was the agent's name stated, was a case number provided?

Return strict JSON with these fields plus a 'notes' string under 200 words.
'''

import json

# Free-form task: returns whatever format the prompt asks for
result = transcript.lemur.task(
    prompt=prompt,
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.0,      # deterministic scoring
    max_output_size=600,  # cap the response length
)
# The prompt demands strict JSON, so parse the raw response
print(json.loads(result.response))
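Even with a "strict JSON" instruction, models sometimes wrap their answer in markdown fences, which makes a bare `json.loads` throw. A small hypothetical helper (ours, not part of the SDK) makes the parse step more forgiving:

```python
import json


def parse_lemur_json(text: str):
    """Parse a JSON response, tolerating ```json ... ``` fences."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence (with optional language tag) and the closing fence
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)
```

Then `parse_lemur_json(result.response)` works whether or not the model fenced its output.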

Q&A endpoint (multi-question)

# Ask several questions in a single LeMUR call
qa = transcript.lemur.question(
    questions=[
        aai.LemurQuestion(question="What was the customer's main complaint?"),
        aai.LemurQuestion(question="Did the agent offer a refund? If yes, how much?"),
        aai.LemurQuestion(question="What's the recommended next action?", answer_format="one sentence"),
    ],
    final_model=aai.LemurModel.claude3_5_sonnet,
)
# Responses come back paired with their questions
for r in qa.response:
    print(r.question, "→", r.answer)

Action items

# Extract action items; context steers the model toward the right framing
action_items = transcript.lemur.action_items(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="Internal product planning meeting.",
)
print(action_items.response)

Available models

Model             Best for
claude3_5_sonnet  Default: best quality, balanced cost
claude3_haiku     Cheap and fast for short summaries
claude3_opus      Top quality, slowest, highest cost
default           AssemblyAI-tuned fast model

FAQ

Q: Why use LeMUR instead of feeding the transcript to Claude myself? A: Three reasons: (1) the transcript stays in AssemblyAI's secure data plane, so you never re-upload potentially-PII content; (2) you skip the chunking and context-management plumbing; (3) it all lands on one bill. For one-off scripts, calling Claude directly is fine; for production analyze-every-call flows, LeMUR is simpler.
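For a sense of what "chunking plumbing" means in practice, here is a hypothetical splitter of the kind you would otherwise maintain yourself when a transcript exceeds a model's context window (the function and its defaults are ours, not AssemblyAI's):

```python
def chunk_transcript(text: str, max_chars: int = 12_000) -> list[str]:
    """Split a transcript into chunks on sentence boundaries, each under max_chars."""
    chunks, current = [], ""
    for sentence in text.replace("\n", " ").split(". "):
        if not sentence:
            continue
        # Flush the current chunk before it would overflow
        if len(current) + len(sentence) + 2 > max_chars and current:
            chunks.append(current.strip())
            current = ""
        current += sentence + ". "
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Each chunk then needs its own LLM call plus a final merge pass, which is exactly the loop LeMUR runs for you server-side.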

Q: Can I run LeMUR on multiple transcripts at once? A: Yes — aai.Lemur().task(transcript_ids=[id1, id2, id3], prompt=...). Useful for weekly call-portfolio analysis. 100 transcripts max per call.
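Since each call caps out at 100 transcript IDs, larger portfolios need batching. A minimal sketch (the batching helper is ours, not the SDK's):

```python
def batch_ids(ids: list, limit: int = 100) -> list[list]:
    """Split transcript IDs into batches of at most `limit` (LeMUR's per-call cap)."""
    return [ids[i:i + limit] for i in range(0, len(ids), limit)]


# Usage sketch (assumes assemblyai is imported as aai and the API key is configured):
# for batch in batch_ids(all_transcript_ids):
#     result = aai.Lemur().task(
#         prompt=prompt,
#         transcript_ids=batch,
#         final_model=aai.LemurModel.claude3_5_sonnet,
#     )
```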

Q: Does LeMUR support tool calls? A: Not yet — LeMUR is text-in/text-out. For tool use, fetch the transcript, then pass it to your own Claude/OpenAI call with tools enabled.


Quick Use

  1. Transcribe with aai.Transcriber().transcribe(...)
  2. Call transcript.lemur.summarize / question / task / action_items
  3. Pick final_model per cost/quality tradeoff



Source & Thanks

Built by AssemblyAI. LeMUR docs at assemblyai.com/docs/lemur.

AssemblyAI/assemblyai-python-sdk

