Skills · May 11, 2026 · 4 min read

LeMUR — Run LLMs Over AssemblyAI Transcripts

LeMUR runs Claude or GPT prompts over AssemblyAI transcripts that are already in your account: summaries, Q&A, action items, and custom JSON extraction.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: New
Entrypoint: Asset
Universal CLI install command:
npx tokrepo install bf97b4c4-021f-4912-afc9-fbba48bc48b2
Intro

LeMUR (Leveraging Large Language Models to Understand Recognized Speech) is AssemblyAI's transcript-to-LLM bridge: once a transcript exists in your account, you can run Claude or GPT prompts against it without re-uploading or chunking.

Endpoints: summary, Q&A, action items, custom prompt.
Best for: meeting recap automation, call center QA, podcast show notes, any post-transcription analysis.
Works with: the assemblyai Python/Node SDK and the LeMUR HTTP endpoints.
Setup time: 5 minutes once a transcript exists.


Summary endpoint

import os

import assemblyai as aai

# Authenticate once per process (key read from the environment)
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

# Transcribe the audio file (blocks until the transcript is ready)
transcript = aai.Transcriber().transcribe("call.mp3")

# Summarize the finished transcript with LeMUR
summary = transcript.lemur.summarize(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="This is a customer support call about a missed refund.",
    answer_format="3 bullet points",
)
print(summary.response)

Custom prompt (most flexible)

prompt = '''
You are a call center QA analyst. Score this support call on:
- Empathy (0-10)
- Resolution clarity (0-10)
- Compliance: was the agent's name stated, was a case number provided?

Return strict JSON with these fields plus a 'notes' string under 200 words.
'''

import json

# Free-form task: returns whatever format the prompt asks for
result = transcript.lemur.task(
    prompt=prompt,
    final_model=aai.LemurModel.claude3_5_sonnet,
    temperature=0.0,      # deterministic scoring
    max_output_size=600,  # cap the response length
)
# The prompt demands strict JSON, so parse the raw response
print(json.loads(result.response))
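Even with a "strict JSON" instruction, models sometimes wrap their answer in markdown fences, which makes a bare `json.loads` throw. A small hypothetical helper (ours, not part of the SDK) makes the parse step more forgiving:

```python
import json


def parse_lemur_json(text: str):
    """Parse a JSON response, tolerating ```json ... ``` fences."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence (with optional language tag) and the closing fence
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)
```

Then `parse_lemur_json(result.response)` works whether or not the model fenced its output.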

Q&A endpoint (multi-question)

# Ask several questions in a single LeMUR call
qa = transcript.lemur.question(
    questions=[
        aai.LemurQuestion(question="What was the customer's main complaint?"),
        aai.LemurQuestion(question="Did the agent offer a refund? If yes, how much?"),
        aai.LemurQuestion(question="What's the recommended next action?", answer_format="one sentence"),
    ],
    final_model=aai.LemurModel.claude3_5_sonnet,
)
# Responses come back paired with their questions
for r in qa.response:
    print(r.question, "→", r.answer)

Action items

# Extract action items; context steers the model toward the right framing
action_items = transcript.lemur.action_items(
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="Internal product planning meeting.",
)
print(action_items.response)

Available models

Model             Best for
claude3_5_sonnet  Default: best quality, balanced cost
claude3_haiku     Cheap and fast for short summaries
claude3_opus      Top quality, slowest, highest cost
default           AssemblyAI-tuned fast model

FAQ

Q: Why use LeMUR instead of feeding the transcript to Claude myself? A: Three reasons: (1) the transcript stays in AssemblyAI's secure data plane, so you never re-upload potentially-PII content; (2) you skip the chunking and context-management plumbing; (3) it all lands on one bill. For one-off scripts, calling Claude directly is fine; for production analyze-every-call flows, LeMUR is simpler.
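For a sense of what "chunking plumbing" means in practice, here is a hypothetical splitter of the kind you would otherwise maintain yourself when a transcript exceeds a model's context window (the function and its defaults are ours, not AssemblyAI's):

```python
def chunk_transcript(text: str, max_chars: int = 12_000) -> list[str]:
    """Split a transcript into chunks on sentence boundaries, each under max_chars."""
    chunks, current = [], ""
    for sentence in text.replace("\n", " ").split(". "):
        if not sentence:
            continue
        # Flush the current chunk before it would overflow
        if len(current) + len(sentence) + 2 > max_chars and current:
            chunks.append(current.strip())
            current = ""
        current += sentence + ". "
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Each chunk then needs its own LLM call plus a final merge pass, which is exactly the loop LeMUR runs for you server-side.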

Q: Can I run LeMUR on multiple transcripts at once? A: Yes — aai.Lemur().task(transcript_ids=[id1, id2, id3], prompt=...). Useful for weekly call-portfolio analysis. 100 transcripts max per call.
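Since each call caps out at 100 transcript IDs, larger portfolios need batching. A minimal sketch (the batching helper is ours, not the SDK's):

```python
def batch_ids(ids: list, limit: int = 100) -> list[list]:
    """Split transcript IDs into batches of at most `limit` (LeMUR's per-call cap)."""
    return [ids[i:i + limit] for i in range(0, len(ids), limit)]


# Usage sketch (assumes assemblyai is imported as aai and the API key is configured):
# for batch in batch_ids(all_transcript_ids):
#     result = aai.Lemur().task(
#         prompt=prompt,
#         transcript_ids=batch,
#         final_model=aai.LemurModel.claude3_5_sonnet,
#     )
```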

Q: Does LeMUR support tool calls? A: Not yet — LeMUR is text-in/text-out. For tool use, fetch the transcript, then pass it to your own Claude/OpenAI call with tools enabled.


Quick Use

  1. Transcribe with aai.Transcriber().transcribe(...)
  2. Call transcript.lemur.summarize / question / task / action_items
  3. Pick final_model per cost/quality tradeoff



Source & Thanks

Built by AssemblyAI. LeMUR docs at assemblyai.com/docs/lemur.

AssemblyAI/assemblyai-python-sdk

