Letta — AI Agent Long-Term Memory Framework
Build AI agents with persistent memory using MemGPT architecture. Letta manages context windows automatically with tiered memory for stateful LLM applications.
Agent-ready installation
This asset can be installed after choosing the runtime, reviewing the plan, and running the matching command.
npx -y tokrepo@latest install 4a18797f-d627-4282-952d-df53680a19f0 --target codex
Run this after confirming the plan in a dry run.
What it is
Letta, formerly known as MemGPT, is an open-source framework for building AI agents that maintain persistent long-term memory across conversations. It addresses the fundamental context window limitation of LLMs by implementing a tiered memory architecture inspired by operating system virtual memory.
The framework is aimed at developers building stateful LLM applications -- chatbots that remember users, research agents that accumulate knowledge, and assistants that improve over time without losing context.
How it saves time or tokens
Without Letta, developers must manually manage context windows: truncating old messages, summarizing history, or re-injecting relevant facts. Letta automates this entirely. The agent decides what to store in core memory (always in context), recall memory (searchable conversation history), or archival memory (unlimited vector-indexed storage). This reduces wasted tokens on context management boilerplate and avoids the common failure mode where agents forget critical user preferences mid-conversation.
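The three tiers described above can be pictured with a small, self-contained sketch. This is a toy model, not Letta's implementation: the `TieredMemory` class and all of its method names are hypothetical, and substring matching stands in for the vector search a real archival store would use.

```python
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    """Toy model of the three tiers; hypothetical, not Letta's classes."""
    core: dict[str, str] = field(default_factory=dict)   # always placed in the prompt
    recall: list[str] = field(default_factory=list)      # searchable conversation history
    archival: list[str] = field(default_factory=list)    # long-term facts, kept out of the prompt

    def remember_core(self, label: str, value: str) -> None:
        self.core[label] = value

    def log_message(self, message: str) -> None:
        self.recall.append(message)

    def archive(self, fact: str) -> None:
        self.archival.append(fact)

    def search_archival(self, query: str) -> list[str]:
        # Assumption: substring match keeps the sketch dependency-free;
        # a real system would rank by embedding similarity.
        q = query.lower()
        return [fact for fact in self.archival if q in fact.lower()]

    def build_prompt(self, user_message: str) -> str:
        # Only core memory is injected on every turn.
        core_text = "\n".join(f"[{label}] {value}" for label, value in self.core.items())
        return f"{core_text}\n\nUser: {user_message}"


memory = TieredMemory()
memory.remember_core("human", "Favorite color is blue.")
memory.log_message("user: Remember: my favorite color is blue.")
memory.archive("User read the paper 'Attention Is All You Need'.")

prompt = memory.build_prompt("What papers have I mentioned?")
hits = memory.search_archival("attention")
```

In this sketch the prompt carries only the core block, while the archived fact costs no tokens until a search retrieves it. That is the trade-off the tiers are meant to capture.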
The workflow estimates around 4,100 tokens for a basic setup, but the real savings come from eliminating repeated context injection across sessions.
How to use
- Install Letta and start the server:
pip install letta
letta server
- Create a client and configure an agent with memory:
from letta import create_client

client = create_client()

# Create an agent whose memory block holds the system instructions
agent = client.create_agent(
    name='my_agent',
    memory=client.create_block(
        'You are a helpful assistant.',
        label='system'
    ),
)
- Send messages and let the agent manage its own memory:
response = agent.send_message(
    'Remember: my favorite color is blue.'
)
print(response.messages)
Example
from letta import create_client

client = create_client()

# Create agent with tiered memory
agent = client.create_agent(
    name='research_assistant',
    memory=client.create_block(
        'You are a research assistant that remembers '
        'all papers the user has discussed.',
        label='system'
    ),
)

# Agent stores facts in archival memory automatically
agent.send_message('I just read the Attention Is All You Need paper.')
agent.send_message('What papers have I mentioned?')
Related on TokRepo
- AI memory frameworks compared -- Browse all memory solutions including Mem0, Zep, and MemGPT variants
- Letta deep-dive on TokRepo -- Dedicated page for Letta architecture and usage patterns
- Agent tools directory -- Other frameworks for building autonomous AI agents
Common pitfalls
- Running letta server without sufficient disk space for the SQLite-backed archival memory can cause silent failures on large datasets
- The tiered memory system works best with models that follow function-calling conventions; smaller open models may not reliably trigger memory operations
- Confusing Letta (the rebranded project) with the original MemGPT academic paper -- the API surface has changed substantially since the rename
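The function-calling pitfall above is easier to see with a sketch of how a model-emitted tool call might be turned into a memory write. Everything here is illustrative: `apply_memory_call`, the JSON shape, and the tool names are assumptions made for this example, not Letta's actual dispatch code.

```python
import json


def apply_memory_call(raw_call: str, core: dict[str, str], archival: list[str]) -> bool:
    """Apply one model-emitted tool call; return False if it cannot be executed."""
    try:
        call = json.loads(raw_call)
        name = call["name"]
        args = call["arguments"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return False  # malformed output: nothing is saved
    if not isinstance(args, dict):
        return False

    # Hypothetical tool names, chosen for illustration only.
    if name == "core_memory_append" and {"label", "content"} <= args.keys():
        existing = core.get(args["label"], "")
        core[args["label"]] = f"{existing} {args['content']}".strip()
        return True
    if name == "archival_memory_insert" and "content" in args:
        archival.append(args["content"])
        return True
    return False


core: dict[str, str] = {}
archival: list[str] = []

# A well-formed call updates memory.
ok = apply_memory_call(
    '{"name": "core_memory_append", '
    '"arguments": {"label": "human", "content": "Favorite color is blue."}}',
    core,
    archival,
)

# A plain-text reply, typical of a model that ignores the tool schema, saves nothing.
missed = apply_memory_call("I will remember that your favorite color is blue.", core, archival)
```

The second call shows the failure mode: the model sounds as if it remembered the fact, but no memory operation ran, so the preference is lost once the message leaves the context window.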
Frequently asked questions
Letta is the rebranded and production-ready version of MemGPT. The original MemGPT was a research project demonstrating virtual memory for LLMs. Letta took that concept and built a full agent framework with a server, REST API, and multi-user support. The core idea of tiered memory remains the same, but the API and architecture have evolved significantly.
Letta implements three memory tiers: core memory (always present in the LLM context window), recall memory (searchable conversation history stored in a database), and archival memory (unlimited vector-indexed storage for long-term facts). The agent autonomously decides what to promote or demote between tiers.
Letta supports multiple LLM backends, including OpenAI, Anthropic, and local models served through endpoints compatible with the OpenAI API format. However, the memory management functions work most reliably with models that have strong function-calling capabilities.
When the context window fills up, Letta automatically summarizes older messages and moves them to recall memory. Critical facts flagged by the agent are stored in archival memory for later retrieval. This process happens transparently without developer intervention.
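The overflow behavior described above can be sketched as a compaction step. This is a minimal illustration under stated assumptions, not Letta's algorithm: `count_tokens` is a crude word counter, `summarize` is a placeholder for an LLM-generated summary, and `summary_reserve` is a hypothetical parameter that leaves room for the summary line.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages: list[str]) -> str:
    # Placeholder: a real system would ask the LLM to write this summary.
    return f"[summary of {len(messages)} earlier messages]"


def compact_context(
    context: list[str], recall: list[str], budget: int, summary_reserve: int = 8
) -> list[str]:
    """Move the oldest messages to recall storage until the context fits the budget."""
    if sum(count_tokens(m) for m in context) <= budget:
        return list(context)

    remaining = list(context)
    evicted: list[str] = []
    # Evict oldest-first, keeping headroom for the summary that replaces them.
    while len(remaining) > 1 and sum(count_tokens(m) for m in remaining) > budget - summary_reserve:
        evicted.append(remaining.pop(0))
    if not evicted:
        return remaining

    recall.extend(evicted)  # full text stays searchable outside the prompt
    return [summarize(evicted)] + remaining


context = [
    "user: my favorite color is blue",
    "assistant: noted, blue it is",
    "user: I read the attention paper",
    "assistant: great, what did you think",
]
recall_store: list[str] = []
compacted = compact_context(context, recall_store, budget=20)
```

With a budget of 20 word-tokens, the two oldest messages move to `recall_store` and a single summary line takes their place in the context, so the latest exchange stays intact.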
Letta provides a server mode with REST API endpoints, making it suitable for multi-user deployments. Each agent maintains its own isolated memory state. The server supports PostgreSQL as a backend for production workloads, replacing the default SQLite for better concurrency.
Cited sources (3)
- Letta GitHub: Letta implements tiered memory architecture for AI agents
- MemGPT Paper: MemGPT virtual memory concept for LLMs
- Letta Documentation: Function calling enables agents to manage their own memory
Source and acknowledgments
Created by Letta Team. Licensed under Apache 2.0.
letta-ai/letta — 12k+ stars