# Letta — AI Agent Long-Term Memory Framework

> Build AI agents with persistent memory using MemGPT architecture. Letta manages context windows automatically with tiered memory for stateful LLM applications.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use

```bash
pip install letta
letta server
```

```python
from letta import create_client

client = create_client()
agent = client.create_agent(
    name="my_agent",
    memory=client.create_block("You are a helpful assistant.", label="system"),
)
response = agent.send_message("Remember: my favorite color is blue.")
print(response.messages)
```

## What is Letta?

Letta (formerly MemGPT) is a framework for building AI agents with persistent, long-term memory. It solves the context window limitation by implementing a tiered memory architecture — core memory (always in context), recall memory (conversation history), and archival memory (unlimited storage). The agent manages its own memory, deciding what to remember and forget.

**Answer-Ready**: Letta is an AI agent framework with persistent memory management. Uses tiered memory (core/recall/archival) to overcome context window limits. Formerly MemGPT. Agents self-manage memory across conversations. 12k+ GitHub stars.

**Best for**: Developers building stateful AI agents that need to remember across sessions. **Works with**: OpenAI, Anthropic, local models via Ollama. **Setup time**: Under 3 minutes.

## Core Features

### 1. Tiered Memory Architecture

| Memory Tier | Purpose | Size |
|-------------|---------|------|
| Core | Always in context, editable by agent | ~2K tokens |
| Recall | Searchable conversation history | Unlimited |
| Archival | Long-term knowledge storage | Unlimited |

### 2. Agent Self-Management

```python
# Agent decides what to save
agent.send_message("My meeting is at 3pm tomorrow with Sarah about the Q2 budget.")
# Agent automatically stores this in archival memory
```

### 3. Tool Use

```python
from letta import tool

@tool
def search_web(query: str) -> str:
    "Search the web for information."
    # Your search implementation
    return results

agent = client.create_agent(tools=[search_web])
```

### 4. REST API Server

```bash
letta server --port 8283
# Full REST API for agent management
# POST /v1/agents - Create agent
# POST /v1/agents/{id}/messages - Send message
```

## Use Cases

| Use Case | How |
|----------|-----|
| Personal Assistant | Remember user preferences across sessions |
| Customer Support | Track customer history and context |
| Research Agent | Accumulate findings over long investigations |
| Coding Companion | Remember codebase context and decisions |

## FAQ

**Q: How does it differ from RAG?**
A: RAG retrieves from static documents. Letta agents actively manage their own memory — writing, updating, and deleting memories as conversations evolve.

**Q: Can I use local models?**
A: Yes, supports Ollama, vLLM, and any OpenAI-compatible endpoint.

**Q: Is it production-ready?**
A: Yes, Letta Cloud offers managed hosting. Self-hosted server supports Docker deployment.

## Source & Thanks

> Created by [Letta Team](https://github.com/letta-ai). Licensed under Apache 2.0.
>
> [letta-ai/letta](https://github.com/letta-ai/letta) — 12k+ stars

<!-- ZH -->


## Quick Start

```bash
pip install letta
letta server
```

Three lines of code to create an AI agent with persistent memory.

## What is Letta?

Letta (formerly MemGPT) is a framework for building AI agents with long-term memory. It breaks through context window limits using a tiered memory architecture (core / recall / archival), with the agent autonomously managing its memory.

**In one sentence**: AI agent long-term memory framework — tiered memory architecture breaks through context limits, the agent decides what to remember and forget — 12k+ stars.

**For**: Developers building AI agents that need cross-session memory.

## Core Features

### 1. Tiered Memory
Core memory (always in context), recall memory (conversation history), archival memory (unlimited storage).

### 2. Agent-Managed Memory
The agent autonomously decides which information to store in long-term memory.

### 3. Tool Calling
Supports custom tools defined via Python decorators.

### 4. REST API
Built-in server with a complete REST API for managing agents.

## FAQ

**Q: How is it different from RAG?**
A: RAG retrieves from static documents; Letta agents actively manage their own memory.

**Q: Does it support local models?**
A: Yes — Ollama, vLLM, and others.

## Source & Thanks

> [letta-ai/letta](https://github.com/letta-ai/letta) — 12k+ stars, Apache 2.0

---
Source: https://tokrepo.com/en/workflows/letta-ai-agent-long-term-memory-framework-4a18797f
Author: Agent Toolkit