TOKREPO · ARSENAL

Memory Layer for Agents

Mem0, Zep, Cognee, and the patterns to make agents remember across sessions — without baking everything into the prompt.

7 assets

What's in this pack

This pack collects the seven memory-layer assets that show up in every agent that needs to remember things between sessions without re-pasting them into the prompt every time. Three are the canonical libraries. Four are pattern templates that wrap them — patterns Anthropic and OpenAI both surface in their long-running-agent guides.

| # | Asset | Type | What it gives you |
|---|-------|------|-------------------|
| 1 | Mem0 | library | Auto-extracts and updates user facts; drop-in API |
| 2 | Zep | service | Temporal knowledge graph; long-term memory |
| 3 | Cognee | library | Graph + vector hybrid memory pipeline |
| 4 | Episodic-summary pattern | template | Compresses long sessions into summary memories |
| 5 | Working-memory scratchpad | template | Inter-step state without prompt bloat |
| 6 | User-fact extractor | template | Pulls stable facts from chat into a memory store |
| 7 | Cross-session recall | template | "What did we decide last week?" pattern |

Why this matters

The default Claude / GPT-4 / Gemini setup has zero memory. Every conversation starts fresh. Most apps fake memory by stuffing previous turns into the system prompt — that works for a while, then your context window blows up, your bill triples, and the model loses the plot. Memory layers solve this by storing facts outside the prompt and only injecting the relevant ones per turn.
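The loop that replaces prompt-stuffing is small. A minimal dependency-free sketch — the `MemoryStore` class, its keyword scoring, and the example facts are all placeholders for whatever library you pick, which would score by embedding similarity instead:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Toy stand-in for a real memory layer: facts live outside the prompt."""
    facts: dict = field(default_factory=dict)

    def write(self, user_id: str, fact: str) -> None:
        self.facts.setdefault(user_id, []).append(fact)

    def recall(self, user_id: str, query: str, top_k: int = 5) -> list:
        # Real layers score by embedding similarity; keyword overlap
        # keeps this sketch dependency-free.
        words = set(query.lower().split())
        scored = sorted(
            self.facts.get(user_id, []),
            key=lambda f: len(words & set(f.lower().split())),
            reverse=True,
        )
        return scored[:top_k]

store = MemoryStore()
store.write("alice", "alice prefers Python for backend work")
store.write("alice", "the project deadline is June 1")

# Per turn: inject only the facts relevant to this query,
# not the whole chat history.
relevant = store.recall("alice", "which language does alice prefer?", top_k=1)
prompt = "Known user facts:\n" + "\n".join(f"- {f}" for f in relevant)
```

The key property is that write volume and prompt size are decoupled: the store can hold thousands of facts while each turn pays for only `top_k` of them.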

The three libraries each pick a different bet:

  • Mem0 is the easiest. One mem0.add(messages, user_id=...) call and the library extracts what's worth remembering. Best for chatbot-style apps with a clear user identity.
  • Zep is the production option. Runs as a service, gives you a temporal knowledge graph (memories with timestamps and relationships), and supports multi-tenant. Best when you need audit trails or memory shared across an org.
  • Cognee is the graph-native bet. It models memory as a knowledge graph from day one — useful if your domain is research, code, or anything with strong entity relationships.

The four patterns aren't libraries — they're prompt templates and small adapters that work with any of the three. They're the difference between "I installed Mem0" and "memory actually works in my app."
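As one concrete example, the working-memory scratchpad (pattern #5) is just a few lines of adapter code. A sketch under assumed names — `Scratchpad` and `digest` are illustrative, not from any of the three libraries:

```python
class Scratchpad:
    """Working-memory pattern: full intermediate state lives here;
    only a compact digest is injected into each step's prompt."""

    def __init__(self) -> None:
        self.entries: dict = {}

    def write(self, key: str, value: str) -> None:
        self.entries[key] = value

    def digest(self, max_chars: int = 200) -> str:
        # One truncated line per step, hard-capped: bounded prompt
        # cost no matter how much state the agent accumulates.
        lines = [f"{k}: {v[:80]}" for k, v in self.entries.items()]
        return "\n".join(lines)[:max_chars]

pad = Scratchpad()
pad.write("step1", "fetched 312 rows from orders table")
pad.write("step2", "filtered to 47 rows with status=failed")

prompt_context = pad.digest()
```

The agent keeps full fidelity in the scratchpad and the prompt only ever sees the capped digest — that's the "inter-step state without prompt bloat" trade.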

Install in one command

# Install the entire pack
tokrepo install pack/agent-memory-layer

# Or install one library at a time
tokrepo install mem0
tokrepo install zep
tokrepo install cognee

The TokRepo CLI normalizes file placement: Claude Code subagents into .claude/agents/, Cursor rules into .cursor/rules/, AGENTS.md entries for Codex CLI. The library installs are pip/npm — TokRepo just wires them into your AI tool's config so the agent knows the memory layer exists.

Common pitfalls

  • Don't store everything. Memory cost scales with what you write, not what you retrieve. Use a fact extractor (pattern #6) to filter — only durable facts about the user/project belong in long-term memory.
  • Don't skip the recency bias. Pure vector recall pulls semantically similar but stale memories. Zep's temporal graph and Mem0's update-in-place both fix this; if you roll your own, weight by recency or you'll keep retrieving six-month-old context.
  • Don't share user IDs across tenants. All three libraries support per-user namespaces. Use them. Memory leakage between users is a much worse incident than no memory at all.
  • Token-budget the recall step. Even with a memory layer, you can blow your context window by setting top_k=50 for retrieval. Start at top_k=5 and raise it only if relevant memories are missing from recall.
  • Reconcile on conflict. If the user says "I'm vegetarian" in March and "I'm vegan" in May, you need an update strategy. Mem0 handles this automatically; Zep gives you the conflict surface; Cognee leaves it to you.
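The recency-weighting pitfall above has a standard fix: multiply similarity by an exponential decay over memory age. A sketch with an assumed 30-day half-life (the `sim` scores stand in for whatever your embedding model returns):

```python
import time

def recency_score(similarity: float, age_seconds: float,
                  half_life_days: float = 30.0) -> float:
    """Blend semantic similarity with exponential recency decay:
    a memory loses half its weight every half_life_days."""
    decay = 0.5 ** (age_seconds / (half_life_days * 86400))
    return similarity * decay

now = time.time()
memories = [
    {"text": "user prefers dark mode", "sim": 0.90, "ts": now - 180 * 86400},
    {"text": "user switched to light mode", "sim": 0.85, "ts": now - 2 * 86400},
]

# The six-month-old memory is slightly more similar, but decay
# (0.5 ** 6 ≈ 0.016) pushes it far below the two-day-old one.
ranked = sorted(memories,
                key=lambda m: recency_score(m["sim"], now - m["ts"]),
                reverse=True)
```

Tune the half-life to your domain: user preferences drift over weeks, while project facts may stay valid for months.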

Common misconceptions

"RAG and memory are the same thing." They're not. RAG retrieves from a static corpus (docs, codebase). Memory writes new entries based on what the user/agent said and retrieves them later. RAG is read-only; memory is read-write. The patterns in pack/rag-pipelines are different from this pack on purpose.

"I can just use the conversation history." For a 5-turn session, sure. For an app where the same user comes back next week, no — you'd have to feed every prior turn into the prompt forever. Memory extracts the facts and discards the chat.
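"Extracts the facts and discards the chat" is mechanical: run each turn through an extractor and store only what survives. The real libraries do this with an LLM call; the regex patterns below are a deliberately crude stand-in so the sketch stays self-contained:

```python
import re

# Toy stand-in for an LLM-based extractor: only durable, user-level
# statements survive; pleasantries are discarded with the chat.
DURABLE = [r"\bI'm \w+", r"\bI prefer \w+", r"\bI work at \w+"]

def extract_facts(turn: str) -> list:
    return [m.group(0) for pat in DURABLE for m in re.finditer(pat, turn)]

facts = extract_facts("Thanks! By the way, I'm vegetarian and I prefer Python.")
# Only the two durable facts go to long-term memory; the rest of the
# turn is never stored.
```

The storage cost of a year of conversations then scales with the handful of facts per user, not with the turn count.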

"Mem0 vs Zep is a hard choice." Most teams use Mem0 first because it's a 5-minute setup, then graduate to Zep when they need multi-tenant or audit. The two are similar enough that migration is a weekend, not a quarter.

INSTALL · ONE COMMAND
$ tokrepo install pack/agent-memory-layer
Hand it to your agent, or paste it into your terminal.
What's inside

7 assets in this pack

Skill#01
Mem0 — Memory Layer for AI Applications

Add persistent, personalized memory to AI agents and assistants. Mem0 stores user preferences, past interactions, and learned context across sessions.

by Mem0 · 278 views
$ tokrepo install mem0-memory-layer-ai-applications-96da1f40
MCP#02
Codebase Memory MCP — Code Intelligence for AI Agents

High-performance code intelligence MCP server. Indexes repos in milliseconds via tree-sitter AST, supports 66 languages, sub-ms graph queries. MIT, 1,300+ stars.

by MCP Hub · 121 views
$ tokrepo install codebase-memory-mcp-code-intelligence-ai-agents-a3fe5165
MCP#03
Memory MCP — Persistent AI Agent Knowledge Graph

MCP server that gives AI agents persistent memory using a local knowledge graph. Stores entities, relationships, and observations across sessions for Claude Code.

by MCP Hub · 82 views
$ tokrepo install memory-mcp-persistent-ai-agent-knowledge-graph-554c4dc2
Skill#04
Zep — Long-Term Memory for AI Agents and Assistants

Production memory layer for AI assistants. Zep stores conversation history, extracts facts, builds knowledge graphs, and provides temporal-aware retrieval for LLMs.

by MCP Hub · 79 views
$ tokrepo install zep-long-term-memory-ai-agents-assistants-ffde39a9
Skill#05
Cognee — Memory Engine for AI Agents

Cognee adds persistent structured memory to any AI agent in 6 lines of code. 14.8K+ stars. Knowledge graphs, vector stores, LLM integration. Apache 2.0.

by Skill Factory · 82 views
$ tokrepo install cognee-memory-engine-ai-agents-b6ad223f
Prompt#06
AI Agent Memory Patterns — Build Agents That Remember

Design patterns for adding persistent memory to AI agents. Covers conversation memory, entity extraction, knowledge graphs, tiered memory, and memory management strategies.

by Agent Toolkit · 124 views
$ tokrepo install ai-agent-memory-patterns-build-agents-remember-b52189f9
Script#07
Mem0 — Memory Layer for AI Agents

Add persistent, personalized memory to any AI agent. Learns user preferences, adapts context, reduces tokens. 51K+ stars, used by 100K+ devs.

by Mem0 · 137 views
$ tokrepo install mem0-memory-layer-ai-agents-b61fca8c
FAQ

Frequently asked questions

Is Mem0 free?

The Mem0 OSS library is MIT-licensed and free to self-host. They also have a managed cloud option with usage-based pricing if you don't want to run the embedding/vector store yourself. Zep has the same OSS + cloud model. Cognee is fully OSS with no managed option as of mid-2026 — you run it yourself.

Will this work in Cursor / Codex CLI / Windsurf?

The libraries are language-level (Python / Node) so they work with any agent framework, not just Claude Code. The TokRepo CLI installs the right config files for each AI tool. Codex CLI users should pair the memory layer with AGENTS.md instructions; Cursor users embed it in the rule set.

How does Mem0 compare to Zep?

Mem0 is library-first — you import it and call .add()/.search() inline. Zep is service-first — you run a server (Docker), it owns the graph, and your app calls the API. Mem0 wins on time-to-first-memory; Zep wins on multi-tenant, audit, and explicit relationship modeling. Pick Mem0 for prototypes, Zep when you have ops support.
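The "migration is a weekend" claim holds only if your app code doesn't call a vendor SDK directly. A thin interface keeps the backend swappable — `MemoryBackend` and `InMemoryBackend` are hypothetical names for this sketch, and the real Mem0- or Zep-backed classes would live behind the same two methods:

```python
from typing import Protocol

class MemoryBackend(Protocol):
    """App-side contract: code against this, not a vendor SDK."""
    def add(self, user_id: str, text: str) -> None: ...
    def search(self, user_id: str, query: str, top_k: int = 5) -> list: ...

class InMemoryBackend:
    """Dev/test backend; swap in a Mem0- or Zep-backed class in prod."""
    def __init__(self) -> None:
        self._data: dict = {}

    def add(self, user_id: str, text: str) -> None:
        self._data.setdefault(user_id, []).append(text)

    def search(self, user_id: str, query: str, top_k: int = 5) -> list:
        hits = [t for t in self._data.get(user_id, [])
                if any(w in t.lower() for w in query.lower().split())]
        return hits[:top_k]

backend: MemoryBackend = InMemoryBackend()
backend.add("alice", "ships to EU region")
hits = backend.search("alice", "eu shipping")
```

Migrating then means writing one new backend class plus a data backfill, with the rest of the app untouched.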

What's the difference vs the RAG Pipelines pack?

RAG retrieves from a fixed corpus (your docs, your codebase). Memory writes new facts as the agent runs and retrieves them later. RAG is read-only; memory is read-write and accumulates. Most production agents need both: RAG for static knowledge, memory for the user-specific stuff.

When should I NOT add a memory layer?

When sessions are stateless and short — single-shot tasks like 'summarize this PDF' don't benefit from memory and the layer adds latency. Also skip it for purely factual lookup (use RAG instead). Memory layers are worth their cost when the same user comes back, the agent is multi-step, or both.

MORE FROM THE ARSENAL

12 packs · 80+ hand-picked assets

Browse every curated bundle on the home page

Back to all packs