Letta
Also known as: Letta framework, MemGPT successor, stateful agent framework
Letta is an open-source framework for building AI agents with persistent memory, letting them remember conversations, learn from past interactions, and maintain context across sessions instead of resetting each time.
What It Is
Most large language models start every conversation from scratch. They have no memory of you, your project, your last question, or what you decided yesterday. For a chatbot answering one-off questions, that reset is fine. For an agent meant to assist over weeks of work, it falls apart. Letta exists to close that gap by giving agents a persistent, structured memory they carry across sessions.
Letta is an open-source framework for building stateful AI agents. It originated from the MemGPT research paper out of UC Berkeley in 2023, which proposed treating an LLM more like an operating system that swaps information in and out of its limited context window. The Letta project turned that idea into a usable runtime that developers can deploy.
The memory model is structured rather than free-form. Each agent owns a set of memory blocks — short, editable text sections for things like the user’s identity, ongoing project state, or current goals. The agent can read these blocks, rewrite them through tool calls, and decide what is worth keeping. Older context that no longer fits in the prompt gets archived to long-term storage, which the agent can search later when relevant. The LLM itself stays stateless; the framework manages what flows in and out.
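To make the mechanics concrete, here is a minimal sketch in plain Python. This is illustrative only, not Letta's internal code: blocks are labeled text sections inlined into every prompt, the agent rewrites them through a tool call, and displaced text lands in a searchable archive.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryBlock:
    label: str        # e.g. "user_profile" or "current_task"
    value: str        # short, editable text the agent maintains
    limit: int = 500  # character budget before content must be condensed

@dataclass
class AgentMemory:
    blocks: dict[str, MemoryBlock] = field(default_factory=dict)
    archive: list[str] = field(default_factory=list)  # long-term, searchable store

    def compose_prompt(self, system: str) -> str:
        """Inline every block into the system prompt; the LLM itself stays stateless."""
        sections = [f"<{b.label}>\n{b.value}\n</{b.label}>" for b in self.blocks.values()]
        return system + "\n\n" + "\n\n".join(sections)

    def rewrite_block(self, label: str, new_value: str) -> None:
        """Exposed to the agent as a tool call: replace a block's text in place.
        The displaced text goes to the archive so nothing is silently lost."""
        block = self.blocks[label]
        self.archive.append(f"[{label}] {block.value}")
        block.value = new_value[: block.limit]

    def search_archive(self, query: str) -> list[str]:
        """Naive substring search standing in for Letta's vector-backed recall."""
        return [entry for entry in self.archive if query.lower() in entry.lower()]

memory = AgentMemory(blocks={
    "user_profile": MemoryBlock("user_profile", "Name: Ada. Role: data engineer."),
    "current_task": MemoryBlock("current_task", "Migrating ETL jobs to Airflow."),
})
memory.rewrite_block("current_task", "Airflow migration done; now tuning DAG schedules.")
print(memory.compose_prompt("You are a project assistant."))
print(memory.search_archive("etl"))
```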
Letta also keeps the entire agent state in a database, so a session can be paused, inspected, and resumed without losing what the agent knows. That makes agent behavior auditable: you can open the memory and read exactly what the system thinks it knows about a user or task — a property that matters when an agent is meant to operate over long horizons in the broader category of agent memory systems.
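Against the letta-client Python SDK, the basic loop looks roughly like the sketch below. The parameter names (memory_blocks, model, embedding) and method paths follow the published quickstart shape but should be treated as assumptions that can shift between SDK versions.

```python
from letta_client import Letta  # pip install letta-client

# Connect to a self-hosted Letta server (default local port per the docs).
client = Letta(base_url="http://localhost:8283")

# Create an agent with its memory blocks seeded up front.
agent = client.agents.create(
    model="openai/gpt-4o-mini",                 # model handle format assumed
    embedding="openai/text-embedding-3-small",  # embedding handle format assumed
    memory_blocks=[
        {"label": "human", "value": "Name: Ada. Role: data engineer."},
        {"label": "persona", "value": "Concise project assistant."},
    ],
)

# Message the agent; all resulting state is persisted server-side under agent.id.
client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "We chose Postgres yesterday; keep that in mind."}],
)

# Audit what the agent believes by reading its blocks back
# (method path is an assumption; check the SDK reference for your version).
for block in client.agents.blocks.list(agent_id=agent.id):
    print(block.label, "->", block.value)
```

Because the server owns the state, a later session needs only the stored agent.id; nothing about the conversation has to live in your application process.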
How It’s Used in Practice
The most common entry point is a developer building a chatbot or assistant that needs to recognize returning users. Instead of asking the same questions every session — name, role, preferences, current project — the agent stores those facts in its memory blocks and pulls them automatically when the user comes back. Over weeks of use, the assistant builds a working picture of the person it serves.
Beyond simple recall, teams use Letta for longer-running agents: research assistants that follow a topic across many sittings, customer-support bots that retain ticket history, or internal tools that learn a team’s vocabulary and conventions. Because the memory is structured and editable, developers can also seed it — pre-loading product knowledge, escalation rules, or persona guidelines before the agent ever talks to a user.
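A common application pattern is a small lookup layer that maps your own user IDs to persistent agents. The get_or_create_agent helper and the registry mapping below are hypothetical application code, and the SDK calls carry the same version caveat as above:

```python
def get_or_create_agent(client, user_id: str, registry: dict[str, str]) -> str:
    """Hypothetical helper: one persistent Letta agent per end user.
    `registry` stands in for a database table mapping user IDs to agent IDs."""
    if user_id in registry:
        return registry[user_id]  # returning user: reuse the stateful agent
    agent = client.agents.create(
        model="openai/gpt-4o-mini",
        embedding="openai/text-embedding-3-small",
        memory_blocks=[
            # Seed persona and escalation rules before the first conversation.
            {"label": "persona", "value": "Support agent for Acme; escalate billing disputes."},
            {"label": "user_profile", "value": f"User id: {user_id}. No details learned yet."},
        ],
    )
    registry[user_id] = agent.id
    return agent.id
```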
Pro Tip: Be explicit about what your agent should NOT remember. Memory blocks fill up fast, and an agent that hoards every detail becomes slow and noisy. Define a small number of well-named blocks (user_profile, current_task, preferences) and let the agent rewrite them in place rather than appending forever. Treat memory like a working spec, not a journal.
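One way to hold that line in application code is a small write guard, sketched below. This is illustrative glue, not a built-in Letta feature: it whitelists block labels and enforces hard budgets so every write is a rewrite, never an unbounded append.

```python
# Fixed set of block labels with hard character budgets.
ALLOWED_BLOCKS = {"user_profile": 400, "current_task": 600, "preferences": 300}

def validated_rewrite(label: str, new_value: str) -> str:
    """Guard for memory writes: known blocks only, rewrite in place, no unbounded growth."""
    if label not in ALLOWED_BLOCKS:
        raise ValueError(f"unknown memory block {label!r}; fold the fact into an existing block")
    if len(new_value) > ALLOWED_BLOCKS[label]:
        raise ValueError(f"{label} exceeds {ALLOWED_BLOCKS[label]} chars; condense, don't append")
    return new_value  # becomes the block's full new text, replacing the old value
```

Wiring a guard like this into the agent's memory-editing tool keeps blocks readable and keeps prompt size predictable over long horizons.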
When to Use / When Not
| Scenario | Use Letta? |
|---|---|
| Building an assistant that needs to remember user preferences across sessions | ✅ |
| One-off Q&A chatbot where every session is independent | ❌ |
| Long-running research or coding agent that follows a project for weeks | ✅ |
| Strict data residency setup with no capacity to host an agent service | ❌ |
| Internal tool that should learn team vocabulary and style over time | ✅ |
| Quick prototype where you only need a single short conversation | ❌ |
Common Misconception
Myth: Letta is just a vector database wrapped around a chat interface. Reality: Letta is a stateful agent runtime. Vector search is one piece, but the framework also manages structured memory blocks the agent itself can edit, full agent state persistence, and the logic for deciding what stays in the active prompt versus what gets archived. The memory is something the agent operates on, not a passive lookup store.
One Sentence to Remember
Letta gives an AI agent something models alone do not have — a memory it can read, edit, and carry forward — and that single shift turns a stateless chatbot into something closer to a colleague who actually remembers what you said last week.
FAQ
Q: What is the difference between Letta and MemGPT? A: MemGPT is the original research paper describing how an agent can manage its own memory like an operating system. Letta is the open-source framework that productized those ideas into a usable runtime for building stateful agents.
Q: Does Letta work with any large language model? A: Letta is model-agnostic and supports leading commercial LLMs as well as open-weight models. You choose which model powers the agent; Letta handles the memory, state, and tool orchestration around it.
Q: Is Letta free to use? A: Letta is open-source and self-hostable, so the framework itself can be run at no cost. A managed cloud option exists for teams that prefer not to operate their own infrastructure, with tiered pricing available.
Expert Takes
The interesting principle here is that memory in agents is not the same as model parameters. Letta separates the two cleanly — the LLM stays stateless, while a structured store holds facts the agent can read, edit, and reorganize. This matters because it makes the system inspectable. You can open the memory and see what the agent knows. With weights alone, you cannot.
Treat Letta as a runtime that turns memory into part of the spec. You describe what the agent should remember, what it should forget, and how memory blocks compose into the prompt. The framework then keeps that contract intact across sessions. The fix for most stateful-agent failures is usually clearer specification of memory rules, not more storage capacity.
Persistent memory is becoming table stakes for any serious agent product. Customers expect tools that remember preferences, history, and context. Letta sits in the open-source layer of that emerging stack, giving builders a way to ship stateful agents without locking into a closed vendor. The teams that get memory right early will own the user relationship later.
Persistent memory raises a quiet question: who owns the record of what an agent knows about you? When a system stores your habits, preferences, and past mistakes, the line between assistant and surveillance gets thinner. Letta exposes those memory blocks for inspection, which is honest. The harder question is whether users will ever read them, or even know they exist.