
Memory Management Done Right: The Onion Model. Open Letter to Anthropic — Part 2.

Stop making memory complicated. Make it human.

In Part 1 of this letter, I argued that two barriers stand between Claude and its potential as the operating system for our lives: a better mobile experience and memory management. I promised a deep dive into memory. Here it is.

But first, let me set the stage with an honest confession: I talk to engineers about memory and my eyes glaze over in about 45 seconds.

The Memory Zoo

When you start reading about how LLMs and AI agents handle memory today, you immediately run into a zoo of overlapping terminology. Here's a sample of what's currently circulating in academic papers, framework docs, and conference talks:

Currently "in vogue" memory terms for LLMs & agents: Working Memory, Short-Term Memory, Long-Term Memory, Episodic Memory, Semantic Memory, Procedural Memory, Core Memory, Archival Memory, Recall Memory, Agentic Memory, Reflective Memory, Summarized Memory, In-Context Memory, Out-of-Context Memory, Message Buffer.

Some of these come from the CoALA cognitive architecture paper, which borrows from 1980s-era SOAR research. Some come from Letta (formerly MemGPT), which takes an engineering-first approach. Some come from LangChain's LangMem framework. Some from the brand-new A-MEM paper that proposes "agentic memory" using Zettelkasten-style note-linking. And they all slice the same fundamental problem differently.

Here's the thing: the whole point of LLMs is that they help us think the way that we do. These models speak our language. They reason in our patterns. They were trained on human thought. So why on earth did we give them a memory system that requires a PhD in cognitive science to understand?

Deus ex machina. We built a god from the machine — and then fed it an incomprehensible memory-model soup that no human can relate to. Is it any wonder we're running into problems?

And this isn't just an inconvenience anymore. These machines are operating in the world now. They're writing code that ships to production. They're drafting communications on our behalf. They're making recommendations that people act on. A model that is convinced of something — and you have no idea where that conviction came from or how to change it — isn't a tool. It's a bureaucratic alien from outer space. It's every nightmare the sci-fi writers warned us about: not a malevolent AI, but an opaque one. A system so sure of itself, so impervious to correction, that the humans in the room feel helpless. You're not fighting Skynet. You're fighting a HAL 9000 DMV clerk who insists you still own a car you sold three years ago — and there's no appeals office, no special judge, no petition you can file. Just a machine that's sure of itself and a human who can't reach the record that opens the pod bay doors.

To quote The Who, "And in the battle on the streets / You fight computers and receipts." Nobody likes dysfunctional bureaucracy. It’s pure pain. We need to stop creating more pain. And to do that, we need to build a god that thinks like we do.

This is urgent. Not because the machines are evil, but because they're increasingly confident, increasingly autonomous, and increasingly impossible to debug. We need to fix memory — not as an academic exercise, but as a matter of safety, trust, and basic human dignity in our relationship with these systems.

We lost our way. We made it too complicated. And the fix isn't more categories — it's fewer layers with clear boundaries.

Exhibit A: Watch It Happen in Real Time

I don't need to go far to illustrate the problem. It just happened to me, while writing this very newsletter.

I was working with Claude on this piece. I told it I would have a hand-drawn onion diagram — because I draw all my own diagrams. I've been doing it for years. I have a specific visual style. Anyone who has read Part 1 has seen my sketches. It's an integral part of how I communicate.

Claude's response? It built me an elaborate SVG onion — complete with a little sprout on top and dashed concentric ellipses — and embedded it in the newsletter draft.

Let me be clear about what happened here: Claude didn't know who I am. Not because it can't — because it has no structured way to retain that kind of identity-level context. The information was right there in the conversation. I literally said, "I will have a hand-drawn onion diagram." But the model pattern-matched on "include a diagram" and did what it does by default: it made one for me.

This is the memory problem in miniature. The model doesn't distinguish between "what does this person need right now" and "who is this person fundamentally." It treats every conversation like a first date. And on a first date, if someone says "include a diagram," sure, maybe you make one. But if you've been working with someone and you know they're a designer-turned-AI-product-strategist who hand-draws everything? You leave a placeholder and move on.

That one small failure — making a diagram I didn't ask for — is exactly why we need a better memory architecture.

The Onion Model: Three Layers. That's It.

I want to propose something radically simpler. A memory architecture that maps to how people actually organize information in their heads — without needing to know what "episodic" means or how it differs from "semantic."

Three layers. Like an onion:

[Diagram: The Onion Memory Management Model. Copyright Greg Nudelman]

Layer 1: Situational Identity (the core, the slowest to change)

This is the "who am I" layer. My values, my history, my communication style, my personal beliefs, my way of working. For me: I'm a UX practitioner with 16 years in AI. I draw my own diagrams. I write in a specific voice. I prefer directness. I hold 24 patents.

This layer changes maybe a few times over the course of a lifetime. It's the bedrock. When Claude knows my identity layer, it doesn't build me an unsolicited SVG. It knows better.

One critical nuance: this isn't a universal identity — it's a situational one. Not "Everything We Know About Greg," but "Greg the Writer," "Greg the Product Leader," "Greg the Coder." I described this in Part 1 as personality switching — your email assistant is concise and professional, your article writer has your voice and style, your code assistant is terse and never waxes poetic. Each situational identity is a focused slice, small enough to fit comfortably in a context window and specific enough to be genuinely useful. This concept isn't even new — it goes back to the intents model from early AI/ML frameworks like the Microsoft Bot Framework I was building with 10 years ago. The idea that a system should know which version of you it's talking to, and behave accordingly, has been around for years. We just forgot to bring it along as we marched forward into the bright AI overlord future.

Layer 2: Project-Level Memory (the middle ring, changes at the pace of work)

This is the "what are we building and why" layer. The scope, the constraints, the decisions made so far, the open questions, the resources, the timeline, the priorities. (For this newsletter: it's Part 2 of an open letter to Anthropic about memory management. It references Part 1. It has a specific tone — approachable, not academic. It needs to include the list of confusing memory terms and propose a simpler model.)

This is where Anthropic's Projects feature was headed, I think. The foundation is there. What's missing is the bridge — the ability to take what happened across five conversations and distill it into a living project record that the next conversation can build on. And because I can't look into the project-level memory and see if Claude is on the right track — or fix anything that's changed — the context drifts silently until something breaks.

This layer updates maybe once a week. It's a living document. Crucially, conversation summaries roll up into this layer — after each working session, the key decisions and new information get absorbed here. So it's always accumulating context, but at a digested level, not a raw transcript.

Layer 3: Conversation (the outer skin, changes in real time)

This is the "what's happening right now" layer. Today's date. What we opened with. What assumptions we walked in with. Who's in the room. What's been decided, what's still open. What time is it? What day is it? What just happened five minutes ago?

This is ephemeral by nature — but it travels. It carries forward from one conversation to the next as a summary, and it feeds upward into the project-level memory. Think of it like a meeting: when the meeting ends, the transcript doesn't become the project record — but the key decisions and action items do. The conversation layer is the meeting. The project layer is the running project brief that gets updated after each one.
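The three layers, and the roll-up from conversation to project, can be sketched as plain data. This is a minimal illustration of the model described above, not a real Anthropic API; every name in it (`OnionMemory`, `roll_up`, the field names) is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class OnionMemory:
    # Layer 1: slow-changing, situational ("Greg the Writer")
    identity: dict = field(default_factory=dict)
    # Layer 2: the living project brief, updated at the pace of work
    project: dict = field(default_factory=dict)
    # Layer 3: ephemeral, real-time; cleared when the session ends
    conversation: list = field(default_factory=list)

    def roll_up(self, summary: dict) -> None:
        """End a session: absorb key decisions and open questions into the
        project layer, then discard the raw transcript — the meeting ends,
        but the project brief keeps the action items."""
        self.project.setdefault("decisions", []).extend(summary.get("decisions", []))
        self.project.setdefault("open_questions", []).extend(summary.get("open_questions", []))
        self.conversation.clear()

memory = OnionMemory(
    identity={"role": "writer", "style": "hand-drawn diagrams, direct voice"},
)
memory.conversation.append("Discussed Part 2 structure; dropped the Part 3 teaser.")
memory.roll_up({"decisions": ["Drop the Part 3 teaser"]})
```

Note the asymmetry: the transcript is disposable, but what it *decided* survives in the layer above — digested context, not raw history.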

Why This Works: Three Properties the Current Taxonomy Ignores

Inspectability. You can look at any layer and see exactly what's in it. No black box. No "the model internalized something from 10 days ago and now it's behaving weirdly and I can't figure out why." You dump the memory, you read it, you understand what the model thinks it knows. This is huge for trust.

Mutability. You can reach into any layer and change it. If the project priorities shifted yesterday, you update Layer 2 and the model immediately operates from the new reality. No fighting with stale context. No "but you said 10 days ago..." when the world has moved on. The human stays in control of what the model believes.

Temporal coherence. Here's what I mean by this, because it's the one that solves the most day-to-day frustration. Right now, the model has no sense of when something is relevant. It treats a passing comment you made two weeks ago with the same weight as an explicit instruction you gave two minutes ago. It's as if you were talking to a colleague who couldn't tell the difference between a sticky note from last month and a direct order from this morning — they're both just "things you said," floating in the same undifferentiated soup.

The onion model fixes this structurally. Each layer has its own natural clock. The conversation layer is inherently "right now" — it knows today's date, the current context, what just happened. The project-level memory is inherently "recent history" — it's the accumulation of this week's and this month's decisions. The identity layer is inherently "enduring" — it's the stuff that's true about you year after year. When information lives in the right layer, the model always knows how fresh it is and how much weight to give it. No ambiguity. No conflation. No treating a two-week-old offhand remark like a constitutional amendment.
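One way to make "each layer has its own natural clock" concrete is to give each layer its own decay rate. The sketch below is an assumption about how that could work — the half-life values are invented for illustration — but it shows the structural point: the same fourteen-day-old fact is nearly weightless in the conversation layer and essentially full-strength in the identity layer.

```python
# Hypothetical per-layer half-lives, in days. These numbers are
# illustrative assumptions, not a proposed spec.
HALF_LIFE_DAYS = {
    "conversation": 1,    # "right now": stale within a day
    "project": 30,        # "recent history": fades over a month
    "identity": 3650,     # "enduring": stable for years
}

def relevance_weight(layer: str, age_days: float) -> float:
    """Exponential decay with a per-layer half-life: how much weight
    a memory of a given age should carry, given where it lives."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[layer])

# A two-week-old conversational aside vs. a two-week-old identity fact:
aside = relevance_weight("conversation", 14)   # vanishingly small
trait = relevance_weight("identity", 14)       # still near 1.0
```

The model never has to guess whether something is a sticky note or a standing order; the layer it lives in answers that question.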

How It Should Actually Work

Imagine picking up your phone and talking to Claude the way you'd talk to a trusted colleague who has been on your team for months:

  1. "What do you know about me?" — Claude pulls up your situational identity layer. You can see it. You can read every line. If something's wrong — if it thinks you like bullet points when you actually hate them, or if it's forgotten that you draw your own diagrams — you change it right there.

  2. "What's the status of the newsletter project?" — Claude pulls up the project-level memory. Here's the scope. Here's what you decided last session. Here are the open questions. You scan it, correct one thing, add a note about a new angle you thought of on your morning walk, and move on.

  3. "OK, let's pick up where we left off." — Claude loads the conversation layer from your last session. It knows what you opened with, what you resolved, what's still dangling. It also knows today's date, that it's Monday morning, and that the world may have shifted since Friday afternoon.

  4. "Add this to project memory: we're dropping the Part 3 teaser." — You tell it which layer to update. It updates. No ambiguity about where that information lives or how long it should persist.

  5. "Forget that, it's outdated." — It's gone. Not buried under six layers of vector embeddings. Gone. Because you said so, and you're the human.

All three layers working together. The model knows who you are, what you're building, and what's happening right now. You can see everything it knows. You can change anything it knows. And it always knows what time it is.
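The five interactions above reduce to three explicit operations on a visible store: read a layer, write to a layer, delete from a layer. This is a sketch under assumed names (`MemoryStore`, `show`, `update`, `forget` are all hypothetical), meant only to show how thin that interface could be.

```python
class MemoryStore:
    """Every layer is plain, readable data — no black box."""

    def __init__(self):
        self.layers = {"identity": {}, "project": {}, "conversation": {}}

    def show(self, layer: str) -> dict:
        # "What do you know about me?" — returns the layer verbatim,
        # so the human can read every line.
        return dict(self.layers[layer])

    def update(self, layer: str, key: str, value: str) -> None:
        # "Add this to project memory: ..." — the human names the layer,
        # so there's no ambiguity about where the fact lives.
        self.layers[layer][key] = value

    def forget(self, layer: str, key: str) -> None:
        # "Forget that, it's outdated." — gone, not buried under
        # six layers of vector embeddings.
        self.layers[layer].pop(key, None)

store = MemoryStore()
store.update("project", "part3_teaser", "include")
store.update("project", "part3_teaser", "dropped")  # human corrects the record
store.forget("project", "part3_teaser")             # human deletes the record
```

The design choice worth noticing: `forget` is a hard delete, not a down-weighting. Mutability only builds trust if the human's correction is final.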

A Note to Anthropic

I said in Part 1 that I'm a huge fan. That hasn't changed. The creators of MCP — the Model Context Protocol, which is already reshaping how AI systems connect to the outside world — are exactly the right people to tackle this problem. You've already proven you can build open, elegant infrastructure that the whole ecosystem adopts.

Now do the same thing for memory.

The industry took a concept that every human intuitively understands — I know who I am, I know what I'm working on, and I know what's happening right now — and shattered it into 15 overlapping academic categories that nobody can keep straight. The fix isn't more categories. It's fewer layers with clear boundaries, clear temporal cadences, and a user who can see and touch every layer.

Give the machine a memory that works like the humans it's supposed to help.

It's layered. Like an onion.

The author has spent 16 years in AI, contributed to 34 products, and holds 24 patents in the field.

If you know anyone at Anthropic, please send this their way.
