What I Want From an AI That Remembers

If you've been using AI coding tools seriously (Claude Code, Codex, Cursor, or any of the others), you've probably run into the same wall I have.

The context window fills up. The model starts to lose the thread. You try to figure out which files to include, which to exclude, how to summarize what happened three sessions ago without losing the details that matter. You start writing things down, not for yourself, but for the AI. Progress files. Archive files. Session summaries. Careful handoff notes from one conversation to the next.

That's been a significant part of how I work, and I've seen the same thing from a lot of smart people I follow and talk to. Friends, builders on Twitter, people I walk with who are deep into this stuff. Nobody invented this. We all arrived at some version of it independently, because the problem is obvious once you hit it. The quality of what you get out of an AI is almost entirely determined by the quality of what you put in.

Anthropic just shipped auto memory in Claude Code this week, and it's the first time I've felt like the platform is catching up to the workarounds we've been building.

Here's the thing that gets missed in most conversations about AI memory: people think about it as a quantity problem. How many tokens? How long is the context window? How much can it hold?

But memory isn't really about quantity. It's about quality, depth, and connection.

Think about how we want our own memories to work, and what we actually mean when we call someone intelligent. When I think about what I'd want from an AI personal assistant, it's not someone who has processed a million documents with shallow recall and no thread connecting them. It's someone who has absorbed a smaller set of things deeply enough to apply lessons from one domain to problems in another, who remembers not just the detail, but the takeaway from the detail. That's what we mean by smart. Knowing what's worth holding onto in a given moment, and then retaining the principle behind it, not just the data point.

That's what intelligent memory looks like. Not more memory, smarter memory.

Current AI isn't there yet. And it's worth being precise about why.

Auto memory (the feature Anthropic just shipped) is a meaningful step in that direction. According to Anthropic's documentation, Claude now maintains its own persistent notes across sessions: project patterns, debugging insights, architecture decisions, your preferences. It writes these to a file it manages for itself, loads them at the start of every session, and adds to them as it works. You can also tell it explicitly what to remember ("we use pnpm, not npm") and it will hold onto that going forward.

I haven't used it long enough to tell you whether it delivers on that fully. What I can tell you is that it describes, almost exactly, what I've been trying to build by hand. The problem it's solving is the same problem I've been solving with PROGRESS.md files, session summaries, and careful context handoffs. If it works as described, that's a big deal.
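For anyone who hasn't built one of these by hand, here is a minimal sketch of the workaround in Python. To be clear, this is my own illustration of the do-it-yourself pattern, not how Claude Code implements auto memory. The PROGRESS.md file name comes from my own habit, and the function names (`load_notes`, `remember`, `build_session_preamble`) are hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Hypothetical notes file, named after the hand-written PROGRESS.md habit.
NOTES_FILE = Path("PROGRESS.md")


def load_notes(path: Path = NOTES_FILE) -> str:
    """Return everything remembered so far, or an empty string if nothing is saved."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def remember(note: str, path: Path = NOTES_FILE, today: Optional[date] = None) -> None:
    """Append one dated note so that a later session can pick it up."""
    stamp = (today or date.today()).isoformat()
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- [{stamp}] {note}\n")


def build_session_preamble(path: Path = NOTES_FILE) -> str:
    """Build the text to paste at the top of a new session."""
    notes = load_notes(path)
    if not notes:
        return "No notes from earlier sessions."
    return "Notes carried over from earlier sessions:\n" + notes
```

Calling `remember("we use pnpm, not npm")` at the end of one session and pasting the output of `build_session_preamble()` at the start of the next is roughly the whole trick. Note what it does not do: it stores data points, and deciding which notes carry a principle worth keeping is still left to me.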

Whether it develops into the kind of intelligent memory I described above, retaining principles, not just preferences, and applying them across contexts, I don't know yet. That's a harder problem. But the foundation is now there, and Anthropic is clearly thinking about this the right way.

I want to give real credit to the Claude Code team for something that should be discussed more than it is. They are exceptional at listening to their users.

Developers using Claude Code have been building workarounds since day one: context handoff files, session summaries, progress logs, memory scripts. Every one of those workarounds is a signal. It's a user saying "I need this badly enough to build it myself." The Claude Code team has been reading those signals and treating them as a design brief. Not every workaround becomes a feature, and not every feature comes from a workaround, but the pattern is real, and it takes genuine skill to identify which user hacks represent something worth building properly.

In the pre-AI era, this feedback loop was slow and lossy. Users would discover workarounds, maybe post about them, maybe file a feature request. Product managers would triage and prioritize. Teams would plan and build. By the time anything shipped, months had passed and the signal had degraded. In cases where this process did work, users were frustrated and had an "it's about time!" response.

What's different now is that while humans are still very much in the loop (there are likely still product decisions, tradeoffs, and priorities being weighed by real people), the pace and the fidelity of the signal are much higher. Developers building on Claude Code are highly vocal, technically sophisticated, and working at the frontier of what the tool can do. Their workarounds are precise articulations of unmet needs. And the team is clearly learning and building from that feedback.

Auto memory is a good example of this loop working the way it was always supposed to. As a user, I spent real time solving a problem by hand. That work wasn't wasted. It was, in a small way, part of a much larger conversation about what the tool needed to become. Seeing it ship as a first-class feature is as good an example of a tight product feedback loop as I've seen in a long time.