Context Is Not Memory

inspiration | devinfo.dev | May 23, 2026 | devinfo.dev:2026.0002

A large context window does not make an LLM remember. It makes it attend. The distinction changes how you build.

Developers routinely confuse context with memory. They are not the same thing.

Memory implies persistence: something stored, retrievable, and updated over time. A context window is none of these. It is a snapshot: a fixed slice of tokens the model attends to during a single inference call. When the call ends, it is gone.

This matters because it changes what you should build.

If you treat a long context window as memory, you build systems that fail silently. The model appears to "remember" across turns until it does not: the conversation grows long enough that early turns are truncated, or the chunk you needed falls outside the window.
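The silent failure is easy to reproduce. The sketch below is a minimal illustration, not any particular framework's behavior: `fit_to_window` is a hypothetical helper that keeps the newest turns that fit, and whitespace-separated words stand in for tokens. Returning the dropped turns alongside the kept ones is what makes the truncation visible instead of silent.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def fit_to_window(turns: list[str], window: int) -> tuple[list[str], list[str]]:
    """Keep the newest turns that fit in `window`; return (kept, dropped)."""
    kept: list[str] = []
    used = 0
    # Walk backwards from the newest turn, as a simple sliding window would.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > window:
            break
        kept.append(turn)
        used += cost
    kept.reverse()
    dropped = turns[: len(turns) - len(kept)]
    return kept, dropped


turns = [
    "user: my account id is 4471",
    "assistant: noted, thanks",
    "user: what plan am I on",
    "assistant: you are on the basic plan",
    "user: please repeat my account id",
]
kept, dropped = fit_to_window(turns, window=16)
# The turn that stated the account id is now in `dropped`, not `kept`.
```

With this toy window, the final question asks for an account id the model can no longer see. Logging or asserting on `dropped` is one way to surface that before the model answers.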

The correct mental model: the context window is a desk, not a filing cabinet. Everything the model knows about your task at inference time, beyond what is baked into its weights, must be placed on that desk before it reasons. You decide what goes on the desk. That is the job.
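Deciding what goes on the desk can be sketched as a selection problem under a token budget. The example below is one possible approach, assuming a hypothetical `Item` type with a hand-assigned `priority` and a word count as a stand-in for tokens; a real system would use the model's tokenizer and its own relevance scoring.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    priority: int  # hypothetical score: higher means more important


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def build_desk(items: list[Item], budget: int) -> list[Item]:
    """Greedy selection: highest priority first, skipping items that do not fit."""
    chosen: list[Item] = []
    remaining = budget
    # sorted() is stable, so items with equal priority keep their input order.
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        cost = count_tokens(item.text)
        if cost <= remaining:
            chosen.append(item)
            remaining -= cost
    return chosen


items = [
    Item("You are a billing support assistant", 3),
    Item("Account 4471 is on the basic plan", 2),
    Item("We chatted about the weather earlier today", 1),
    Item("Which plan is account 4471 on", 3),
]
desk = build_desk(items, budget=20)
```

Here the instructions, the question, and the relevant account fact make it onto the desk, while the small talk is left in the filing cabinet. The filing cabinet itself (a database, a vector store, a log) lives outside the model.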

This reframes the engineering problem: the question is no longer how much the window can hold, but what you choose to put in it.

The models are getting better at using long contexts efficiently. But the ceiling on what fits, and on what the model actually attends to within that context, is not solved. Important information buried in the middle of a 128K-token context is not guaranteed attention. This is called lost-in-the-middle, and it is a documented failure mode.
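One common mitigation for lost-in-the-middle is to order context so the most relevant pieces sit at the start and end, with the least relevant in the middle. The sketch below assumes the input is already ranked most-relevant first; `place_at_edges` is a hypothetical helper, and the reordering is a heuristic that improves the odds of attention rather than guaranteeing it.

```python
def place_at_edges(ranked: list[str]) -> list[str]:
    """Reorder a most-relevant-first list so the best items sit at both ends."""
    front: list[str] = []
    back: list[str] = []
    for i, item in enumerate(ranked):
        # Alternate: even ranks fill from the front, odd ranks from the back.
        if i % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    # Reverse `back` so the second-best item ends up in the last position.
    return front + back[::-1]


ranked = ["A", "B", "C", "D", "E"]  # "A" is most relevant, "E" least
ordered = place_at_edges(ranked)
# ordered == ["A", "C", "E", "D", "B"]
```

The two strongest items land at the edges and the weakest lands in the middle, which is where a miss costs the least.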

Build as if context is scarce, even when it is not. The discipline makes your system faster, more predictable, and easier to debug.

The desk metaphor holds: a cluttered desk slows thinking. So does a cluttered context.
