Context Is Not Memory

Developers routinely confuse context with memory. They are not the same thing.

Memory implies persistence: something stored, retrievable, and updated over time. A context window is none of these. It is a snapshot, a bounded sequence of tokens the model can attend to during a single inference call. When the call ends, it is gone.

This matters because it changes what you should build.

If you treat a long context window as memory, you build systems that silently fail. The model appears to "remember" across turns, but only because the application resends the earlier turns with every request. That illusion holds until the conversation grows too long, or the chunk you needed falls outside the window.
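A minimal sketch of that failure, assuming a chat application that keeps only the most recent turns that fit. The names fit_to_window and count_tokens are hypothetical, and whitespace-separated words stand in for real tokenizer tokens.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())


def fit_to_window(turns: list[str], window: int) -> list[str]:
    """Keep the most recent turns whose combined token count fits the window."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > window:
            break  # this turn and everything older is dropped without warning
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "user: my account id is 4471",
    "assistant: thanks, noted",
    "user: what plans do you offer",
    "assistant: we offer basic and pro",
    "user: which account id did I give you",
]

# With a 20-token window, only the last three turns survive.
visible = fit_to_window(history, window=20)
remembers_id = any("4471" in turn for turn in visible)
```

Nothing raises an error here. The turn containing the account id simply never reaches the model, and the final question can no longer be answered from the prompt.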

The correct mental model: the context window is a desk, not a filing cabinet. Everything the model knows during inference, beyond what is baked into its weights, must be placed on that desk before it reasons. You decide what goes on the desk. That is the job.

This reframes the engineering problem:

Retrieval is not optional. It is how you manage what gets placed on the desk.
Summarization is not a convenience. It is how you reclaim desk space.
Length is not free. In standard self-attention every token attends to every other token, so compute grows quadratically with context length and long contexts slow inference.
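The three points above can be sketched together. This is an illustration under stated assumptions, not a production retriever: relevance is a toy word-overlap score, first_sentence is a stand-in for a real summarizer, tokens are approximated by whitespace-separated words, and every function name is hypothetical.

```python
import re


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance(query: str, passage: str) -> int:
    # Toy retrieval score: number of distinct words shared with the query.
    return len(words(query) & words(passage))


def first_sentence(passage: str) -> str:
    # Stand-in summarizer: keep only the first sentence.
    return passage.split(". ")[0].rstrip(".") + "."


def assemble_context(query: str, passages: list[str], budget: int) -> list[str]:
    """Place the most relevant passages on the desk, shrinking any that do not fit."""
    ranked = sorted(passages, key=lambda p: relevance(query, p), reverse=True)
    chosen: list[str] = []
    used = 0
    for passage in ranked:
        if relevance(query, passage) == 0:
            continue  # irrelevant text is clutter, so leave it off
        # Try the full passage first, then its summary to reclaim space.
        for candidate in (passage, first_sentence(passage)):
            cost = count_tokens(candidate)
            if used + cost <= budget:
                chosen.append(candidate)
                used += cost
                break
    return chosen


def attention_pairs(n_tokens: int) -> int:
    # Full self-attention scores every token against every token.
    # Causal masking roughly halves this count, but growth stays quadratic.
    return n_tokens * n_tokens


passages = [
    "Refunds are issued within 14 days. Contact billing with the order number and the original payment method.",
    "Shipping takes 3 to 5 business days. Express delivery is available in some regions.",
    "Our office is closed on public holidays.",
]

context = assemble_context("days for refunds or shipping", passages, budget=25)
growth = attention_pairs(2000) / attention_pairs(1000)
```

With a 25-token budget the first passage goes in whole, the second goes in as its first sentence only, and the third is left off. Doubling the context from 1,000 to 2,000 tokens quadruples the number of attention pairs.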

Models are getting better at using long contexts efficiently. But two limits remain unsolved: how much fits, and how reliably the model uses what is there. Important information buried in the middle of a 128K-token context is not guaranteed attention. This is called lost-in-the-middle, and it is a failure mode documented by Liu et al.
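One mitigation sometimes suggested is to order retrieved passages so the strongest sit at the start and end of the prompt and the weakest fall in the middle. The sketch below assumes the passages are already ranked best-first. The function name place_at_edges is hypothetical, and whether the reordering helps depends on the model, so treat it as something to measure rather than a guarantee.

```python
def place_at_edges(ranked: list[str]) -> list[str]:
    """Reorder best-first passages so the strongest sit at the start and end."""
    front: list[str] = []
    back: list[str] = []
    for i, passage in enumerate(ranked):
        # Alternate: even ranks fill from the front, odd ranks from the back.
        if i % 2 == 0:
            front.append(passage)
        else:
            back.append(passage)
    return front + back[::-1]


# "A" is the most relevant passage, "E" the least.
ranked = ["A", "B", "C", "D", "E"]
ordered = place_at_edges(ranked)
```

Here the best passage leads, the second-best closes, and the weakest lands in the middle, where a missed detail costs the least.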

Build as if context is scarce, even when it is not. The discipline makes your system faster, more predictable, and easier to debug.

The desk metaphor holds: a cluttered desk slows thinking. So does a cluttered context.

References

Liu, N.F., Lin, K., Hewitt, J., Paranjape, A., Bevilacqua, M., Petroni, F., & Liang, P. (2024). "Lost in the Middle: How Language Models Use Long Contexts." Transactions of the Association for Computational Linguistics, 12, 157-173. https://arxiv.org/abs/2307.03172
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). "Attention Is All You Need." NeurIPS 2017. https://arxiv.org/abs/1706.03762
Anthropic. (2024). "Long Context Prompting for Claude." Anthropic Documentation.