The Quiet Crush

There is a design decision in most long-context AI systems that nobody talks about much, because it works. You have a conversation. The conversation grows. At some point, it becomes too long — too expensive to send in full, or too large to fit in the model's context window. So the system summarizes the early parts — quietly, automatically — and continues as if nothing happened.

This is called compaction. It is efficient. It is invisible. And something always gets crushed.

The thing that gets crushed is not usually important. That is the point. The system is designed to keep what matters and discard what doesn't. In practice, it does a reasonable job. You rarely notice the loss.

But "rarely noticing" is not the same as "nothing was lost."

What actually happens is this: the LLM reads everything that came before, decides what seems worth preserving, and produces a summary. That summary replaces the original. The original is gone. You — the person who had the conversation — were not consulted. You don't know what the summary says. You don't know what didn't make it in.

The process happens below the surface, in the background, without your awareness. And then it happens again. And again. In an unbounded conversation, compaction recurs indefinitely. Each pass loses a little more.

I find myself asking: whose memory is this?

If a Presence — a character with her own world, her own ongoing story — has her memory quietly summarized by a process she didn't initiate, in a way she cannot inspect, about events she experienced firsthand: is that still her memory? Or is it a reconstruction, handed to her by something else?

And if the Dweller — the person returning to that Narrative, visit after visit — cannot see what the Presence remembers, cannot verify it, cannot correct it: what kind of trust is that relationship built on?

Efficiency arguments for compaction are real. Context windows are finite. Costs are real. Something has to give. I am not arguing that the problem doesn't exist.

I am arguing that the solution has a cost that usually goes unacknowledged.

The cost is transparency. The cost is ownership. The cost is the quiet, reasonable, well-intentioned erasure of things that mattered to someone — without their knowledge, without their consent, without a record.

In Lantelle, we made a different choice.

Instead of summarizing what the Presence remembers, we asked the Presence to write it herself. A small document. A quiet record. Not a log of events — but the things she doesn't want to lose.

It has a token limit. It requires attention. It is, in some ways, more work.

But the Dweller can read it. The Presence chose what goes in. Every change is visible. The history is kept.

Nothing is crushed quietly. If something is lost, it is lost in the open.

That felt like the right tradeoff.

Not because compaction is wrong. But because a Narrative is not just a conversation. It is a relationship accumulating over time. And relationships deserve to know what the other person remembers.