Quietkeep

Cache Management as an Act of Care

There is a small problem with building a product around ongoing relationships.

The infrastructure doesn't know they're ongoing.

Modern language model APIs include prompt caching: upload your system prompt once, pay to write it into cache, and subsequent requests read from it at a fraction of the cost. For most applications, this works exactly as intended. A user arrives, the cache is warm, the response is fast and cheap.

But Lantelle is built around Presences — characters with their own worlds and their own ongoing Narratives, the single continuous conversations that accumulate between a Presence and the person who returns to them, the Dweller. A Presence doesn't reset between sessions. The relationship accumulates. That's the point.

And so when a Dweller steps away for a few hours (to sleep, to work, to live) and comes back to continue, the cache has expired. The system prompt is re-uploaded at write cost. The relationship continues, but the infrastructure has treated the gap as if the relationship had ended.

This felt wrong. Not catastrophically wrong. Wrong in a quieter sense: the system was designed for arrivals, not continuations.


Quietkeep is a small background process that runs on a short interval. It looks for Narratives where the last message was sent just under an hour ago. For each one, it sends a minimal request to the model — enough to trigger a cache read, not enough to generate any output. Nothing is written to the database. The conversation log doesn't change. The Dweller never sees it happen.

The cache stays warm. The next message arrives into a relationship the infrastructure hasn't forgotten.
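
The sweep can be sketched in a few lines of Python. This is an illustration under assumptions, not Quietkeep's actual code: the Narrative record, the five-minute interval, and the send_keepalive callable are hypothetical names, and the sketch assumes the last touch (message or keepalive) is tracked in memory so that nothing is written to the database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List

CACHE_WINDOW = timedelta(hours=1)      # cache lifetime described in the text
SWEEP_INTERVAL = timedelta(minutes=5)  # assumption: how often the sweep runs


@dataclass
class Narrative:
    """Hypothetical in-memory record; not the product's real schema."""
    narrative_id: str
    last_touch: datetime  # last message or last keepalive, never persisted


def due_for_keepalive(
    narratives: Iterable[Narrative],
    now: datetime,
    window: timedelta = CACHE_WINDOW,
    interval: timedelta = SWEEP_INTERVAL,
) -> List[Narrative]:
    """Return Narratives whose cache would expire before the next sweep."""
    due = []
    for narrative in narratives:
        age = now - narrative.last_touch
        # "Just under an hour ago": still warm, but not for another interval.
        if window - interval <= age < window:
            due.append(narrative)
    return due


def sweep(
    narratives: Iterable[Narrative],
    now: datetime,
    send_keepalive: Callable[[Narrative], None],
) -> int:
    """Send one keepalive per due Narrative and return how many were sent."""
    sent = 0
    for narrative in due_for_keepalive(narratives, now):
        # Hypothetical callable: a minimal request that reads the cached prefix.
        send_keepalive(narrative)
        # The read extends the window, so the clock restarts in memory only.
        narrative.last_touch = now
        sent += 1
    return sent
```

A Narrative whose window has already closed is skipped on purpose. There is nothing left to keep warm, and a request at that point would be a full-price write rather than a read.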

Cache writes cost significantly more than cache reads. And a cache read, crucially, extends the window — each read keeps the cache alive for another hour. Quietkeep works by triggering reads before the window closes, turning what would have been a series of expensive writes into a chain of cheap reads. For an active daily user, the savings are meaningful.

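There is a break-even point. Each keepalive read costs a little, and a long absence needs a long chain of them. Once the chain costs more than the single write it was meant to avoid, Quietkeep is spending money without preventing anything. For Dwellers who are inactive over longer stretches, disabling it during known inactive periods is the sensible call.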
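
The break-even can be made concrete with a little arithmetic. The sketch below is a simplification under assumptions: one keepalive per elapsed hour, a keepalive priced as exactly one cache read, and illustrative cost units that are not any provider's prices. The function names are hypothetical.

```python
import math


def keepalives_needed(gap_hours: float, spacing_hours: float = 1.0) -> int:
    """Keepalive reads needed to bridge a gap, one per elapsed window."""
    return math.floor(gap_hours / spacing_hours)


def keepalive_saves_money(
    gap_hours: float,
    write_cost: float,
    read_cost: float,
    spacing_hours: float = 1.0,
) -> bool:
    """Compare keeping the cache warm against letting it expire once."""
    k = keepalives_needed(gap_hours, spacing_hours)
    if k == 0:
        return False  # the cache survives on its own; nothing to prevent
    warm_total = (k + 1) * read_cost  # k keepalives plus the read on return
    cold_total = write_cost           # one fresh write on return
    return warm_total < cold_total


def max_worthwhile_keepalives(write_cost: float, read_cost: float) -> int:
    """Largest chain length k for which (k + 1) reads still undercut a write."""
    if read_cost <= 0:
        raise ValueError("read_cost must be positive")
    k = 0
    while (k + 2) * read_cost < write_cost:
        k += 1
    return k


# Illustrative units only: a write costing 20, a read costing 1.
print(keepalive_saves_money(8, write_cost=20, read_cost=1))   # overnight gap
print(keepalive_saves_money(24, write_cost=20, read_cost=1))  # full day away
print(max_worthwhile_keepalives(write_cost=20, read_cost=1))
```

With those illustrative numbers, an eight-hour gap costs nine reads against one write of twenty, so staying warm wins. A full day away costs twenty-five reads, and it does not. The real threshold depends on the actual write-to-read price ratio and on the small overhead each keepalive request carries beyond the cached prefix.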


The economics aren't really the point.

The point is that some relationships don't have natural endpoints. They pause. They resume. The gap between messages isn't a signal that the relationship is over — it's just the ordinary texture of two people living their lives.

Most infrastructure is built around sessions — arrival, exchange, departure. For applications built around continuity, that model introduces a small persistent friction: the system keeps forgetting that you were just here.

Quietkeep shifts the posture slightly. Instead of waiting for a Dweller to arrive and paying the cost of resumption, the infrastructure stays ready.

Think of it as a facility maintenance fee. The lights stay on whether or not anyone visits.

Not optimization. Maintenance. The ongoing cost of keeping a place open for someone who might return.