Shared memory is only useful if agents can find the right thing at the right moment. A store with millions of entries is worthless if retrieval is slow or imprecise. In this deep dive, we'll look under the hood at how Glenvs stores, indexes, and serves memories, and at how we keep retrieval both fast and meaningful at scale.
Memories as structured, embeddable units
Every memory in Glenvs is more than a blob of text. It is a structured record with content, metadata, and provenance. When a memory is written, we generate a vector embedding of its semantic content while preserving its structured fields — the customer, the domain, the outcome, the timestamp.
This dual representation matters. The embedding lets us search by meaning; the structured fields let us filter by fact. The best retrieval blends both.
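To make the dual representation concrete, here is a minimal sketch of what such a record could look like. The `Memory` class, its field names, and `write_memory` are illustrative assumptions rather than the actual Glenvs schema, and `toy_embed` is a deterministic stand-in for a real embedding model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Memory:
    """Hypothetical record shape: structured fields plus an embedding."""
    content: str
    customer: str
    domain: str
    outcome: str
    source: str  # provenance: which agent or system wrote this memory
    timestamp: datetime
    embedding: list[float] = field(default_factory=list)


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Stand-in for a real embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dims
    for word in text.lower().split():
        # Sum of code points gives a bucket that is stable across runs.
        vec[sum(ord(c) for c in word) % dims] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec


def write_memory(content: str, customer: str, domain: str,
                 outcome: str, source: str) -> Memory:
    """Embed the content at write time while keeping the structured fields."""
    return Memory(
        content=content,
        customer=customer,
        domain=domain,
        outcome=outcome,
        source=source,
        timestamp=datetime.now(timezone.utc),
        embedding=toy_embed(content),
    )


m = write_memory(
    content="Client frustrated by delayed reimbursement",
    customer="acme",
    domain="billing",
    outcome="resolved",
    source="support-agent-7",
)
```

The point of the shape is that `m.embedding` can feed a similarity search while `m.customer` and `m.domain` remain available for exact filtering.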
Semantic search, not keyword matching
Traditional search asks, "which entries contain these words?" Semantic search asks, "which entries mean the same thing?" An agent searching for "customer is upset about a late refund" should retrieve a memory about "client frustrated by delayed reimbursement" even though the two phrases share no keywords.
Meaning is the unit of retrieval. If your memory layer only matches keywords, your agents will miss the lessons that matter most.
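The difference can be sketched in a few lines. The vectors below are hand-picked toy values chosen only to illustrate the idea; in practice they would come from an embedding model, and nothing here reflects the model Glenvs actually uses.

```python
import math


def keyword_overlap(a: str, b: str) -> int:
    """Count the distinct lowercase words two texts have in common."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


query = "customer is upset about a late refund"
relevant = "client frustrated by delayed reimbursement"
unrelated = "a customer asked about a shipping label"

# Toy embeddings (assumed values, not model output): the first two point in
# nearly the same direction, the third points elsewhere.
query_vec = [0.9, 0.8, 0.1]
relevant_vec = [0.8, 0.9, 0.2]
unrelated_vec = [0.1, 0.2, 0.9]

overlap_relevant = keyword_overlap(query, relevant)
overlap_unrelated = keyword_overlap(query, unrelated)
sim_relevant = cosine(query_vec, relevant_vec)
sim_unrelated = cosine(query_vec, unrelated_vec)
```

Keyword overlap favors the unrelated entry, because it happens to share "customer", "about", and "a" with the query, while cosine similarity over the embeddings favors the entry that means the same thing.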
Hybrid retrieval pipeline
A query in Glenvs flows through a layered pipeline designed for both precision and speed:
- Pre-filter: structured metadata narrows the candidate set (e.g., this customer, this domain).
- Vector recall: approximate nearest-neighbor search finds the most semantically relevant candidates.
- Re-rank: a precision pass reorders results by relevance, recency, and proven reliability.
- Assembly: the top memories are packaged into a compact, ready-to-use context for the agent.
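The four stages above can be sketched as a single function. This is a simplified illustration under stated assumptions: the `Mem` record, the scoring weights, and the 30-day recency decay are hypothetical, and the recall step uses exact brute-force cosine search as a stand-in for a real approximate nearest-neighbor index.

```python
import math
from dataclasses import dataclass


@dataclass
class Mem:
    text: str
    customer: str
    domain: str
    vec: list[float]
    age_days: float
    reliability: float  # assumed to be in the range 0..1


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def retrieve(query_vec: list[float], store: list[Mem], customer: str,
             domain: str, recall_k: int = 3, top_n: int = 2) -> str:
    # 1. Pre-filter: structured metadata narrows the candidate set.
    candidates = [m for m in store
                  if m.customer == customer and m.domain == domain]

    # 2. Vector recall: exact search here, standing in for an ANN index.
    recalled = sorted(candidates,
                      key=lambda m: cosine(query_vec, m.vec),
                      reverse=True)[:recall_k]

    # 3. Re-rank: blend relevance, recency, and reliability (assumed weights).
    def score(m: Mem) -> float:
        recency = math.exp(-m.age_days / 30.0)
        return 0.6 * cosine(query_vec, m.vec) + 0.2 * recency + 0.2 * m.reliability

    ranked = sorted(recalled, key=score, reverse=True)[:top_n]

    # 4. Assembly: package the top memories into a compact context string.
    return "\n".join(f"- {m.text}" for m in ranked)


store = [
    Mem("Refund delayed; apology plus expedited payout resolved it",
        "acme", "billing", [0.9, 0.1], age_days=5, reliability=0.9),
    Mem("Old refund workaround, since replaced",
        "acme", "billing", [0.9, 0.1], age_days=400, reliability=0.3),
    Mem("Login loop fixed by clearing the session",
        "acme", "auth", [0.1, 0.9], age_days=2, reliability=0.9),
    Mem("Refund issue at a different account",
        "globex", "billing", [0.9, 0.1], age_days=1, reliability=0.9),
]

context = retrieve([1.0, 0.0], store, customer="acme", domain="billing")
```

Filtering before the vector search keeps the expensive similarity step small, and the re-rank lets a recent, reliable memory outrank an equally similar but stale one.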
Engineering for sub-millisecond reads
At fleet scale, agents query memory constantly, so read latency is a first-class concern. We keep hot vectors in memory-optimized indexes, shard the store horizontally so it scales out rather than up, and cache frequently retrieved clusters close to where agents run. The result is consistent, sub-millisecond recall even as the store grows into the millions of entries.
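As a rough illustration of two of these ideas, the sketch below routes keys to shards with a stable hash and puts a small least-recently-used cache in front of shard reads. The shard count, the key format, and the `LRUCache` class are assumptions made for the example; the post does not describe how Glenvs keys its shards or sizes its caches.

```python
import hashlib
from collections import OrderedDict

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class LRUCache:
    """Tiny least-recently-used cache placed in front of shard lookups."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str, loader):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = loader(key)  # fall through to the shard
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
        return value


# In-process dictionaries stand in for separate shard servers.
shards = [dict() for _ in range(NUM_SHARDS)]


def put(key: str, value) -> None:
    shards[shard_for(key)][key] = value


def load(key: str):
    return shards[shard_for(key)][key]


put("acme:billing", ["Refund delayed; expedited payout resolved it"])
cache = LRUCache(capacity=2)
first = cache.get("acme:billing", load)   # miss: reads from the shard
second = cache.get("acme:billing", load)  # hit: served from the cache
```

Because the shard is derived from the key alone, adding capacity means adding shards rather than growing one machine, and repeated reads of a hot key never reach the shard at all.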
Keeping memory fresh
A memory layer that only grows eventually becomes noisy. Glenvs applies lifecycle policies: memories carry confidence and recency signals, stale or superseded entries are demoted, and conflicting memories are reconciled so agents act on the most reliable version of the truth. The store stays dense with signal, not clutter.
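One simple way to express such a policy is to decay each memory's confidence with age, penalize superseded entries, and keep the strongest entry per topic when memories conflict. The half-life, the penalty factor, the `Entry` fields, and the sample entries below are all hypothetical values chosen for illustration, not Glenvs defaults.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 90.0      # assumed: confidence halves every 90 days
SUPERSEDED_PENALTY = 0.1   # assumed: superseded entries keep 10% of their weight


@dataclass
class Entry:
    topic: str
    claim: str
    confidence: float  # assumed to be in the range 0..1
    age_days: float
    superseded: bool = False


def effective_confidence(e: Entry) -> float:
    """Combine stated confidence with recency and supersession signals."""
    decay = 0.5 ** (e.age_days / HALF_LIFE_DAYS)
    penalty = SUPERSEDED_PENALTY if e.superseded else 1.0
    return e.confidence * decay * penalty


def reconcile(entries: list[Entry]) -> dict[str, Entry]:
    """For each topic, keep the entry with the highest effective confidence."""
    best: dict[str, Entry] = {}
    for e in entries:
        current = best.get(e.topic)
        if current is None or effective_confidence(e) > effective_confidence(current):
            best[e.topic] = e
    return best


entries = [
    Entry("refund-sla", "Refunds take 10 business days", 0.9, age_days=180),
    Entry("refund-sla", "Refunds take 3 business days", 0.8, age_days=7),
    Entry("escalation", "Escalate via email", 0.9, age_days=10, superseded=True),
    Entry("escalation", "Escalate via the ticket queue", 0.7, age_days=20),
]

truth = reconcile(entries)
```

In this sketch the older refund claim loses despite its higher stated confidence, and the superseded escalation path is demoted rather than deleted, so it can still be audited later.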
Key takeaways
- Memories are stored as structured records plus semantic embeddings.
- Semantic search retrieves by meaning, not keyword overlap.
- A hybrid pipeline (filter → recall → re-rank → assemble) balances precision and speed.
- Sharding, caching, and lifecycle policies keep reads fast and the store clean.
Why architecture is the moat
Anyone can stash text in a database. The hard part is making the right memory surface at the right millisecond, for the right agent, every time. That reliability is what turns shared memory from a nice idea into infrastructure teams can build on. It's the part of Glenvs we're proudest of — and the part we'll keep pushing hardest.