{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreifsdxcsdnefemcybl5khhfs2tbz645a7pcxca2rc6jqnmb6s2mqre",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3miofmv6k4z42"
},
"path": "/t/scaling-agentic-memory-to-5-billion-vectors-via-binary-quantization-and-dynamic-wavelet-matrices/174951#post_1",
"publishedAt": "2026-04-04T12:48:42.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"arXiv.org",
"Hippocampus: An Efficient and Scalable Memory Module for Agentic AI"
],
"textContent": "In a study, a new “dynamic wavelet matrix” was used as a vector database, where memory grows only with log(σ) rather than with n. I considered building a KNN model with a very large memory on this basis, capable of holding, for example, 5 billion vectors.\n\nFirst, the tokens in the context window are converted into an embedding using deberta-v3-small. This is a fast encoder that also takes token positions into account (disentangled attention) and is responsible for representing the context in the model.\n\nThe embedding is then converted into a bit sequence using binary quantization: each dimension greater than 0 becomes 1, and every other dimension becomes 0.\n\nThe advantage is that bit sequences are compressible. They are inserted into the dynamic wavelet matrix, where memory grows only with log(σ). Each stored element carries a response token as its content.\n\nDuring response generation, the context window is converted into an embedding, quantized in the same way, and compared to the elements in the matrix using Hamming distance (a Hamming-ball search). The response token of the element with the smallest distance is appended to the context window, and the process repeats until the response is long enough.\n\narXiv.org\n\n### Hippocampus: An Efficient and Scalable Memory Module for Agentic AI\n\nAgentic AI require persistent memory to store user-specific histories beyond the limited context window of LLMs. Existing memory systems use dense vector databases or knowledge-graph traversal (or hybrid), incurring high retrieval latency and poor...",
"title": "Scaling Agentic Memory to 5 Billion Vectors via Binary Quantization and Dynamic Wavelet Matrices"
}