Semantic Bundle AI: A Complementary Layer for LLMs — 91.7% Memory Reduction, 38.6% Drift Reduction, Zero Retraining
Hi HF community,
I’d like to share two preprints on Semantic Bundle AI, a drop-in complementary layer for existing LLMs that addresses three structural problems: semantic drift, the difficulty of targeted edits, and memory overhead.
The problem
Working with concepts in an LLM’s embedding space runs into three issues:
- An update to one concept spreads to unrelated concepts (semantic drift)
- You can’t surgically edit one concept without contaminating others
- Storing full embeddings at scale is expensive
The approach
Semantic Bundle AI sits on top of existing LLMs — no architectural changes, no retraining required.
- Anchor coordinates: stable reference frames that resist drift
- Semantic bundles: structured concept representations with controlled update locality
- Sparse reconstruction: compress stored embeddings via bundle-based reconstruction
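To make the sparse reconstruction idea concrete, here is a minimal sketch. It is not the implementation from the papers: it swaps in plain orthogonal projection onto K shared directions as a stand-in for bundle-based reconstruction, and every name (`orthonormalize`, `encode`, `decode`), the toy sizes, and the random data are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors):
    """Gram-Schmidt: turn linearly independent vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = dot(w, b)
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:  # skip directions the basis already covers
            basis.append([wi / norm for wi in w])
    return basis


def encode(embedding, basis):
    """Keep only K numbers per embedding: its projections onto the basis."""
    return [dot(embedding, b) for b in basis]


def decode(coeffs, basis):
    """Rebuild an approximate embedding from the K stored coefficients."""
    dim = len(basis[0])
    out = [0.0] * dim
    for c, b in zip(coeffs, basis):
        for i in range(dim):
            out[i] += c * b[i]
    return out


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


rng = random.Random(0)
DIM, K = 32, 4  # toy sizes, assumed for illustration only

# Hypothetical shared directions standing in for a "bundle".
basis = orthonormalize([[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(K)])

# Toy embedding: mostly inside the span of the basis, plus a little noise.
weights = [rng.gauss(0, 1) for _ in range(K)]
embedding = [x + 0.05 * rng.gauss(0, 1) for x in decode(weights, basis)]

coeffs = encode(embedding, basis)    # K floats stored instead of DIM
restored = decode(coeffs, basis)
similarity = cosine(embedding, restored)
print(len(coeffs), similarity)
```

In this toy setup the similarity is high only because the embedding was built to lie close to the span of the basis. How well real embeddings reconstruct depends on how the shared directions are chosen, which is the part the papers address.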
PoC results (4 experiments)
| Metric | Result |
|---|---|
| Memory reduction (K=64) | 91.7% (45.0 KB → 3.8 KB) |
| Reconstruction similarity | 0.963 |
| Cumulative drift reduction | 38.6% |
| Edit contamination rate | 32.6% of baseline (at ρ=0.1) |
Zero retraining. Zero architectural modifications.
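As a quick sanity check on the memory figure, the sketch below recomputes the reduction from the sizes in the table. The second half uses a configuration that is an assumption, not something stated in this post: 15 embeddings of 768 float32 values, each replaced by K=64 float32 coefficients. It is shown only because it happens to reproduce 45.0 KB.

```python
def reduction_percent(before, after):
    """Percentage of storage saved when going from `before` to `after`."""
    return 100.0 * (1.0 - after / before)


# Rounded sizes exactly as printed in the table (KB).
print(round(reduction_percent(45.0, 3.8), 1))  # 91.6

# Hypothetical configuration (assumed, not taken from the papers).
n_embeddings, dim, k, bytes_per_float = 15, 768, 64, 4
full_bytes = n_embeddings * dim * bytes_per_float      # 46080
compact_bytes = n_embeddings * k * bytes_per_float     # 3840

print(full_bytes / 1024, compact_bytes / 1024)         # 45.0 3.75
print(round(reduction_percent(full_bytes, compact_bytes), 1))  # 91.7
```

Under that assumption the 91.7% headline would come from unrounded sizes, while the rounded table values give 91.6%. The calculation counts per-embedding storage only and ignores any shared basis or anchor data.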
Papers & code
Zenodo (Paper 0 + Paper 1): Search results
Code: GitHub, msaitou-glitch/Semantic-Bundle-AI (official repository for the "Meaning Bundle AI" project, a complementary layer to LLMs using stable coordinate systems)
Limitations (honest)
- Small-scale controlled datasets (15–110 sentences, single domain)
- Stability–ranking tradeoff identified (anchor coordinates improve cluster stability but not ranking consistency)
- Not yet validated at production scale
- Paper 1 under review at SSRN
Looking for critiques, failure cases, and adjacent work. Happy to discuss.