Zero Forgetting Across 4 Benchmarks on Mistral-7B — Interactive Results Dashboard
We’ve been working on continual learning for LLM fine-tuning — training one model sequentially across multiple domains without catastrophic forgetting. After 6 months of R&D and 50+ failed experiments (EWC, replay, KD, gradient projection), we have a method that works.
4 independent benchmarks on Mistral-7B:
- Research benchmark (5 domains, 3 seeds): -0.17% drift vs +43% forgetting with naive LoRA
- Walmart enterprise (4 domains): retained BERTScores of 0.82–0.94 across all domains
- Salesforce enterprise (5 domains): positive backward transfer, with retention BERTScores improving with each new domain (0.889 → 0.907)
- Dental stress test (8 domains, 2 seeds): gradient norms stable throughout, zero crashes
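For anyone who wants to compare their own runs against numbers like these, here is a minimal sketch of one common way to quantify forgetting: the standard backward-transfer metric from the continual learning literature. It is an assumption that it matches how "drift" and "backward transfer" are computed above; the post does not give the exact formula. The function name and the toy scores are made up for illustration and are not results from these benchmarks.

```python
def backward_transfer(scores):
    """Average change on earlier domains after all training is done.

    scores[i][j] is the evaluation score on domain j measured right after
    sequentially training on domain i (0-indexed). Only entries with j <= i
    are used. Negative values indicate forgetting, positive values indicate
    that later domains helped earlier ones.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two domains")
    final = scores[n - 1]
    # Compare the final score on each earlier domain with the score it had
    # immediately after that domain was trained.
    deltas = [final[j] - scores[j][j] for j in range(n - 1)]
    return sum(deltas) / len(deltas)


# Hypothetical toy scores for three domains (illustrative only).
# Entries above the diagonal are unused placeholders.
toy_scores = [
    [0.90, 0.00, 0.00],
    [0.91, 0.85, 0.00],
    [0.92, 0.87, 0.80],
]
bwt = backward_transfer(toy_scores)
print(f"backward transfer: {bwt:+.3f}")
```

A value near zero means earlier domains were retained; a clearly positive value corresponds to the "retention improved with each new domain" pattern described for the Salesforce run.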
Spectral norm locked at 1.0 across every experiment. Standard LoRA crashed at step 43 with a gradient norm of 263; ours peaked under 6. No replay buffers, no EWC, no knowledge distillation.
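To make the two stability quantities concrete, here is a small dependency-free sketch of how a spectral norm (largest singular value) and a global gradient norm can be measured. This is generic monitoring code, not the patented method; which matrix the spectral norm is measured on is not stated in this post, so the inputs below are assumptions and the helper names are hypothetical.

```python
import math
import random


def _matvec(matrix, v):
    # Computes A v for a matrix given as a list of rows.
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def _matvec_t(matrix, u):
    # Computes A^T u without building the transpose.
    cols = len(matrix[0])
    return [sum(matrix[r][c] * u[r] for r in range(len(matrix))) for c in range(cols)]


def spectral_norm(matrix, iters=200, seed=0):
    """Estimate the largest singular value by power iteration on A^T A."""
    rng = random.Random(seed)
    # Random start so we are unlikely to begin orthogonal to the top direction.
    v = [rng.gauss(0.0, 1.0) for _ in range(len(matrix[0]))]
    for _ in range(iters):
        w = _matvec_t(matrix, _matvec(matrix, v))
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
    # v is now unit length, so ||A v|| approximates the top singular value.
    return math.sqrt(sum(x * x for x in _matvec(matrix, v)))


def global_grad_norm(grads):
    """L2 norm over all gradient entries; grads is a list of flat lists."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


sigma = spectral_norm([[3.0, 0.0], [0.0, 1.0]])
gnorm = global_grad_norm([[3.0], [4.0]])
print(f"spectral norm ~ {sigma:.4f}, gradient norm = {gnorm:.1f}")
```

In a PyTorch training loop, `torch.linalg.matrix_norm(W, ord=2)` gives the same spectral norm for a weight matrix `W`, and `torch.nn.utils.clip_grad_norm_` returns the total gradient norm, so logging both per step is enough to reproduce this kind of stability check.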
The adapter adds ~0.1% additional parameters and works with any LoRA/QLoRA setup.
Interactive benchmark dashboard with charts:
Zero Forgetting Benchmarks, a Hugging Face Space by Fourwheels2512 (huggingface.co)
Live product (free tier, no credit card): https://mhc-finetune-saas-zrtokzlkbnue9zsk7jfgad.streamlit.app
US patent pending. Would love to hear from anyone working on continual learning or dealing with forgetting in multi-domain fine-tuning.