Zero Forgetting Across 4 Benchmarks on Mistral-7B — Interactive Results Dashboard

Hugging Face Forums [Unofficial] March 11, 2026

We’ve been working on continual learning for LLM fine-tuning: training one model sequentially across multiple domains without catastrophic forgetting. After 6 months of R&D and 50+ failed experiments with EWC (elastic weight consolidation), replay, KD (knowledge distillation), and gradient projection, we have a method that works.

4 independent benchmarks on Mistral-7B:

Research benchmark (5 domains, 3 seeds): -0.17% drift vs. +43% forgetting with naive LoRA
Walmart enterprise (4 domains): retained BERTScores of 0.82–0.94 across all domains
Salesforce enterprise (5 domains): positive backward transfer, with retention BERTScores improving as each new domain was added (0.889 → 0.907)
Dental stress test (8 domains, 2 seeds): gradient norms stable throughout, zero crashes
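
For anyone who wants to run the same kind of retention check on their own sequential fine-tuning runs, here is a minimal sketch of the standard backward-transfer calculation. This post does not state how drift or forgetting was computed or which sign convention it uses, so treat this as one common definition and not as the method behind the numbers above. The function names and the example scores are hypothetical.

```python
from typing import List, Sequence


def backward_transfer(scores: Sequence[Sequence[float]]) -> float:
    """Average score change on earlier domains after the final training stage.

    scores[i][j] is the evaluation score on domain j, measured right after
    training on domain i (domains are trained in order 0, 1, ..., T-1).
    Positive values mean earlier domains improved; negative values mean
    forgetting (assuming higher scores are better).
    """
    num_domains = len(scores)
    if num_domains < 2:
        raise ValueError("need at least two sequential domains")
    final = scores[-1]
    # Compare the final score on each earlier domain with the score that
    # domain had immediately after it was trained.
    deltas = [final[j] - scores[j][j] for j in range(num_domains - 1)]
    return sum(deltas) / len(deltas)


def per_domain_drift_percent(scores: Sequence[Sequence[float]]) -> List[float]:
    """Relative change (in percent) on each earlier domain after the last stage."""
    final = scores[-1]
    return [
        100.0 * (final[j] - scores[j][j]) / scores[j][j]
        for j in range(len(scores) - 1)
    ]


# Made-up scores for three domains, purely for illustration.
example_scores = [
    [0.90, 0.00, 0.00],  # after training on domain 0
    [0.88, 0.80, 0.00],  # after training on domain 1
    [0.87, 0.82, 0.85],  # after training on domain 2
]
print(backward_transfer(example_scores))
print(per_domain_drift_percent(example_scores))
```

If your metric is a loss or perplexity (lower is better), the signs flip: a positive change then means forgetting.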

The spectral norm stayed locked at 1.0 in every experiment. Standard LoRA crashed at step 43 with a gradient norm of 263; ours peaked under 6. No replay buffers, no EWC, no knowledge distillation.
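
For readers unfamiliar with the term, the spectral norm of a matrix is its largest singular value. The sketch below estimates it with power iteration and rescales a matrix so that the value is 1.0. This post does not say which matrices are constrained or how the constraint is enforced, so this is a generic illustration in plain Python and not the method described here; `spectral_norm` and `normalize_spectral` are hypothetical helper names.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def spectral_norm(w: Matrix, iters: int = 100, seed: int = 0) -> float:
    """Estimate the largest singular value of w by power iteration on w^T w."""
    rows, cols = len(w), len(w[0])
    rng = random.Random(seed)
    # A random start vector makes it unlikely that we begin orthogonal
    # to the top singular direction.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    norm_v = math.sqrt(sum(x * x for x in v))
    v = [x / norm_v for x in v]
    sigma = 0.0
    for _ in range(iters):
        # u = W v; since v has unit length, ||u|| approaches sigma_max.
        u = [sum(w[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            return 0.0
        # v = W^T u, renormalised to unit length for the next step.
        v = [sum(w[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v == 0.0:
            return 0.0
        v = [x / norm_v for x in v]
    return sigma


def normalize_spectral(w: Matrix) -> Matrix:
    """Return a copy of w rescaled so its spectral norm is 1.0 (zero matrix unchanged)."""
    sigma = spectral_norm(w)
    if sigma == 0.0:
        return [row[:] for row in w]
    return [[x / sigma for x in row] for row in w]
```

In a PyTorch training loop you would more likely read these values from existing calls: `torch.linalg.matrix_norm(W, ord=2)` gives the spectral norm of a weight matrix, and `torch.nn.utils.clip_grad_norm_` returns the total gradient norm it measured, which is a convenient number to log per step.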

The adapter adds ~0.1% additional parameters and works with any LoRA/QLoRA setup.

Interactive benchmark dashboard with charts:

huggingface.co

Zero Forgetting Benchmarks - a Hugging Face Space by Fourwheels2512

Zero forgetting in LLM fine-tuning — 4 benchmarks

Live product (free tier, no credit card): https://mhc-finetune-saas-zrtokzlkbnue9zsk7jfgad.streamlit.app

US patent pending. Would love to hear from anyone working on continual learning or dealing with forgetting in multi-domain fine-tuning.
