{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicnfnnu7obiyjfvjiehmxmy5ojskbcshy3ryerfgeiu2btpr343ta",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mfszqe3s7y72"
},
"path": "/t/crma-drop-in-adapter-for-fine-tuning-continual-learning-zero-catastrophic-forgetting-at-7b-scale/173818#post_1",
"publishedAt": "2026-02-27T01:07:11.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"CRMA Fine-Tuner & Continual Learning API - Swagger UI"
],
"textContent": "I built CRMA (Constrained Residual Mixing Adapter), a small adapter that attaches to every layer of a language model during fine-tuning. It applies a mathematical constraint that keeps training stable: the model learns new information but can’t overwrite what it already knows.\n\nFine-tuning results (Mistral-7B):\n\n * CRMA holdout loss: 0.1426 vs standard LoRA: 0.1519 (6.1% lower)\n * Peak gradient norm reduced by 39-84% across 3 independent runs\n * Tested on TinyLlama-1.1B, Mistral-7B-v0.3, Gemma-2-2b-it\n\nContinual learning results (4 domains trained sequentially: Medical, Legal, Code, Finance):\n\n * CRMA modular drift: -0.1% (the model actually improves slightly on earlier domains)\n * Standard sequential fine-tuning forgetting: +351.4%\n * That’s roughly a 3,500x reduction in the magnitude of catastrophic forgetting (351.4% vs 0.1%)\n * No replay buffers, no knowledge distillation, no frozen teacher copy, no extra compute\n\nHow it compares:\n\n┌──────────┬────────┬────────┐\n│ Method │ Forget │ Needs │\n├──────────┼────────┼────────┤\n│ EWC │ +58% │ Replay │\n├──────────┼────────┼────────┤\n│ SDFT │ -0.1pt │ 2x inf │\n├──────────┼────────┼────────┤\n│ O-LoRA │ Less │ Track │\n├──────────┼────────┼────────┤\n│ Adaption │ N/A │ $50M │\n├──────────┼────────┼────────┤\n│ CRMA │ -0.1% │ None │\n└──────────┴────────┴────────┘\n\nThe API is live and testable right now. A free tier is available (3 runs/day, TinyLlama), with usage-based pricing for larger models.\n\nAPI: CRMA Fine-Tuner & Continual Learning API - Swagger UI\n\nFull technical report (with methodology and ablation history) available on request. Happy to answer questions.\n\n— Kiran",
"title": "CRMA: Drop-in adapter for fine-tuning + continual learning — zero catastrophic forgetting at 7B scale"
}