Looking for CPU compute grant — FlashLM, ternary CPU-only language model

Hugging Face Forums [Unofficial] February 18, 2026
Hi everyone. I’m building FlashLM, an open-source ternary (1.58-bit) language model designed to run entirely on CPU: inference is pure add/sub, with no float multiply.

We just validated a v5 architecture that scores 88% on associative recall benchmarks, vs 3% for v4. Now we need CPU compute (ideally EPYC or Xeon with a large L3 cache) to train on real data and get bits-per-character (BPC) numbers. So far everything has been trained on the Deepnote free tier (2-core CPU, 5GB RAM).

Repo: https://github.com/changcheng967/FlashLM
Model & weights: https://huggingface.co/changcheng967/flashlm-v4-bolt

Does anyone know if Hugging Face offers CPU compute grants, or have suggestions for where to get donated CPU time for open-source research?
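
For anyone unfamiliar with why ternary weights allow add/sub-only inference, here is a minimal sketch in plain Python. It is an illustration of the general idea only, not FlashLM's actual kernel: the function name `ternary_matvec` and the toy matrix are made up for this example. With every weight in {-1, 0, +1}, a matrix-vector product reduces to adding, subtracting, or skipping each input, and the "1.58-bit" figure is just log2(3), the information content of a three-valued weight.

```python
import math

def ternary_matvec(weights, x):
    """Matrix-vector product for weights restricted to {-1, 0, +1}.

    Hypothetical helper for illustration: uses only add and subtract,
    never a multiply.
    """
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # +1 weight: add the input
            elif w == -1:
                acc -= xi      # -1 weight: subtract the input
            # 0 weight: skip the input entirely
        out.append(acc)
    return out

# Toy 2x3 ternary weight matrix and input vector (made-up values).
W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.0]

y = ternary_matvec(W, x)

# Three possible values per weight carry log2(3) bits, roughly 1.58.
bits_per_weight = math.log2(3)
```

A real CPU implementation would pack the weights and vectorize the loop, which is presumably where large caches help; the sketch above only shows the arithmetic.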
