They said unquantized local AI was impossible on budget phones. We got a 2.3GB FP32 model running locally on a €120 Galaxy A25 CPU. No GPU, no NPU, uses less RAM than Chrome
Hugging Face Forums [Unofficial]
May 8, 2026
Yeah, the catch is that this is not entirely Transformers but rather a custom version of it. We rewrote the whole stack, and as a whole it works more like a biological brain than AI. It uses different math for the multiplications and for building the matrices, so there is a whole new architecture behind it. We will call it Post-Transformers or Synthetic Neural Engine. It's not the same thing.
Yeah, we also ran a 7B model on an ordinary 2GB graphics card with almost 4k tokens, but we are still experimenting. As a startup, we are in an early phase.
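For a sense of scale, the sizes quoted in this thread can be related with back-of-the-envelope arithmetic: for conventional dense storage, weight memory is roughly parameter count times bytes per parameter. The sketch below is only that generic arithmetic, assuming decimal gigabytes and ignoring activations, caches, and runtime overhead. The function names are hypothetical, and the custom architecture described above may not store weights this way.

```python
def params_from_size(size_gb: float, bytes_per_param: float) -> float:
    """Parameter count implied by a weight file of size_gb decimal gigabytes."""
    return size_gb * 1e9 / bytes_per_param


def weight_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Weight storage in decimal gigabytes for n_params dense parameters."""
    return n_params * bytes_per_param / 1e9


def bits_per_param_budget(budget_gb: float, n_params: float) -> float:
    """Average bits available per parameter if all weights must fit in budget_gb."""
    return budget_gb * 1e9 * 8 / n_params


# FP32 uses 4 bytes per parameter (standard IEEE 754 single precision).
fp32_params = params_from_size(2.3, 4)          # parameters in a 2.3GB FP32 file
seven_b_fp32_gb = weight_size_gb(7e9, 4)        # dense FP32 size of 7B parameters
budget_bits = bits_per_param_budget(2.0, 7e9)   # bits per parameter within 2GB

print(f"2.3GB at FP32: about {fp32_params / 1e6:.0f}M parameters")
print(f"7B at FP32: about {seven_b_fp32_gb:.0f}GB of weights")
print(f"7B within 2GB: about {budget_bits:.2f} bits per parameter")
```

These figures describe dense weight storage only. Any scheme that keeps part of the weights off-device, streams them, or represents them differently would change the numbers.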