Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreie2qe3e7ny5igwqnzuakhtmj6gtlcylucx2oa54zlcfszki4glhsu",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mldr5w7j33m2"
  },
  "path": "/t/they-said-unquantized-local-ai-was-impossible-on-budget-phones-we-got-a-2-3gb-fp32-model-running-locally-on-a-120-galaxy-a25-cpu-no-gpu-no-npu-uses-less-ram-than-chrome/175739#post_6",
  "publishedAt": "2026-05-08T11:45:48.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Yeah, the problem is this is not a Transformers completely rather custom version of it. We re-wrote the whole stack together it works more like bio brain than AI. It has different math for multiplications and creating the matrix, so its whole new architecture behind, we will call it Post - Transformers or Synthetic Neural Engine. Its not the same thing\n\nYeah we runed also 7B on ordinary 2GB graphic card with almost 4k tokens, but still experimenting as a start up we are in early phase.",
  "title": "They said unquantized local AI was impossible on budget phones. We got a 2.3GB FP32 model running locally on a €120 Galaxy A25 CPU. No GPU, no NPU, uses less RAM than Chrome"
}