External Publication

Building Local: My 2026 Headless AI Server Journey

Hugging Face Forums [Unofficial] April 17, 2026

> What are you all currently running on your local setups? Hmm… I only use small embedding models day to day; I’ve integrated them into my work scripts. Since my GPU isn’t very powerful (a 3060 Ti with 8 GB of VRAM), I rarely run very large models locally… That said, I’ve heard that MoE LLMs in GGUF format run smoothly on platforms like Ollama or LM Studio even within 32 GB of system RAM (not VRAM)… Personally, since most of my current use cases don’t require confidentiality, I just use cloud services for my LLMs. Of course, I often try out models (LLM, T2I, etc.) hosted on HF via Spaces.
