Thinking model recommendation for Core Ultra 5 135U
Hugging Face Forums [Unofficial]
April 16, 2026
Hi, I am very new to setting up AI. This will be my first attempt.
I would like any advice on what will work well for reasoning on an Intel Core Ultra 5 135U with 32 GB of RAM. From what I have managed to research, it seems like DeepSeek-R1-Distill-Qwen-14B should work well.
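To sanity-check the memory side, here is a rough sketch of how I would estimate the size of the weights at different precisions. The parameter count (about 14.8 billion) and the flat bits-per-weight figures are my own assumptions. Real quantized files mix precisions, and the KV cache, the runtime, and Windows itself all need memory on top of this.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 14.8e9  # assumption: rough parameter count for a 14B-class model
RAM_GIB = 32       # assumption: treating 32 GB of RAM as 32 GiB

# Assumed uniform precisions; actual quantization formats vary per tensor.
for name, bits in [("16-bit", 16.0), ("8-bit", 8.0), ("4-bit", 4.0)]:
    size = weights_gib(N_PARAMS, bits)
    headroom = RAM_GIB - size
    print(f"{name}: ~{size:.1f} GiB of weights, ~{headroom:.1f} GiB left for everything else")
```

By this estimate a 4-bit or 8-bit version leaves plenty of headroom, while a 16-bit version would leave very little, and the iGPU shares the same system RAM.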
I don't know which of these two implementations is better:
IPEX-LLM (SYCL): llama.cpp using CPU + iGPU hybrid offloading.
OpenVINO: llama.cpp backend attempting GPU + NPU load-splitting.
Currently, is the OpenVINO NPU offload mature enough for 14B models, or should I stick with the IPEX-LLM iGPU route?
I am also using Windows Voice Access. In Task Manager it looks like Windows is using the GPU to run it.
Any general advice at all is welcome, since this will be my first attempt.
Thanks