
Thinking model recommendation for Core Ultra 5 135U

Hugging Face Forums [Unofficial] April 16, 2026

Hi, I'm very new to setting up AI. This will be my first attempt.

I would like any advice on what will work well for reasoning on an Intel Core Ultra 5 135U with 32 GB of RAM. From what I managed to research, it seems like DeepSeek-R1-Distill-Qwen-14B should work well.
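As a rough sanity check on whether a 14B model fits in 32 GB, here is a back-of-the-envelope sketch. It assumes a round 14 billion parameters and nominal 16/8/4 bits per weight; real quantized files carry some extra overhead, and the KV cache, the runtime, and Windows itself need memory on top of this, so treat the numbers as a lower bound. The function name and the bit widths are my own illustration, not taken from any tool.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


RAM_GIB = 32  # installed system RAM, shared between CPU and iGPU

# Nominal bit widths (assumption): real quantization formats use slightly more.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_gib(14, bits)
    print(f"{label}: ~{size:.1f} GiB of weights, ~{RAM_GIB - size:.1f} GiB left for everything else")
```

By this estimate the 16-bit weights alone would take most of the 32 GB, so a 4-bit or 8-bit quantized build looks like the realistic choice on this machine.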

I don't know which of these is the better implementation:

IPEX-LLM (SYCL): llama.cpp using CPU + iGPU hybrid offloading

OpenVINO: llama.cpp backend attempting GPU + NPU load splitting

Is the OpenVINO NPU offload currently mature enough for 14B models, or should I stick with the IPEX-LLM iGPU route?

I am also using Windows Voice Access. In Task Manager, it looks like Windows is using the GPU to run it, so the iGPU may already be partly in use.

Any general advice at all would be welcome, since this will be my first attempt.

Thanks
