GeekyBear 8mo ago

There is no NPU "standard".

Llama.cpp would have to target every hardware vendor's NPU individually, and those NPUs tend to have breaking changes when newer generations of hardware are released.

Even Nvidia GPUs often have breaking changes moving from one generation to the next.

montebicyclelo 8mo ago

I think OP is suggesting that Apple / AMD / Intel do the work of integrating their NPUs into popular libraries like `llama.cpp`, which might make sense. My impression is that by the time the vendors support a certain model with their NPUs, the model is too old and nobody cares anyway, whereas `llama.cpp` keeps up with the latest and greatest.
