0 points | regexorcist | 4d ago | 0 comments

Curious if you tested llama.cpp and still found oMLX faster? I haven't tried the latter myself, might give it a go.

Oh yeah, I tested various solutions with different settings and quants.

llama.cpp is about 1/3 slower on Apple Silicon.
