Don't the parallelization techniques of a 4x build make it harder to use than a 1x build with no extra parallelism? Couldn't the 32GB 4090 handle more models in their original configurations?
For LLM inference, parallel GPUs are mostly fine (you take some performance hit, but llama.cpp doesn't care what cards you use, and other stacks handle 4 symmetric GPUs just fine). You run into more problems when you're doing anything training-related, though.
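For context, llama.cpp's multi-GPU inference comes down to a couple of CLI flags. A minimal sketch of running one model split across 4 symmetric cards (the model path and split ratios here are placeholder assumptions, not from the post):

```shell
# Offload all layers to GPU and split tensors evenly across 4 cards.
# --split-mode layer splits whole layers per GPU (less inter-GPU traffic);
# --split-mode row splits within layers (can help with large models).
./llama-cli \
    -m ./models/model-q4_k_m.gguf \          # placeholder model path
    --n-gpu-layers 99 \                      # offload everything that fits
    --split-mode layer \
    --tensor-split 1,1,1,1 \                 # even 4-way split
    -p "Hello"
```

Asymmetric cards work too: `--tensor-split 3,1` would put roughly 75% of the weights on the first GPU, which is why mixed-VRAM builds still run.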