
bagels 3y ago

How low? I think everybody has different requirements there.


I ran it on a modern desktop and was getting under 1 token/s.

asah 3y ago

Could it parallelize across multiple PCs?

No, since it's stateful, in the sense that inference for each new token depends on the previously generated tokens.
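To make the dependency concrete, here is a minimal sketch of a token-by-token generation loop. It is not the actual model code: `toy_next_token` is a hypothetical stand-in for a real forward pass, and the arithmetic inside it is arbitrary. The only point is that each step consumes the output of the step before it.

```python
from typing import Callable, List


def toy_next_token(context: List[int]) -> int:
    """Hypothetical stand-in for a model forward pass.

    A real model would run a neural network here; this toy version only
    needs to depend on the whole context to illustrate the point.
    """
    return (sum(context) * 31 + len(context)) % 100


def generate(
    prompt: List[int],
    n_new: int,
    next_token: Callable[[List[int]], int] = toy_next_token,
) -> List[int]:
    """Generate n_new tokens one at a time, each from all earlier tokens."""
    tokens = list(prompt)
    for _ in range(n_new):
        # Step t cannot start until step t-1 has produced its token,
        # so these iterations cannot simply be handed to separate machines.
        tokens.append(next_token(tokens))
    return tokens


if __name__ == "__main__":
    print(generate([1, 2, 3], n_new=3))
```

Changing any earlier token changes everything generated after it, which is the sense in which the loop is stateful.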

I didn't measure, but IIRC it was lower than 1 token/sec
