Better HN
0 points
bagels
2y ago
How low? I think everybody has different requirements there.
extasia
2y ago
I ran it on a modern desktop and was getting under 1 token/s.
asah
2y ago
Could it parallelize across multiple PCs?
serialx
2y ago
No, since it's stateful in the sense that inference depends on the previously generated tokens.
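To make that dependency concrete, here is a toy sketch of an autoregressive decode loop. `next_token` is a hypothetical stand-in for a real model's forward pass (it is just arithmetic over the context, not any library's API), and `generate` is likewise made up for illustration. The point it shows is only that step N needs the output of step N-1 before it can start.

```python
def next_token(context):
    # Hypothetical stand-in for a model forward pass: a deterministic
    # function of the *entire* context seen so far.
    return (sum(context) * 31 + len(context)) % 100


def generate(prompt, n_new):
    # Autoregressive loop: each new token is computed from all earlier
    # tokens, so the iterations cannot be handed out to independent
    # machines and run at the same time.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(next_token(tokens))
    return tokens


if __name__ == "__main__":
    print(generate([1, 2, 3], 3))
```

Changing any earlier token changes everything generated after it, which is the "stateful" part. This sketch only covers the token-to-token dependency; it does not say anything about how the work inside a single step could be divided.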
tarruda
2y ago
I didn't measure, but IIRC it was lower than 1 token/sec