
zmmmmm 19d ago

Nice ... I think I get the idea: it's effectively the same (or a similar) benefit as batching, but you're batching against your own speculated future path. That would be pointless if you didn't have a high-probability path to evaluate against, but the draft gives you that.
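To make the "batching against your own future" idea concrete, here is a minimal sketch of one greedy speculative step. The names `speculative_step`, `toy_target`, and `toy_draft` are hypothetical, and the models are toy functions rather than real LLMs. It also assumes greedy decoding; sampling-based variants use a probabilistic accept/reject rule instead of an exact match.

```python
from typing import Callable, List, Sequence

Token = int
# Hypothetical model interface: maps a token context to the greedy next token.
NextToken = Callable[[Sequence[Token]], Token]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[Token], k: int) -> List[Token]:
    """Return the tokens accepted in one speculative step (greedy variant)."""
    # 1. The cheap draft model guesses k tokens ahead, one at a time.
    guesses: List[Token] = []
    ctx = list(context)
    for _ in range(k):
        tok = draft(ctx)
        guesses.append(tok)
        ctx.append(tok)

    # 2. The target model scores all k+1 prefixes of the guessed path.
    #    On real hardware this would be one batched forward pass;
    #    the list comprehension here only stands in for that.
    prefixes = [list(context) + guesses[:i] for i in range(k + 1)]
    verified = [target(p) for p in prefixes]

    # 3. Keep guesses while they match what the target would have produced.
    accepted: List[Token] = []
    for guess, truth in zip(guesses, verified):
        if guess != truth:
            break
        accepted.append(guess)

    # The target's own token at the first mismatch (or the bonus token
    # after k matches) is always valid, so each step yields >= 1 token.
    accepted.append(verified[len(accepted)])
    return accepted


def toy_target(ctx: Sequence[Token]) -> Token:
    # Toy "large" model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: Sequence[Token]) -> Token:
    # Toy "small" model: agrees with the target except after a 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


print(speculative_step(toy_target, toy_draft, [0], k=4))
```

The output is the same sequence the target would have produced on its own, so a bad draft only wastes the extra batch slots; it does not change the result.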


esyir 19d ago

I'll add an expansion here. It's more useful to you locally, since you have excess compute that's generally wasted. If you're serving multiple users and trying to max out output, it might cost you some throughput in this case.

nullc 16d ago

An obvious thing to do: if you have enough concurrent batches to max out performance, use those and don't speculate. But if compute would otherwise sit idle waiting on memory, fill the excess with speculation.
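A rough sketch of that policy, assuming you have measured a batch size at which the hardware stops being memory-bound. The function `speculation_depth` and its parameters `saturation_batch` and `max_depth` are hypothetical names, not any real server's API. A real scheduler would also have to weigh draft acceptance rate and KV-cache memory, which this ignores.

```python
def speculation_depth(active_requests: int, saturation_batch: int,
                      max_depth: int = 8) -> int:
    """Draft tokens to verify per request, using only otherwise-idle batch slots.

    saturation_batch is an assumed, measured number: the batch size at
    which the accelerator becomes compute-bound rather than memory-bound.
    """
    if active_requests <= 0:
        return 0
    if active_requests >= saturation_batch:
        # Real requests already fill the hardware: do not speculate.
        return 0
    # Each request needs 1 slot for its normal token plus d speculated
    # slots, so we need active_requests * (1 + d) <= saturation_batch.
    spare_slots = saturation_batch - active_requests
    return min(max_depth, spare_slots // active_requests)


for n in (1, 16, 40, 64):
    print(n, speculation_depth(n, saturation_batch=64))
```

With a single local user almost every slot is spare, so the depth hits the cap; as concurrency approaches the saturation point the depth falls to zero, which matches the local-versus-serving distinction made above.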
