The point of compressing is to decompress later. That is what happens during inference, and that is where the extrapolation occurs.
Let's say I tell GPT "write foobar 8 times". Will it? If so, then it understood me and extrapolated from the request to the proper response, without having "write foobar 8 times" stored verbatim in its model.
Most compression algorithms, believe it or not, work by predicting the next token (byte, term, etc.). The more accurately they predict the next token, the less information you need to store to correct the mispredictions.
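To make the idea concrete, here is a toy sketch (not any real codec): the "model" predicts that each byte repeats the previous one, and the encoder stores only a hit marker when the prediction is right, or a marker plus the correction when it is wrong. A real coder would pack the markers into bits or feed the predictor's probabilities into an arithmetic coder; this version just shows that a better predictor means fewer corrections to store.

```python
def compress(data: bytes) -> bytes:
    """Predictive coding sketch: predict each byte equals the previous one."""
    out = bytearray()
    prev = 0  # predictor state: last byte seen (0 before the first byte)
    for b in data:
        if b == prev:
            out.append(0)      # prediction correct: store only a hit marker
        else:
            out.append(1)      # prediction wrong: store marker + correction
            out.append(b)
        prev = b
    return bytes(out)

def decompress(blob: bytes) -> bytes:
    """Re-run the same predictor and apply the stored corrections."""
    out = bytearray()
    prev = 0
    i = 0
    while i < len(blob):
        if blob[i] == 0:
            out.append(prev)   # predictor was right: replay its guess
            i += 1
        else:
            out.append(blob[i + 1])  # predictor was wrong: take the correction
            i += 2
        prev = out[-1]
    return bytes(out)
```

On highly repetitive input, almost every byte becomes a one-bit-worth hit marker; on input the predictor handles badly, the "compressed" form is larger than the original, which is exactly the compression-equals-prediction trade-off.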