Automatic verification (an oracle) is already being used to create synthetic data for LLMs, and I don't see a big difference versus AlphaZero. While there's no way to guarantee that any single synthetic reasoning trace is correct, as long as a trace leads to the correct answer according to the verifier, the law of large numbers should take care of it: in aggregate, verified traces will mostly reflect valid reasoning.
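As a minimal sketch of the idea, verifier-filtered synthetic data generation is essentially rejection sampling: sample many traces per problem and keep only those whose final answer passes the oracle. All names and the toy "model" below are illustrative assumptions, not any real system's API.

```python
import random

def generate_trace(problem, rng):
    # Stand-in for an LLM sampler (hypothetical): returns a
    # (reasoning, answer) pair, sometimes reaching a wrong answer.
    answer = rng.choice([problem["answer"], problem["answer"] + 1])
    reasoning = f"step-by-step work leading to {answer}"
    return reasoning, answer

def verifier(problem, answer):
    # Oracle check against known ground truth; this is the only
    # signal we trust, not the reasoning text itself.
    return answer == problem["answer"]

def collect_synthetic_data(problems, samples_per_problem=16, seed=0):
    rng = random.Random(seed)
    dataset = []
    for problem in problems:
        for _ in range(samples_per_problem):
            reasoning, answer = generate_trace(problem, rng)
            if verifier(problem, answer):  # keep verified traces only
                dataset.append({"problem": problem["text"],
                                "trace": reasoning,
                                "answer": answer})
    return dataset

problems = [{"text": "2 + 2 = ?", "answer": 4}]
data = collect_synthetic_data(problems)
```

Note that the filter only checks final answers, so individual kept traces can still contain flawed intermediate steps; the bet is that correct-answer traces are, on average, better training signal.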
The problem is that it's difficult to build verifiers for many things we care about, like architectural taste. So I expect to see superhuman capabilities on the things we can build verifiers for; for everything else it's harder to predict. We may see transfer learning, or we may see collapse. My money is on transfer learning.