undefined | Better HN

0 pointsalbertgoeswoof3y ago0 comments

How could you prove this?

0 comments

People have shown GPT has an internal model of the state of a game of Othello:

Https://arxiv.org/abs/2210.13382

pja3y ago

More accurately: a GPT derived DNN that’s been specifically trained (or fine-tuned, if you want to use OpenAI’s language) on a dataset of Othello games ends up with an internal model of an Othello board.

It looks like OpenAI have specifically added Othello game handling to chat.openai.org, so I guess they’ve done the same fine-tuning to ChatGPT? It would be interesting to know how good an untuned GPT3/4 was at Othello & whether OpenAI has fine-tuned it or not!

(Having just tried a few moves, it looks like ChatGPT is just as bad at Othello as it was at chess, so it’s interesting that it knows the initial board layout but can’t actually play any moves correctly: Every updated board it prints out is completely wrong.)

WoodenChair3y ago

> it’s interesting that it knows the initial board layout

Why is that interesting? The initial board layout would appear all the time in the training data.

1 more reply

thomastjeffery3y ago

The state of the game, not the behavior of playing it intentionally. There is a world of difference between the two.

It was able to model the chronological series of game states that it read from an example game. It was able to include the arbitrary "new game state" of a prompt into that model, then extrapolate that "new game state" into "a new series of game states".

All of the logic and intentions involved in playing the example game were saved into that series of game states. By implicitly modeling a correctly played game, you can implicitly generate a valid continuation for any arbitrary game state; at least with a relatively high success rate.

LeanderK3y ago

As I see it, we do not really know much about how GPT does it. The approximations can be very universal so we do not really know what is computed. I take very much issue with people dismissing it as "pattern matching", "being close to the training data", because in order to generalise we try to learn the most general rules and through increasing complexity we learn the most general, simple computations (for some kind of simple and general).

But we have fundamental, mathematical bounds on the LLM. We know that the complexity is at most O(n^2) in token length n, probably closer to O(n). It can not "think" about a problem and recurse into simulating games. It can not simulate. It's an interesting frontier, especially because we have also cool results about the theoretical, universal approximation capabilities of RNNs.

1 more reply

calf3y ago

So does AlphaGo has an internal model of Go's game theoretic structures, but nobody was asserting AlphaGo understands Go. Just because English is not specifiable does not give people an excuse to say the same model of computation, a neural network, "understands" English any more than a traditional or neural algorithm for Go understands Go.

valine3y ago

Just spitballing, I think you’d need a benchmark that contains novel logic puzzles, not contained in the training set, that don’t resemble any existing logic puzzles.

The problem with the goat question is that the model is falling back on memorized answers. If the model is in fact capable of cognition, you’d have better odds of triggering the ability with problems that are dissimilar to anything in the training set.

henry20233y ago

Maybe Sudokus? Sudokus are np-complete and getting the "pattern" right is equivalent to abstracting the rules and solving the problem

j / k navigate · click thread line to collapse

0 comments

fancyfredbot3y ago

People have shown GPT has an internal model of the state of a game of Othello:

Https://arxiv.org/abs/2210.13382

pja3y ago

WoodenChair3y ago

> it’s interesting that it knows the initial board layout

Why is that interesting? The initial board layout would appear all the time in the training data.

1 more reply

thomastjeffery3y ago

The state of the game, not the behavior of playing it intentionally. There is a world of difference between the two.

LeanderK3y ago

1 more reply

calf3y ago

valine3y ago

Just spitballing, I think you’d need a benchmark that contains novel logic puzzles, not contained in the training set, that don’t resemble any existing logic puzzles.

henry20233y ago

Maybe Sudokus? Sudokus are np-complete and getting the "pattern" right is equivalent to abstracting the rules and solving the problem

j / k navigate · click thread line to collapse