> GPT also doesn't only respond based on examples it has already seen - that would be a markov chain
The difference between GPT and a Markov chain is that GPT finds more interesting patterns to repeat. It's still only working with "examples it has seen": the difference is that it is "seeing" more perspectives than a Markov chain could.
It can still only repeat the content it has seen. A unique prompt makes GPT construct that repetition by following less obvious patterns: something a Markov chain cannot accomplish.
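To make the contrast concrete, here's a toy order-2 Markov chain (the corpus and whitespace tokenization are invented purely for illustration). It can only emit continuations it has literally seen after the last two tokens, and a state it has never seen leaves it stuck: there is no "less obvious pattern" for it to fall back on.

```python
import random
from collections import defaultdict

# Toy order-2 Markov chain: the next token depends only on the last two
# tokens, chosen from the literal continuations seen in the corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

chain = defaultdict(list)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    chain[(a, b)].append(c)

def generate(seed, n=5):
    out = list(seed)
    for _ in range(n):
        options = chain.get(tuple(out[-2:]))
        if not options:  # state never seen in the corpus: the chain is stuck
            break
        out.append(random.choice(options))
    return " ".join(out)

print(generate(("the", "cat")))    # recombines literal continuations
print(generate(("the", "zebra")))  # unseen state: nothing to repeat
```

GPT, by contrast, conditions on the whole prompt at once, so an unseen surface form can still land near familiar patterns.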
The less obvious patterns are your "higher level rules". GPT doesn't see them as "rules", though. It just sees another pattern of tokens.
I was being very specific when I said, "GPT will repeat the moves it feels to be most familiar to a given game state."
The familiarity I'm talking about here is between the game state modeled in the prompt and the game states (and progressions) in GPT's model. Familiarity is defined implicitly by every pattern GPT can see.
GPT treats the prompt as if it were an extension of its training corpus, and models it the same way. By doing so, it finds a "place" (semantically) in its model where the prompt "belongs". It then finds the most familiar pattern of game state progression when starting at that position in the model.
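As a loose analogy only (not how a transformer actually works internally), "finding the place where the prompt belongs" can be pictured as a nearest-neighbor lookup in an embedding space. The game states and vectors below are entirely made up for illustration; the point is that familiarity is a similarity measure, not an exact match:

```python
import math

def cosine(u, v):
    # Cosine similarity: how "familiar" two embedded states are.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Invented game states with invented embedding vectors.
known_states = {
    "opening, center control": (0.9, 0.1, 0.2),
    "endgame, king activity":  (0.1, 0.8, 0.3),
    "midgame, open file":      (0.5, 0.4, 0.9),
}

# A hypothetical prompt, assumed already embedded into the same space.
prompt_vec = (0.8, 0.2, 0.3)

# The "place" where the prompt belongs: the most familiar known state.
best = max(known_states, key=lambda s: cosine(known_states[s], prompt_vec))
print(best)
```

The progression GPT then follows starts from that position, not from a literal string match against anything it was trained on.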
Because there are complex patterns that GPT has implicitly modeled, the path GPT takes through its model can be just as complex. GPT is still doing no more than blindly following a pattern, but the complexity of the pattern itself "emerges" as "behavior".
Anything else that is done to seed divergent behavior (like the temperature alteration you mentioned) is also a source of "emergent behavior". This is still not part of the behavior of GPT itself: it's the behavior of humans making more interesting input for GPT to model.