sterlind 3y ago

it seems unlikely to me that ChatGPT is directly trained on chat data. if it were, we'd expect it to know things from after its knowledge cutoff. afaik that hasn't happened.

I assume the chat logs are instead training a reward model, which itself is then used as the reward function during RLHF training.
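
To make the reward-model idea concrete, here is a minimal sketch of how preference pairs could train a scoring function. Everything in it is an assumption for illustration: the two-number feature vectors, the linear `reward` function, and the `train` helper are hypothetical stand-ins, and nothing here reflects how OpenAI actually does it. The loss is the standard pairwise (Bradley-Terry) one, which pushes the preferred response's score above the rejected one's.

```python
import math

def reward(weights, features):
    """Linear reward model: score = w . x (stand-in for a neural network)."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, chosen, rejected):
    """Bradley-Terry loss: -log(sigmoid(r(chosen) - r(rejected)))."""
    margin = reward(weights, chosen) - reward(weights, rejected)
    return math.log1p(math.exp(-margin))

def train(pairs, dim, lr=0.1, epochs=200):
    """Plain stochastic gradient descent on the pairwise loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # d(loss)/d(margin) = -1 / (1 + exp(margin))
            grad_scale = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights

# Hypothetical features per response: [helpfulness, rudeness].
# Each pair is (response the user preferred, response the user rejected).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.2, 0.9]),
]
weights = train(pairs, dim=2)
```

In an RLHF setup, a learned scorer like `reward` would then rate the policy model's outputs during reinforcement learning, so the chat logs would shape behavior without putting new facts into the model.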

bigyikes 3y ago

These models have a very long lead time before they’re released to the public. Maybe GPT-5 is being trained on ChatGPT logs. I’m not sure we’d be able to detect if this was happening.
