undefined | Better HN

0 pointskamikazeturtles1y ago0 comments

> I think because they are trained on Claude/O1, they tend to have comparable performance.

Why does having comparable performance indicate having been trained on a preexisting model's output?

I read a similar claim in relation to another model in the past, so I'm just curious how this works technically.

0 comments

because the valley is burning money and GPUs training these and somebody else comes out with another model for a tiny fraction of cost it's an easy assumption to make it was trained on synthetic data

j / k navigate · click thread line to collapse

0 pointskamikazeturtles1y ago0 comments

> I think because they are trained on Claude/O1, they tend to have comparable performance.

Why does having comparable performance indicate having been trained on a preexisting model's output?

I read a similar claim in relation to another model in the past, so I'm just curious how this works technically.

0 comments

wordpad251y ago

because the valley is burning money and GPUs training these and somebody else comes out with another model for a tiny fraction of cost it's an easy assumption to make it was trained on synthetic data

j / k navigate · click thread line to collapse