
blackeyeblitzar · 1y ago · 0 points · 0 comments

Isn’t “sentence prediction” roughly the same as multi-token prediction of sufficient length? In the end, are we just talking about a change to hyperparameters, or maybe a new hyperparameter that controls the granularity of “prediction length”?


mdp2021 · 1y ago

> multi-token prediction of sufficient length

Is multi-token prediction the same as predicting the embedding of a complex token (the articulation of those input tokens in a sentence)?

blackeyeblitzar (OP) · 1y ago

To be honest, I don’t know. Maybe the only way to know is to build and measure all these variations.
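
For anyone who wants to try that comparison, here is a rough sketch of how the two objectives in this thread differ at the loss level. It is an illustration under assumptions, not the method of any particular paper: the function names `multi_token_loss` and `embedding_loss` are made up here, the "model outputs" are plain Python lists, and mean squared error is just one possible choice for the embedding target. Multi-token prediction scores k separate distributions over the vocabulary, while embedding prediction regresses one continuous vector for the whole sentence.

```python
import math


def multi_token_loss(logits_per_step, target_ids):
    """Mean cross-entropy over k future tokens.

    logits_per_step: k lists of vocabulary logits (hypothetical model output).
    target_ids: the k token ids that actually followed.
    """
    total = 0.0
    for logits, target in zip(logits_per_step, target_ids):
        # Numerically stable log-sum-exp over the vocabulary.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the correct token at this step.
        total += log_z - logits[target]
    return total / len(target_ids)


def embedding_loss(predicted, target):
    """Mean squared error between one predicted vector and one target
    sentence embedding (MSE is an assumption; other distances are possible)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# k = 3 future tokens, vocabulary of 4, uninformative (uniform) logits.
uniform_logits = [[0.0] * 4 for _ in range(3)]
mtp = multi_token_loss(uniform_logits, [0, 1, 2])

# One 2-dimensional "sentence embedding" target, one prediction.
emb = embedding_loss([1.0, 0.0], [0.0, 0.0])
```

With uniform logits the multi-token loss equals log(4) per step regardless of k, so "prediction length" acts like a hyperparameter there. The embedding loss has no vocabulary term at all, which is one reason the two may not be interchangeable.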
