The problem is that the input is tokenized before the model ever sees it. The model does not receive the individual letters "t" and "o"; it gets a single token, #1462. The word "toe" is likewise a single token, #44579. In principle the model could learn from context that inputs beginning with token #44579 also satisfy the constraint of beginning with token #1462, but that is a lot of statistical work, and it is not going to happen for every combination of letters.
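A toy sketch of why this happens (the vocabulary and the matching rule here are hypothetical, but the IDs echo the ones above, and real BPE tokenizers behave in the same spirit): greedy longest-match tokenization swallows "toe" whole, so the output for "toe" shares no token with the output for "to", even though one string is a prefix of the other.

```python
# Hypothetical toy vocabulary: each entry maps a string piece to an opaque ID.
VOCAB = {"to": 1462, "toe": 44579, "e": 7}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization, loosely in the spirit of BPE.

    The model downstream sees only the integer IDs this returns, never
    the characters they were built from.
    """
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in vocab:
                tokens.append(vocab[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

print(tokenize("to"))   # one opaque ID for "to"
print(tokenize("toe"))  # a *different* single ID; no "to" token appears in it
```

From the model's point of view, `[1462]` and `[44579]` are just two unrelated integers; nothing in the input encodes that one word starts with the letters of the other.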