Every small model that has "outperformed" GPT-4 has turned out to be overfit to the benchmark, so I'd say that's the default claim, and the opposite claim is the one we should be skeptical of.
With the exception of task specialization: fine-tuning a small model such as Mistral 7B on a specific set of tasks can outperform GPT-4 on those tasks, with cheaper and faster inference.
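For what it's worth, here's a minimal sketch of what that kind of task-specific fine-tune looks like with LoRA via the Hugging Face transformers/peft stack. The dataset name and hyperparameters are placeholders, not a recipe, and the original comment doesn't specify a method; LoRA is just the common way to make a 7B fine-tune cheap:

```python
# Hedged sketch: LoRA fine-tuning of Mistral 7B on a single task.
# "my_org/my_task_dataset" is a hypothetical dataset with a "text" column.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          TrainingArguments, Trainer,
                          DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto")

# LoRA freezes the base weights and trains small adapter matrices,
# which is what makes fine-tuning a 7B model feasible on one GPU.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

ds = load_dataset("my_org/my_task_dataset", split="train")
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
            batched=True, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mistral-7b-task-lora",
                           per_device_train_batch_size=4,
                           gradient_accumulation_steps=4,
                           num_train_epochs=3,
                           learning_rate=2e-4,
                           bf16=True,
                           logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("mistral-7b-task-lora")
```

The point is that the adapter specializes the model to one narrow distribution, which is exactly why the gains don't transfer to multi-task leaderboards.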
Not on the leaderboards mentioned here. That's my point: you can overfit for specific tasks, but you can't beat GPT-4 on multi-task leaderboards without training on the test data.