pbhj 2y ago

>harvesting closed captions out of YouTube videos

I'd bet a lot of YouTubers are using LLMs to write and/or edit content. So we pass that through a human presentation, then introduce some errors in the form of transcription, then feed the output back in as part of a training corpus ... we'd plateau real quick.

It seems hard to get past the level of human intelligence at which there's still a large enough corpus of training data, or enough trainers?

Anyone know of any papers on breaking this limit to push machine learning models to super-human intelligence levels?

pixl97 2y ago

If a model is at average human intelligence in pretty much everything, is that super-human or not? Simply put, we as individuals aren't average at everything: we each have what we're good at and a great many things we're not. The average only shows up when you look at broad population trends. That's why most of us in the modern age spend a lot of time specializing in whatever we work in. Which brings us to the likely next source of data: a Manna-like (the story) data collection program where companies hoover up everything they can on their above-average employees, until most models are well above the human average in most categories.
