This, IMO, is the actual biggest problem with LLMs training on whatever the biggest available text corpus happens to be: they don't account for the fact that not all text is equally worthy of next-token prediction. The problem is completely solvable, almost trivially so, but I haven't seen anyone publicly describe a (scaled, in-production) solution yet.
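
To make the idea concrete, here's a minimal sketch of one obvious approach: weight each document's contribution to the next-token loss by a quality score (say, from a classifier), so low-quality text still contributes but counts for less. Everything here is hypothetical illustration, not anyone's production recipe; `doc_quality` and `quality_weighted_loss` are made-up names, and how you'd actually produce the scores is the hard part I'm waving away.

```python
import torch
import torch.nn.functional as F

def quality_weighted_loss(logits, targets, doc_quality):
    """Cross-entropy where each document's tokens are down-weighted
    by a per-document quality score in [0, 1].

    logits:      (batch, seq_len, vocab) model outputs
    targets:     (batch, seq_len)        next-token labels
    doc_quality: (batch,)                hypothetical quality scores
    """
    # Per-token loss, no reduction, reshaped back to (batch, seq_len)
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    ).reshape(targets.shape)
    # Broadcast one quality weight across every token of its document
    weights = doc_quality.unsqueeze(1)
    # Weighted mean: normalize by total weight, not token count
    return (weights * per_token).sum() / weights.expand_as(per_token).sum()

# Toy usage with random data in place of a real model and batch
batch, seq_len, vocab = 4, 16, 1000
logits = torch.randn(batch, seq_len, vocab)
targets = torch.randint(0, vocab, (batch, seq_len))
quality = torch.tensor([1.0, 0.9, 0.3, 0.1])  # e.g. classifier outputs
print(quality_weighted_loss(logits, targets, quality))
```

The mechanism itself really is trivial (a few lines of loss reweighting); the open question is getting quality scores that are reliable at corpus scale, which is presumably why nobody has described a deployed version publicly.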