I think a lot of it depends on what you mean by “large enough”.
In principle, a data set could be infinitely large yet still miss little edge cases here and there, because much of it is repetitive. So you might be OK if you had both infinite size and infinite diversity.
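To make that concrete, here's a toy sketch in Python (all the numbers are invented for illustration): even ten million samples from a standard normal distribution will almost certainly contain zero 6-sigma events, so sheer volume alone doesn't buy you coverage of the rare tails.

```python
import numpy as np

rng = np.random.default_rng(42)

# Ten million "examples" drawn from a narrow distribution.
data = rng.normal(0, 1, 10_000_000)

# How many of them cover the rare tail beyond 6 standard deviations?
# P(|Z| > 6) is about 2e-9, so the expected count here is ~0.02.
print(np.sum(np.abs(data) > 6))  # almost certainly prints 0
```

A model trained on this data has effectively never seen a 6-sigma input, no matter how many samples you pile on.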
Even if you had a very large but finite data set, say all language ever produced by humankind, the second you finish training, what your overfit model knows is locked in.
The world as we know it would continue to generate vast amounts of new data that your model might not be able to generalize to.
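Here's a toy illustration of that staleness (again, every specific here is made up): fit a simple model on "pre-training" data, then let the world drift and evaluate on data generated after training finished.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training era": the world follows y = 2x + noise.
x_train = rng.uniform(0, 10, 1_000)
y_train = 2.0 * x_train + rng.normal(0, 0.5, x_train.size)

# Fit a linear model (ordinary least squares via polyfit) and freeze it.
slope, intercept = np.polyfit(x_train, y_train, 1)

# "Post-training era": the world drifts to y = 2x + 5 + noise.
x_new = rng.uniform(0, 10, 1_000)
y_new = 2.0 * x_new + 5.0 + rng.normal(0, 0.5, x_new.size)

def mse(x, y):
    pred = slope * x + intercept
    return np.mean((pred - y) ** 2)

print(f"MSE on training-era data: {mse(x_train, y_train):.2f}")  # small, ~0.25
print(f"MSE on post-shift data:   {mse(x_new, y_new):.2f}")      # large, ~25
```

The frozen model is still a perfect fit for the world it was trained on; it's the world that moved.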