
0 points · esafak · 10mo ago · 0 comments

...because stochastic methods are implicit regularizers, leading to solutions that generalize better. Let's spell it out for those who don't know.

https://www.inference.vc/notes-on-the-origin-of-implicit-reg...


jampekka · 10mo ago

OLS is a convex optimization problem, so this doesn't really apply. And for statistical analysis you really don't want to add poorly understood artificial noise to the parameter estimates anyway.
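
To make the convexity point concrete, here is a minimal sketch, assuming a toy one-parameter least-squares problem with a unique minimizer. The data, the step size, and the helper names `closed_form` and `gradient_descent` are made up for illustration. Full-batch gradient descent from two very different starting points lands on the same closed-form OLS solution, so the optimizer has no freedom to "choose" among minima.

```python
# Toy 1-D least squares through the origin: y ≈ w * x.
# All numbers here are illustrative assumptions, not data from the thread.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def closed_form(xs, ys):
    """OLS solution for y = w * x: w = sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def gradient_descent(xs, ys, w0, lr=0.01, steps=2000):
    """Full-batch gradient descent on the mean squared error."""
    w = w0
    n = len(xs)
    for _ in range(steps):
        # d/dw of mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


w_star = closed_form(xs, ys)
w_a = gradient_descent(xs, ys, w0=-5.0)
w_b = gradient_descent(xs, ys, w0=10.0)

print(w_star, w_a, w_b)  # all three should agree to many decimal places
```

The sketch only covers the well-posed case where the minimizer is unique; it says nothing about what noise does to the estimates along the way.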

esafak (OP) · 10mo ago

In general you do, because the unbiased estimates have higher generalization error. You are already dealing with sampling noise. I am not an expert in optimization, and I don't know what "poorly understood" means to you, but I know there is a fair amount of research on the properties of SGD noise; e.g., https://francisbach.com/rethinking-sgd-noise/

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning https://arxiv.org/abs/2301.13703
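
One way to read the claim about unbiased estimates is the usual bias-variance trade-off. The sketch below is an illustration under stated assumptions, not something taken from the linked posts: it uses the exact mean squared error of a scaled sample mean, `c * sample_mean`, for data with mean `mu` and standard deviation `sigma`. The values of `mu`, `sigma`, `n`, and the shrinkage factor `c` are hypothetical, and the function name `mse_scaled_mean` is invented for the example.

```python
def mse_scaled_mean(c, mu, sigma, n):
    """Exact MSE of the estimator c * sample_mean for n i.i.d. draws
    with mean mu and standard deviation sigma.

    MSE = variance + bias^2, where
      variance = c^2 * sigma^2 / n
      bias     = (c - 1) * mu
    """
    variance = c ** 2 * sigma ** 2 / n
    bias = (c - 1.0) * mu
    return variance + bias ** 2


# Hypothetical setting: small sample, noisy data.
mu, sigma, n = 1.0, 2.0, 10

mse_unbiased = mse_scaled_mean(1.0, mu, sigma, n)  # plain sample mean (c = 1)
mse_shrunk = mse_scaled_mean(0.8, mu, sigma, n)    # biased: shrunk toward zero

print(mse_unbiased, mse_shrunk)
```

In this setting the shrunk estimator has the lower error even though it is biased. The gain depends on the chosen `c` and on the unknown `mu`, so too much shrinkage makes things worse again.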
