
jph00 2y ago

AFAIK re-warming it up and then gradually decreasing it again ought to work fine. Have you seen any research showing that it doesn't?
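A minimal sketch of what "re-warming and then gradually decreasing" could look like, assuming "it" refers to the learning rate. The function name `rewarm_lr`, its parameters, and the choice of linear ramps are all illustrative assumptions, not something specified in this thread.

```python
def rewarm_lr(step, resume_step, warmup_steps, decay_steps, peak_lr, final_lr=0.0):
    """Learning rate after resuming training at `resume_step`.

    Hypothetical schedule: linear re-warmup to `peak_lr`, then linear
    decay to `final_lr`, then hold at `final_lr`.
    """
    t = step - resume_step
    if t < 0:
        raise ValueError("step precedes resume_step")
    if t < warmup_steps:
        # Ramp up from a small value to peak_lr over warmup_steps steps.
        return peak_lr * (t + 1) / warmup_steps
    # Fraction of the decay phase completed, capped at 1.0.
    frac = min((t - warmup_steps) / decay_steps, 1.0)
    return peak_lr + (final_lr - peak_lr) * frac
```

Calling `rewarm_lr(step, resume_step=1000, warmup_steps=10, decay_steps=100, peak_lr=1e-3)` once per step would give the value to set on the optimizer. Whether this matches an uninterrupted schedule in final quality is exactly what the replies below debate.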


fpgaminer 2y ago

That would work, in that it would allow one to continue decreasing the loss, but I wouldn't say that it would work "fine". A model trained with restarts always performs worse than a model trained for the same duration without restarts.

two_in_one 2y ago

> A model trained with restarts always performs worse than a model trained for the same duration without restarts.

A citation would be nice. In my experience a restart is sometimes required: when the model becomes unstable and 'explodes', or gets stuck in a local minimum. This is common with GANs. I usually roll the model back a bit but keep the latest discriminator, so that the discriminator 'knows' what to expect. It works in most cases, except for the 'fatality' case, when the model blows up no matter what. That's the end of training.

jph00 (OP) 2y ago

I haven't seen any research that supports your contention. SGDR (SGD with warm restarts) has been shown to work well: https://arxiv.org/abs/1608.03983
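For reference, a rough sketch of the cosine-annealing-with-warm-restarts schedule described in the SGDR paper. The function name `sgdr_lr` and the default values are assumptions chosen for illustration; only the annealing formula and the idea of lengthening each cycle by a multiplier come from the paper.

```python
import math

def sgdr_lr(epoch, eta_max, eta_min=0.0, t_0=10, t_mult=2):
    """Cosine-annealed learning rate with warm restarts.

    t_0 is the length of the first cycle in epochs; each later cycle is
    t_mult times longer (t_mult is assumed to be an integer >= 1).
    `epoch` may be fractional to update within an epoch.
    """
    t_i, t_cur = t_0, epoch
    # Walk forward through the cycles until `epoch` falls inside one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Anneal from eta_max (start of cycle) toward eta_min (end of cycle).
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))
```

With these defaults the rate jumps back to `eta_max` at epochs 10 and 30. PyTorch ships a comparable scheduler as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, which is likely the better choice in practice.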
