I may be a bit behind the times, but I'm also mystified by the popularity of "deep learning". Both giant neural nets and kernel methods have overfitting problems: torture a billion-parameter model long enough, and it will tell you what you want to hear.
SVMs address this by maximizing the margin between the decision boundary and the nearest training points, which will hopefully improve generalization. DNNs (I think) address it by throwing more ("big") data at the problem and hoping that the training set covers all possible inputs. Work on adversarial examples suggests that DNNs can go completely off the rails when an input is perturbed only slightly from one they classify correctly.
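
To make the margin point concrete, here is a minimal sketch of my own (a toy, not taken from any paper). It assumes a plain linear classifier with made-up weights `w` rather than a real DNN, and the helper name `adversarial_shift` is hypothetical. For a linear score `w @ x`, the largest per-coordinate perturbation the input can absorb without changing sign is `|w @ x| / sum(|w|)`, so in high dimensions a tiny nudge to every coordinate can be enough to flip the prediction.

```python
import numpy as np

def adversarial_shift(w, x, eps):
    """Return the score of x and the score after a worst-case nudge of size eps.

    Every coordinate of x moves by at most eps, in the direction that
    pushes the linear score w @ x toward (and possibly past) zero.
    """
    score = float(w @ x)
    # For a linear score, the worst L-infinity perturbation steps against sign(w).
    x_adv = x - np.sign(score) * eps * np.sign(w)
    return score, float(w @ x_adv)

# Hypothetical toy setup: 1000 dimensions, weights of +1 or -1.
rng = np.random.default_rng(0)
d = 1000
w = rng.choice([-1.0, 1.0], size=d)
x = 0.01 * w  # an input the classifier scores as clearly positive

# Largest per-coordinate perturbation that cannot flip the sign.
margin = abs(float(w @ x)) / float(np.abs(w).sum())

# Nudge every coordinate by twice that margin.
score, adv_score = adversarial_shift(w, x, eps=2 * margin)
print(f"margin={margin:.4f} score={score:.2f} adversarial score={adv_score:.2f}")
```

A large-margin method tries to make `margin` big for the training points. Nothing here shows that a deep net behaves the same way, but it suggests why a small margin and many input dimensions make a bad combination.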