(Make no mistake, I can fully understand them: professors paid 80k per year, lacking resources, and fighting bureaucrats. It is a great thing that they are recognised and, at last, paid what they deserve for devoting their lives to science.)
- Vapnik is joining a number of people he previously worked with
- Getting huge computational resources and seeing your ideas applied to real data is rewarding
Edit: "No way" is inaccurate. I should have said it is much easier to do at these companies. Also it is inaccurate to imply this is the only reason these great minds have joined these companies.
I don't see many details here; are you sure that's the case?
There are other reasons a giant of the field might decide to work at Facebook. They might give him more freedom than his previous employer. Perhaps friends of his already work at Facebook. The location and compensation may also play into it.
I don't want to be skeptical for no reason, but you're championing a popular narrative for which I don't see direct support in this instance.
Vapnik is a big theory guy. Though I am not sure he has done anything of great practical importance recently, his immense contribution to ML (the SVM) was made at a time when machines were many orders of magnitude weaker than they are now.
Complex theories do not work, simple algorithms do.
"One of the goals of this book is to show that, at least in the problems of statistical inference, this is not true. I would like to demonstrate that in this area of science a good old principle is valid: Nothing is more practical than a good theory.
-- From Vapnik's preface to The Nature of Statistical Learning Theory*
Vapnik is not well-described as a "theory guy". That implies that he's not interested in connections between theory and practice, and this is most profoundly not the case. He has arguably been the most successful ML researcher ever as far as connecting abstract theory to real-world outcomes.
Besides the SVM: the VC dimension started out as a lemma regarding set counting, and he pushed it to the surprising (even shocking) conclusion of universal consistency for very general classes of estimators.
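For context (this is the textbook statement, not something from the thread): the kind of result being referred to is the VC generalization bound, which says that with probability at least 1 - delta over an i.i.d. sample of size n, every classifier f in a class of VC dimension h satisfies

    % Standard Vapnik-Chervonenkis generalization bound (constants vary by source)
    % R(f): true risk, R_emp(f): empirical risk on the n training samples, h: VC dimension
    R(f) \le R_{\mathrm{emp}}(f) + \sqrt{\frac{h\,\left(\ln\frac{2n}{h} + 1\right) + \ln\frac{4}{\delta}}{n}}

The bound holds uniformly over the whole class and makes no distributional assumptions beyond i.i.d. sampling, which is what makes the consistency result so general.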
I haven't seen this paper before (thanks!!). How different is it to Word2Vec?
Clearly the pre-trained vectors at that scale (much bigger than the ones released with Word2Vec) are new and very exciting.
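In case anyone wants to poke at vectors like these, here is a minimal sketch of loading pre-trained word vectors in the standard word2vec text format and querying nearest neighbours. It assumes gensim is installed; the filename is a placeholder and this is not tied to the specific paper above.

    # Minimal sketch: load pre-trained vectors in word2vec text format
    # ("word v1 v2 ... vN" per line) and query cosine-similarity neighbours.
    # Assumes gensim is installed; "pretrained_vectors.txt" is hypothetical.
    from gensim.models import KeyedVectors

    vectors = KeyedVectors.load_word2vec_format("pretrained_vectors.txt", binary=False)

    # Nearest neighbours by cosine similarity
    print(vectors.most_similar("king", topn=5))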
There's a massive dearth of data in academia. This is also why you see people like Kleinberg working directly with Facebook on network research.
http://blog.bitops.com/blog/2014/06/26/first-steps-for-vr-on...