Jay Kreps, a data scientist on LinkedIn’s social network analysis team, posted this tweet, which resonated strongly with the Twitter community (133 retweets and 64 favorites so far):
“Trick for productionizing research: read current 3-5 pubs and note the stupid simple thing they all claim to beat, implement that.” (Jay Kreps, @jaykreps, July 3, 2012)
And the sad thing is, I kind of agree with it, too. There is a little piece of wisdom in the ML community which says that simple methods often work best. What counts as “simple” depends on whom you ask, but there are enough examples where k-nearest neighbor beats SVMs, linear methods beat more complex ones, or stochastic gradient descent outperforms fancier optimization methods.
I think the main reason for this divide between science and industry is that each area has its own very specific cost function for measuring progress, which leads to quite different main activities. In a nutshell: academia explores, industry builds.
The two main driving forces behind scientific progress are “advancing the state-of-the-art” and “novelty”. In my experience, these criteria are much higher on the list than “Does it solve a relevant problem?” And it’s probably also not necessary to be relevant (yet). The standard argument here is number theory, which eventually became the foundation for cryptography, without which business on the Internet wouldn’t work as it does right now. So we never know, right?
Now if the main forces are improvements over previous work and novelty, what kind of dynamics do we end up with? To me, it increasingly seems like research is about covering as much ground as possible. It’s like performing stochastic gradient ascent with rejection sampling based on the lack of novelty (that is, closeness to existing work). People are constantly looking for ways to find something new, which hopefully opens up new areas to explore.
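To make the metaphor concrete, here is a minimal toy sketch in Python. Everything in it is an assumption made for illustration: the one-dimensional "idea space", the `score` function, and the names `explore`, `noisy_gradient`, and `min_novelty` are hypothetical and not a real model of how research works.

```python
import random


def noisy_gradient(score, x, rng, noise=1.0, eps=1e-4):
    """Central-difference estimate of d(score)/dx plus Gaussian noise."""
    grad = (score(x + eps) - score(x - eps)) / (2 * eps)
    return grad + rng.gauss(0.0, noise)


def explore(score, start, steps=200, lr=0.5, min_novelty=0.3, seed=0):
    """Toy stochastic gradient ascent with novelty-based rejection.

    `published` holds the positions of existing work. A proposed step is
    rejected if it lands closer than `min_novelty` to anything published.
    """
    rng = random.Random(seed)
    published = [start]
    current = start
    for _ in range(steps):
        # Ascent step: move along a noisy estimate of the gradient.
        proposal = current + lr * noisy_gradient(score, current, rng)
        # Rejection step: too close to existing work means "not novel".
        if min(abs(proposal - p) for p in published) < min_novelty:
            continue
        published.append(proposal)
        current = proposal
    return published


if __name__ == "__main__":
    # Hypothetical "state of the art" landscape with a single peak at x = 3.
    papers = explore(lambda x: -(x - 3.0) ** 2, start=0.0)
    print(len(papers), "accepted contributions")
```

Because every accepted point has to keep its distance from all earlier ones, the process cannot settle on the peak. It is pushed to spread out and cover ground, which is the dynamic described above.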
In industry, on the other hand, the cost function is different. In the end, it’s all about making money (as we all know). And to make money, you have to create value; in other words, you have to build something.
Of course, exploration is important in industry as well (and there are research units within industry whose role is to do exactly that), but once you have some interesting new piece of technology, you actually have to build first a product and then a business on top of it.
Compared to industry, science also stays on a more abstract level. For example, in machine learning you usually have to describe your algorithm mathematically and implement it in some preliminary form to run batch experiments, but it is OK to report only the results without publishing your code. If you really want to, you can go beyond this kind of research software and make your code usable and release it (and we’ve set up mloss and a special track at JMLR to help you get credit for that), but it’s not strictly necessary.
Of course, both approaches are fully justified and serve different purposes. But I personally think that science often misses important insights by staying on that abstract level. Only if you really try your ideas in the wild will you see whether your assumptions were correct or not. The real world is also an indispensable source of ideas and, of course, gives you a measure of relevance to guide your explorations on a larger scale.
So when we’re talking about the relevance and impact of machine learning, I think these issues are partly due to systemic differences in what kind of work different communities consider valuable. I’m not sure there is an easy solution to this. You can personally try to do both, explore and build (and I think there are enough people who do), but time spent on one will always be time not spent on increasing your score in the other metric.
Thanks to Paul Bünau, Andreas Ziehe, and Martin Scholl for listening to my rambling about this topic over lunch.
Posted by Mikio L. Braun at 2012-07-17 21:35:00 +0200