Since the annual NIPS deadline is approaching quickly (June 3), I’d like to share a piece of advice with you that I had to learn the hard way. I’m sure you’ve all experienced something like this in one way or another. This is based on a talk I gave at our group retreat in 2008.
Many paper projects start out innocently enough because someone has an idea for a new method. It might be an extension of an existing method, a combination of existing methods, or something entirely new.
So initially you’ll have to spend some time actually implementing the method. You play around a bit with some toy data, and everything looks just fine.
Of course, a real paper needs to have an experimental section, so you’ll start to go shopping for a “real” data set to test your method on.
More often than not, the actual data set will be from an exotic source, but you probably won’t care as you need all of the remaining time to parse the data and run experiments.
And unless you’re extremely lucky, you’ll find that your method works just as well as everything else. Moreover, if you had taken the application underlying the data set seriously, you would have probably done something completely different anyway.
The paper can go a number of ways from here, including toying with your data until you have some “significant” improvements, but this situation is hardly ideal, of course. Either your method or the experimental data becomes a moving target, which can be highly frustrating.
So what is the lesson to be learned here? I think the main problem is that the question of whether you are solving a relevant problem came last. Usually, one has some vague idea of why a certain extension of a method constitutes an improvement, but this idea is rarely spelled out explicitly.
So, the bottom line: make the question of whether you are solving a relevant problem the first question, not the last, and fix the data and the evaluation criteria before you pour all of your remaining time into the method itself.
This wouldn’t be a typical post if I didn’t add some comments on how the right tool set can help you stick to this procedure. You should automate as many manual tasks as possible. Ideally, you should have one script to run a method on a benchmark data set, and another script which evaluates the results, perhaps even outputting a LaTeX table.
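To give you an idea of what I mean, here is a minimal sketch of such a pair of scripts in Python. The file names, the data set, and the classifier (I’m just using scikit-learn stand-ins here) are all placeholders for whatever your own setup looks like:

```python
# run_method.py -- run one method on one benchmark data set (all names are placeholders)
import json
import sys

from sklearn.datasets import load_digits             # stand-in benchmark data set
from sklearn.linear_model import LogisticRegression  # stand-in for "your method"
from sklearn.model_selection import cross_val_score


def main(method_name="logreg"):
    X, y = load_digits(return_X_y=True)
    model = LogisticRegression(max_iter=1000)
    scores = cross_val_score(model, X, y, cv=5)
    # store the raw results so that evaluation stays a separate, repeatable step
    with open("results_%s.json" % method_name, "w") as f:
        json.dump({"method": method_name, "scores": scores.tolist()}, f)


if __name__ == "__main__":
    main(*sys.argv[1:])
```

And the matching evaluation script, which turns the raw results into LaTeX table rows:

```python
# evaluate.py -- collect all result files and print one LaTeX table row per method
import glob
import json

import numpy as np

for path in sorted(glob.glob("results_*.json")):
    with open(path) as f:
        result = json.load(f)
    scores = np.array(result["scores"])
    print("%s & %.3f $\\pm$ %.3f \\\\" % (result["method"], scores.mean(), scores.std()))
```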
There is a close connection here to the role of testing in agile software development. For example, in test-driven development and behavior-driven development you start by writing code that describes what your software is supposed to do, in the form of executable tests. Starting from there, you then develop your actual code until all of the tests succeed (at least in theory).
Translated to machine learning, this means that you first define the problem you need to solve in terms of:

- one or more data sets which are representative of the application you have in mind,
- an evaluation procedure, ideally in the form of a script which computes the relevant performance measures, and
- baseline methods to compare against.
Based on this infrastructure, you can then proceed to implement and work on your own method, and you can be sure that you won’t lose sight of the actual problem you are solving.
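To make this concrete, here is a minimal sketch of such an “executable problem definition”, again in Python with placeholder names. The data set, the baseline, and the stand-in method are just examples; the important part is that this test exists before your own method does, and only the stand-in gets swapped out later:

```python
# test_problem.py -- an executable definition of the problem, written before the method
from sklearn.datasets import load_digits             # the agreed-upon data set
from sklearn.dummy import DummyClassifier            # the baseline to beat
from sklearn.linear_model import LogisticRegression  # stand-in for your own method
from sklearn.metrics import accuracy_score           # the agreed-upon evaluation measure
from sklearn.model_selection import train_test_split


def evaluate(model):
    """Run the fixed evaluation protocol on the fixed data set."""
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


def test_method_beats_baseline():
    baseline_score = evaluate(DummyClassifier(strategy="most_frequent"))
    # swap the stand-in below for your own method once it exists
    method_score = evaluate(LogisticRegression(max_iter=1000))
    assert method_score > baseline_score
```

You can then run this with pytest after every change and immediately see whether you are still beating the baseline on the agreed-upon measure.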
As always, I’d like to point you to mloss.org for its large collection of open source machine learning methods that can serve as baselines. See also this post for a list of sites with machine learning data sets. Finally, we’re currently working on a site very similar to mloss where you can also submit data sets, which we will release shortly in an alpha stage.
Posted by Mikio L. Braun at 2010-04-29 16:50:00 +0000