Since the annual NIPS deadline is approaching quickly (June 3), I’d like to share a piece of advice with you that I had to learn the hard way. I’m sure you’ve all experienced something like this in one way or another. This is based on a talk I gave at our group retreat in 2008.
Many paper projects start out innocently enough because someone has an idea for a new method. It might be an extension of an existing method, a combination of existing methods, or something entirely new.
So initially you’ll have to spend some time actually implementing the method. You play around a bit with some toy data, and everything looks just fine.
Of course, a real paper needs to have an experimental section, so you’ll start to go shopping for a “real” data set to test your method on.
More often than not, the actual data set will be from an exotic source, but you probably won’t care as you need all of the remaining time to parse the data and run experiments.
And unless you’re extremely lucky, you’ll find that your method works just as well as everything else. Moreover, if you had taken the application underlying the data set seriously, you would have probably done something completely different anyway.
The paper can go a number of ways from here, including toying with your data until you have some “significant” improvements, but this situation is hardly ideal, of course. Either your method or the experimental data becomes a moving target, which can be highly frustrating.
So what is the lesson to be learned here? I think the main problem is that the question of whether you are solving a relevant problem came last. Usually, one has some vague idea of why a certain extension of a method constitutes an improvement, but this idea is rarely spelled out explicitly.
So, the bottom line: make the question of whether you are solving a relevant problem the first question, not the last, and fix the data and the evaluation criteria before you pour all of your remaining time into the method itself.
This wouldn’t be a typical post if I didn’t add some comments on how the right tool set can help you stick to this procedure. You should automate as many manual tasks as possible. Ideally, you should have one script to run a method on a benchmark data set, and another script which evaluates the results, perhaps even outputting a LaTeX table.
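To give you an idea of what I mean, here is a minimal sketch of such a pair of scripts in Python. The file names, the data set, and the classifier (I’m just using scikit-learn stand-ins here) are all placeholders for whatever your own setup looks like:

```python
# run_method.py -- run one method on one benchmark data set (all names are placeholders)
import json
import sys

from sklearn.datasets import load_digits             # stand-in benchmark data set
from sklearn.linear_model import LogisticRegression  # stand-in for "your method"
from sklearn.model_selection import cross_val_score


def main(method_name="logreg"):
    X, y = load_digits(return_X_y=True)
    model = LogisticRegression(max_iter=1000)
    scores = cross_val_score(model, X, y, cv=5)
    # store the raw results so that evaluation stays a separate, repeatable step
    with open("results_%s.json" % method_name, "w") as f:
        json.dump({"method": method_name, "scores": scores.tolist()}, f)


if __name__ == "__main__":
    main(*sys.argv[1:])
```

And the matching evaluation script, which turns the raw results into LaTeX table rows:

```python
# evaluate.py -- collect all result files and print one LaTeX table row per method
import glob
import json

import numpy as np

for path in sorted(glob.glob("results_*.json")):
    with open(path) as f:
        result = json.load(f)
    scores = np.array(result["scores"])
    print("%s & %.3f $\\pm$ %.3f \\\\" % (result["method"], scores.mean(), scores.std()))
```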
There is a close connection here to the role of testing in agile software development. For example, in test-driven development and behavior-driven development you start by writing code that describes what your software is supposed to do, in the form of executable tests. Starting from there, you then develop your actual code until all of the tests succeed (at least in theory).
Translated to machine learning, this means that you first define the problem you need to solve in terms of:

- one or more data sets which are representative of the application you have in mind,
- an evaluation procedure, ideally in the form of a script which computes the relevant performance measures, and
- baseline methods to compare against.
Based on this infrastructure, you can then proceed to implement and work on your own method, and you can be sure that you won’t lose sight of the actual problem you are solving.
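To make this concrete, here is a minimal sketch of such an “executable problem definition”, again in Python with placeholder names. The data set, the baseline, and the stand-in method are just examples; the important part is that this test exists before your own method does, and only the stand-in gets swapped out later:

```python
# test_problem.py -- an executable definition of the problem, written before the method
from sklearn.datasets import load_digits             # the agreed-upon data set
from sklearn.dummy import DummyClassifier            # the baseline to beat
from sklearn.linear_model import LogisticRegression  # stand-in for your own method
from sklearn.metrics import accuracy_score           # the agreed-upon evaluation measure
from sklearn.model_selection import train_test_split


def evaluate(model):
    """Run the fixed evaluation protocol on the fixed data set."""
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


def test_method_beats_baseline():
    baseline_score = evaluate(DummyClassifier(strategy="most_frequent"))
    # swap the stand-in below for your own method once it exists
    method_score = evaluate(LogisticRegression(max_iter=1000))
    assert method_score > baseline_score
```

You can then run this with pytest after every change and immediately see whether you are still beating the baseline on the agreed-upon measure.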
As always, I’d like to point you to mloss.org for its large collection of open source machine learning methods that can serve as baselines. See also this post for a list of sites with machine learning data sets. Finally, we’re currently working on a site very similar to mloss where you can also submit data sets, which we will release shortly in an alpha stage.
Posted by Mikio L. Braun at 2010-04-29 16:50:00 +0000