I always hated questions like “what is an organism” in school. What makes this kind of question hard is that while you typically have quite an elaborate idea of what an organism is, summarizing all of that information in one sentence can be very difficult.
For me, the question “what is machine learning” is very similar. I’ve already tried to answer that question, or at least to characterize what the role of data sets is in machine learning, but I never found those definitions sufficient.
But is it really that important to be able to define what machine learning is? After all, we all know what it is, right? Well, I found that answering what machine learning is to you can actually be very helpful when it comes to deciding which new ideas to follow or planning what to do in the long run. It also helps a lot when answering the question of what I do for a living. Being able to describe what you’re doing is at least better than laughing out loud and then saying “well… .” (that really happens…)
Okay, so the usual definition you often hear is that machine learning is about writing programs which can learn from examples. This is about as vaguely correct and non-committal as it gets. Put a bit more fully, machine learning is about programs which can learn from examples and thereby solve complex problems which are hard to formalize.
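To make the “learning from examples” idea concrete, here is a minimal sketch in plain Python (a toy of my own, not from any particular library): instead of hand-coding rules, a 1-nearest-neighbour classifier derives its behaviour entirely from a labelled data set. The data here is made up purely for illustration.

```python
def nearest_neighbor(train, x):
    """Predict the label of x as the label of the closest training example."""
    closest = min(train, key=lambda pair: abs(pair[0] - x))
    return closest[1]

# Labelled examples: (feature, label) pairs standing in for a data set.
examples = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

print(nearest_neighbor(examples, 1.5))  # -> small
print(nearest_neighbor(examples, 8.5))  # -> large
```

The point is that nothing in the code says what “small” or “large” means; the behaviour comes entirely from the examples, which is exactly why concrete data sets end up standing in for formal problem definitions.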
So far, so good, but when you take this characterization seriously, you see that machine learning is mostly a certain approach, or conceptual framework, for solving computational problems. In particular, machine learning is not directly linked to an application area (like, I don’t know, building fast and reliable databases or computer networks); it is more of a meta-discipline. In this respect, it is maybe more similar to theoretical computer science, or even mathematics, where you develop theory and set aside the question of practical implications because you are confident that eventually somebody will find a way to use it. (After all, who would have thought that number theory would one day be so important for cryptography?)
This also means that we often find ourselves in the position of an intruder in other areas, telling people that a “learning approach” works better than the tools they have been building for the last couple of years. Those of you who work on projects with a strong application focus like bioinformatics, computational chemistry, or computer security certainly know what I am talking about…
So on the one hand, machine learning deals with problems which are hard to formalize, and concrete data sets become a substitute for formal problem definitions. On the other hand, machine learning can be, and often is, studied in quite an abstract fashion, for example, by developing general-purpose learning methods.
But is it really possible to study something abstractly when we so often need concrete data sets to represent the questions and problems? Of course, elaborate theoretical frameworks for modeling learning exist, for example, for supervised learning. But these formalizations are ultimately too abstract to capture what really characterizes the problems we wish to solve in the real world.
I personally think the answer is no, meaning that it’s not possible to do machine learning abstractly, because we need concrete applications and data sets to provide real challenges to our methods.
What this also means is that you need to compete not only with other machine learning people, but also with people who approach the problems in the classical way, by directly writing programs which solve them. While it is always nice (and also very impressive) if you can solve a hard task with a generic machine learning method, you need to compare against “classical” approaches as well to make sure that the “learning approach” really has an advantage.
A few months ago I created an account on Twitter. Part of me just wanted to try out the newest Web 2.0 thing everybody’s crazy about, but I also convinced myself that there was a genuine need: I was going to a wedding without my family and thought that this way I could keep them up to date on my whereabouts.
So basically, I posted a few tweets for about one weekend in German, and that was more or less it.
The funny thing is that after that weekend I got 3 followers, most of whom didn’t even speak German. By now, I have 10 followers, and I really don’t think I deserve them. So why are these people subscribing to my feed?
Well, one person I know personally, and a few seem to follow me because I stated in my profile that I’m working on machine learning, but that still leaves about 5 people, and frankly, I don’t even know how they found my feed.
Anyway, I also haven’t really understood yet what Twitter could do for me. I’m not saying that it doesn’t make sense at all. For example, I’m following Charles Nutter, one of the main guys working on JRuby, and I find his tweets a nice way to track what he is doing and what he is working on.
In my case, however, it doesn’t really work. I’m involved in so many things that people would get seriously confused if I wrote down every little bit (writing a proposal/discussing with students/thinking about world domination (muahahah)/reviewing a paper/fixing cron jobs). I could tweet about my research, but I’m not even sure it would be wise to tell everybody what I’m working on: either it doesn’t work out, and then it could be kind of embarrassing, or it actually works, and then I’m just giving other people ideas about what to look into.
Lately, I’ve had kind of an insight: the penalty for subscribing to a low-volume Twitter feed is quite small (apart from losing track of what the heck you’re subscribed to). Some people have like ten thousand subscriptions. But if most of them don’t post anything useful, everything’s fine. And if you subscribe to somebody who posts a lot but you lose interest, you can get rid of him easily. So maybe everything’s making sense.
Well, I’ll be attending this year’s NIPS conference in December. An excellent opportunity to try Twitter again ;)
Well, the NIPS results are out. If you don’t know it, NIPS is one of the largest (maybe the largest) conferences in machine learning, held each year in early December, and the acceptance notifications were sent out on Saturday.
Unfortunately, none of my papers made it, although one got quite close. On the other hand, I’m very glad to announce that our workshop on machine learning open source software has been accepted. This will be the second (actually third) installment: In 2005, the workshop was not included in the program, but many people found the issue important enough to come to Vancouver a day earlier and take part in a “Satellite Workshop”.
In 2006 we were accepted and actually had a very nice day in Whistler. When I noticed that I was personally enjoying the workshop, I knew that we had managed to put together a nice program. Maybe the highlight was the final discussion session, with Fernando Pereira stating that there is little incentive for researchers to work on software because there is no measurable merit in doing so. Eventually, this discussion led to a position paper and finally to a special track on machine learning software in the Journal of Machine Learning Research.
I’m looking forward to this year’s workshop, and hope that it will be equally interesting and productive!