Marginally Interesting: What is Machine Learning? Revisited

I always hated questions like “what is an organism” in school. What makes this kind of question hard is that, while you typically have quite an elaborate idea about what an organism is, summarizing all of this information in one sentence can be very hard.

For me, the question “what is machine learning” is very similar. I’ve already tried to answer that question, or to better characterize what the role of data sets is in machine learning, but I never found those definitions sufficient.

But is it really that important to be able to define what machine learning is? After all, we all know what it is, right? Well, I found that answering what machine learning is to you can actually be very helpful when it comes to deciding which new ideas to follow or planning what to do in the long run. It also helps a lot when answering the question of what I do for a living. Being able to describe what you’re doing is at least better than laughing out loud and then saying “well…” (that really happens).

Okay, so the usual definition you often hear is that machine learning is about writing programs which can learn from examples. This is about as vaguely correct and non-committal as it gets. Put differently, machine learning is about programs which can learn from examples, and solve complex problems which are hard to formalize.
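To make “learning from examples” a bit more concrete, here is a minimal toy sketch in Python. Everything in it is made up for illustration: the data, the names `fit_threshold` and `predict`, and the choice of a single learned threshold as the “model”. The point is only that the decision rule is not written down by hand but picked from labelled examples.

```python
# Toy illustration only: the data and function names are hypothetical.

def fit_threshold(examples):
    """Learn a 1-D decision threshold from (value, label) pairs.

    Tries every midpoint between neighbouring sorted values and keeps
    the one that misclassifies the fewest training examples.
    """
    values = sorted(v for v, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        # Count examples where the rule "value > t" disagrees with the label.
        return sum((v > t) != label for v, label in examples)

    return min(candidates, key=errors)


def predict(threshold, value):
    """Apply the learned rule: positive if the value exceeds the threshold."""
    return value > threshold


# Labelled examples stand in for a formal problem definition.
examples = [
    (1.0, False), (2.0, False), (3.0, False),
    (7.0, True), (8.0, True), (9.0, True),
]

threshold = fit_threshold(examples)
print(threshold, predict(threshold, 6.0), predict(threshold, 4.0))
```

Nothing in the program says where the boundary between the two classes should be; change the examples and the rule changes with them.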

So far, so good, but when you take this characterization seriously, you see that machine learning is mostly a certain approach, or conceptual framework, for solving computational problems. In particular, machine learning is not directly linked to an application area (like, I don’t know, building fast and reliable databases or computer networks), but is more of a meta-discipline. In this respect, it is maybe more similar to theoretical computer science, or even mathematics, where you develop theory and neglect the question of what the practical implications are because you are confident that eventually somebody will find a way to use it. (After all, who would have thought that number theory would one day be so important for cryptography?)

This also means that we often find ourselves in the position of an intruder in other areas, telling people that a “learning approach” works better than the tools they have been building for the last couple of years. Those of you who work on projects with a strong application focus like bioinformatics, computational chemistry, or computer security certainly know what I am talking about…

So on the one hand, machine learning deals with problems which are hard to formalize, and concrete data sets become a substitute for formal problem definitions. On the other hand, machine learning can be, and often is, studied in quite an abstract fashion, for example, by developing general-purpose learning methods.

But is it really possible to study something abstractly where we often need concrete data sets to represent certain questions and problems? Of course, elaborate theoretical frameworks for modeling learning exist, for example, for supervised learning. But these formalizations are ultimately too abstract to capture what really characterizes the problems we wish to solve in the real world.

I personally think the answer is no, meaning that it’s not possible to do machine learning abstractly, because we need concrete applications and data sets to provide real challenges to our methods.

What this also means is that you need to compete not only with other machine learning people, but also with people who approach the problems in the classical way, by trying to write programs which solve the problem directly. While it is always nice (and also very impressive) if you can solve a hard task with a generic machine learning method, only a comparison against these “classical” approaches can show that the “learning approach” really has an advantage.

Posted at 2008-10-20 15:55:00 +0000
