Machine Learning: Beyond Prediction Accuracy
Okay, let's start this year with something controversial. The other day I met up with some old friends from university and we got to discussing machine learning. They were a bit surprised when I disclosed that I had become somewhat disillusioned with machine learning, because I felt that any resemblance of intelligence was missing from even the most successful methods.
The point I tried to make was that the "mind" of machine learning methods is just like a calculator. If you take a method like the support vector machine (and by extension practically all kernel methods), you will see that the machine in no way "understands" the data it is dealing with. Instead, millions of low-level features are cleverly combined linearly to produce predictions of remarkable accuracy.
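To make the calculator analogy concrete, here is a minimal sketch (in Python with numpy; the support vectors, weights, and bias are made up purely for illustration) of what a trained kernel SVM computes at prediction time: a weighted sum of kernel evaluations between the input and the stored support vectors, and nothing more.

```python
import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    # RBF kernel: similarity between input x and support vector z
    return np.exp(-gamma * np.sum((x - z) ** 2))

def svm_decision(x, support_vectors, alphas, labels, bias, gamma=0.5):
    # Decision value: sum_i alpha_i * y_i * k(x_i, x) + b
    k = np.array([rbf_kernel(sv, x, gamma) for sv in support_vectors])
    return np.dot(alphas * labels, k) + bias

# Made-up support vectors and coefficients, just so the function runs
support_vectors = np.array([[0.0, 1.0], [1.0, 0.0]])
alphas = np.array([0.7, 0.4])
labels = np.array([1.0, -1.0])
bias = 0.1

x_new = np.array([0.2, 0.8])
print(np.sign(svm_decision(x_new, support_vectors, alphas, labels, bias)))
```

However clever the kernel, the prediction is still just this weighted sum over stored examples; whatever "understanding" we read into it lives entirely in our interpretation.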
In contrast, when we humans learn to predict something, it seems (and I know that there is a risk with introspection when it comes to mental processes) that we also begin to understand the data, decompose it into parts, and understand the relations between these parts. This allows us to reason about the parts, and it also prepares us to transfer our knowledge to new tasks. Leaving prediction tasks aside, we can even just take in a whole set of data and understand its structure and what it is.
I went on to explain that the same is true even of more complex, structured models like neural networks or graphical models. With neural networks, it is still unclear whether they really achieve internal representations with semantic content (but see our NIPS paper where we try to better understand how representations change within a neural network). As for graphical models, the way I see it, they are very good at extracting a certain type of structure from the data, but that structure must be built into the network beforehand.
Actually, my complaint went further than this. Not only do even the most successful methods lack any sign of real intelligence and understanding of the data, but the community as a whole also seems content to just keep following that path.
Almost every time I have tried to argue that we should think about methods which are able to autonomously "understand" the data they are working with, people say one of these two things:
- Why would that get me better prediction accuracy?
- The airplane analogy: airplanes are also much different from birds, with their stiff wings and engines. So why should learning machines be modeled after the human mind?
People seem to be very much focused on prediction accuracy. And while there are many applications where high prediction accuracy is exactly what you want to achieve, from a more scientific point of view this focus represents an extremely reductionist view of human intelligence and the human mind.
Concerning the second argument, people overlook the fact that while airplanes and birds are designed quite differently, they are nevertheless based on the same principles of aerodynamics. And as far as I can see, SVMs and the human mind are based on entirely different principles.
Learning Tasks Beyond Prediction Accuracy
At this point, my friends said something which made me think: if everyone is so focused on benchmarks, and benchmarks apparently do not require us to build machines which really understand the data, it is probably time to look for learning tasks which do.
I think this is a very good question. Part of the problem is of course that we have a very limited understanding of what the human mind actually does. Or put differently, being able to formally define the problem is probably already more than half of the answer.
On the other hand, two problems come to mind which seem to be good candidates. Solutions exist, but they don't really work well in all cases, probably due to a lack of "real" intelligence: computer vision (mainly object recognition) and machine translation.
If you look at current benchmark data sets in computer vision like the Pascal VOC challenges, you see that methods already achieve quite good predictions using a combination of biologically motivated features and "unintelligent" methods like SVMs. However, if you look closer, you also see that some categories are inherently much harder than others. Looking at the classification results, you can detect airplanes with 88% average precision, but bottles only with 44%, and dining tables with 57%. I'd say the reason is that you cannot just "learn" each possible representation of a dining table in terms of low-level features; you need to develop a better "understanding" of the world as seen in two-dimensional images.
I don't have similar numbers for machine translation, but taking Google Translate as an example (I think it is safe to say that they are using the most extensive training set for their translations), you see that the results are often quite impressive, but only in the sense that a human can still understand what the text is about by parsing a result which is full of errors on every level.
These are two examples of problems which are potentially hard enough to require truly "intelligent" algorithms. I still have no idea what exactly that might mean, but it's probably neither "symbolic" nor "connectionist" nor Bayesian; perhaps some mix of automatic feature generation and something like the EM algorithm.
Note that I didn't include the Turing test. I think it is OK (and probably even preferable) if the task is defined in a formal way (and without referring back to humans, for that matter).
In any case, I enjoyed talking to "outsiders" very much. They are unaware of all the social implications of asking certain kinds of questions (What are the hot topics everyone is working on right now? Will this make the headlines at the next conference? What do senior researchers think about these matters?), and every once in a while we should also step back and ask ourselves what questions we are really interested in, independently of whether there is any support for them in the community or not.
Comments (11)
I doubt we'll ever get true understanding using algorithms running on binary machines, because no matter how many CPUs you have, it's still not a brain filled with neurons & neurotransmitters. However, AI with real understanding may be possible using actual brain matter, see http://www.sciencedaily.com.... Apparently a partial rat brain can be trained to fly, at least in a simulator. If that's true, then it seems to me that machine learning algorithms wouldn't be nearly as important as sensory input & output + well designed training processes.
Hi Jacob, thanks for your comment.
I think what you can do even with normal computers is trying to model the "mechanics" of what happens in a real brain. In that view the brain is "just" a different computational platform to implement those mechanics in. Of course, it might be that the problems involved are intrinsically unfit for sequential computation as done in normal computers (as opposed to massively parallel computations in the brain), but I don't see a reason per se for that.
I'm actually not that convinced that just remodelling a real brain in software (or even hardware) will give us a lot of insight. I'm interested in understanding how the human mind accomplishes what it does, not just in rebuilding it. Or put differently, even if we built some artifact which reproduced intelligent behavior just by copying the structure of the brain, we still wouldn't know exactly how it works.
The main issue there is first understanding brain mechanics at the lowest level. What happens during problem solving, creative writing, intuitive insight, etc. I'm sure these processes are hugely complicated at the molecular level. Even if we did understand that level of detail and could re-create it, I think you're right that we still wouldn't understand how that detail translates into mental thoughts & processes. But I'm sure that attempting to model it could yield enormous insight.
Personally, I don't believe mind/consciousness is entirely physical - the brain/body may just be a medium connecting mind & reality. If that's actually true, then understanding the mind through observing the brain would be mostly futile, because you'd only be observing a reflection, not the true thing. But that's just my own metaphysical belief :)
Concerning the mind/consciousness divide, I think one has to assume, as a working hypothesis, that the mind is entirely generated by the brain.
I agree that understanding the brain at a molecular level will give enormous insights. But I also believe that you can start to think on the next higher level and try to discover the "mechanics of the mind", or something like it, without going through the actual neuronal activity. As always, there are many different ways of attacking the problem.
I know this is a fairly old post, but I enjoyed it enough to drop by in the comment section.
To me it sounds like you're describing the quest for artificial general intelligence (AGI). It's a term I've learned only recently thanks to this video of Demis Hassabis at the Singularity Conference: http://vimeo.com/17513841 .
I'm currently in an application/admissions cycle for graduate school with the goal of hopefully doing research on AGI. One of my informal mentors posed a question on the relationship of machine learning and AGI that has left me pondering for quite a while about my research desires. Basically, if machine learning achieves, via "brute force" of data and computation, the results you desire from an AGI, why bother with AGI anyway? An AGI may do object recognition elegantly, but if ML techniques combined with enormous image data sets get results that are just as good and can be applied now, why bother? One immediate answer may be the reduced computational needs of an elegant AGI solution, but would that be a moot point with the improvement of hardware and software? Of course, these questions are posed from the practical standpoint of application. For me they don't negate the desire to learn how the brain works, but I'll be curious to hear your take on it.
I think machine learning and artificial intelligence are actually quite orthogonal concepts. The main idea behind machine learning is that you can get a computer to solve complex prediction or data analysis tasks by extracting the relevant information from huge heaps of example data. The alternative is to fully understand the problem and program the solution directly (the more common approach in computer science).
So you can also approach AI in the usual way: try to understand whatever it is that intelligence does and then write the program to solve it. I think a lot of symbolic AI has in a way followed this approach. An ML approach to symbolic AI, on the other hand, would take a lot of examples and a model flexible enough to represent the solution, and then "learn" the necessary parameters from the example data.
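To illustrate the contrast with a toy problem of my own choosing (nothing from the discussion above): deciding whether a 2D point lies inside the unit circle, once by programming the solution directly, and once by "learning" it from labeled examples, with a simple nearest-neighbor rule standing in for a sufficiently flexible model.

```python
import numpy as np

# (1) Classical approach: we understand the problem, so we program it directly.
def inside_circle_programmed(x):
    return float(np.sum(x ** 2) < 1.0)

# (2) ML approach: labeled examples plus a flexible model. Here a k-nearest-
# neighbor rule plays the role of "a model flexible enough to represent the
# solution"; its behavior is determined entirely by the training data.
rng = np.random.default_rng(0)
train_x = rng.uniform(-1.5, 1.5, size=(500, 2))
train_y = np.array([inside_circle_programmed(p) for p in train_x])

def inside_circle_learned(x, k=5):
    dists = np.sum((train_x - x) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(np.mean(train_y[nearest]) > 0.5)

x_new = np.array([0.3, 0.4])
print(inside_circle_programmed(x_new), inside_circle_learned(x_new))
```

The learned version never contains the concept "circle" anywhere; it just reproduces the right answers from examples, which is exactly the distinction at issue.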
You can't beat a complex NN trained for a lifetime (the brain), one that also incorporates models from thousands of generations before (the genes), with a few examples and a regression function.
Hi Asenz,
no offense, but just saying "it can't be done" is not a very scientific approach to this question. This is a bit like people saying "don't work on this problem because others have been working on this for decades" (something people really say practically every time you try to bring a new approach to some problem which has existed for some time).
Let's stick with the latter argument and come back to what you said. So even if people have been working on this stuff for decades, every year new people arrive after a few years of training and start working in the field. Often, postdocs also change fields after getting their Ph.D., getting up to speed in a few months or years. The point is, you can learn this stuff much faster than it takes to explore it.
So it is right that it took ages to get where we are right now, that there is a lifetime of experience and thousands of generations behind it, but that doesn't mean it will take as long to understand the principles involved. Even if it were true that you cannot just rebuild the thing, scientific curiosity demands that you understand why not, and what the limitations are.
So there are a lot of interesting questions and insights waiting for us to be discovered, even if we will ultimately fail, or are still very far from solving these problems. But just saying "well, we'll never be able to do it because it's just so complex" is not a valid argument IMHO.
-M
Thank you for the answer Mikio. Let me explain myself in more detail. You've compared today's ML techniques to a calculator rather than to an intelligent brain. A calculator is just what it is, a very unintelligent brain (you get my point). Any ML method that strives to compete with an intelligent human brain would need to incorporate the capabilities that a brain has and then add some more on top. My assumption is that we already have the methods (ANN), and even advanced ones at that (SVM/R), but we lack the technology to decode the information already present, store it, and process it. I don't believe there is more to hard AI than the advanced technology needed to implement it.
I don't see why you can't. But it likely won't be like any kind of regression people are familiar with. The question is, if you were to find this function via some method, could you say that there is something in the method that "understands"?
If your method was sufficiently brainlike, then perhaps you could convince yourself that there is some "understanding." We could probably relate to what the method does, because as humans, we know what it's "like" to understand. But what if there's a method that has absolutely nothing to do with how the brain works? In that case, what would "understanding" even mean?
I've wondered about how we can really define intelligence in a useful way at all. The only examples we have of intelligence are things with brains. But that is a rather trivial definition. What is it *about* brains, computationally, that gives rise to intelligence? Are there other systems that have similar properties that we can use to generalize the computational principles of intelligence? I can't think of any that are close enough to make useful generalizations. So we seem to be stuck with this single class of examples of intelligence. If we accept this narrow view of intelligence, then we need to understand the brain first to create it. Then, perhaps, we can identify the general principles that make the brain work and start noticing those principles at work in unexpected places.