On Monday, May 31, 2010, the German president resigned from his office completely unexpectedly. Immediately, a discussion started about who should be nominated as his successor. Initially, many people pointed to Ursula von der Leyen, but, again quite unexpectedly, Christian Wulff became the coalition's candidate. In this post (originally written in German due to the locality of the events), I discuss how these events are reflected in retweet trends on Twitter. We have been collecting retweets for our site twimpact.com for a while now, and this seemed like a perfect example for studying the relationship between real-world events and Twitter.
For some time now we have been running the website twimpact.com, which collects retweets from Twitter and computes retweet trends from them. Based on this, we wanted to see to what extent the events following Horst Köhler's resignation are reflected on Twitter.
As a reminder: on Monday, May 31, 2010, Horst Köhler resigned completely unexpectedly from his office as Federal President. This had been preceded by remarks about foreign deployments of the Bundeswehr which caused considerable displeasure. Still, nobody had expected that Horst Köhler would take this as a reason to step down.
After the sudden resignation, the question of a successor quickly came up. Fairly soon, the media floated Ursula von der Leyen as a possible presidential candidate. In the end, however, the coalition nominated Christian Wulff on the afternoon of June 3. In response, the opposition brought Joachim Gauck into play.
It later became known that Wulff had already given Merkel his agreement by Wednesday noon, but supposedly not even von der Leyen was informed about this.
We have plotted the retweet frequency for certain keywords: first "Merkel" and "Köhler", then "Leyen", "Wulff", and "Gauck". If you hover over individual data points, you can see the top retweets for the corresponding time interval.
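For those interested in the mechanics, here is a rough sketch of how such a keyword plot can be produced (leaving out the interactive hover part). The `retweets` input, a list of (timestamp, text) pairs with datetime timestamps, is a placeholder and not twimpact's actual data model:

```python
# Rough sketch: hourly retweet counts per keyword. `retweets` is assumed to be
# a list of (timestamp, text) pairs with datetime timestamps -- a placeholder,
# not twimpact's actual data model.
from collections import Counter
import matplotlib.pyplot as plt

def hourly_counts(retweets, keyword):
    """Count retweets mentioning `keyword`, bucketed by hour."""
    counts = Counter()
    for timestamp, text in retweets:
        if keyword.lower() in text.lower():
            counts[timestamp.replace(minute=0, second=0, microsecond=0)] += 1
    return counts

def plot_keywords(retweets, keywords):
    """Plot one line per keyword, e.g. plot_keywords(rts, ["Leyen", "Wulff", "Gauck"])."""
    for keyword in keywords:
        counts = hourly_counts(retweets, keyword)
        hours = sorted(counts)
        plt.plot(hours, [counts[h] for h in hours], label=keyword)
    plt.xlabel("time")
    plt.ylabel("retweets per hour")
    plt.legend()
    plt.show()
```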
You can see that the news of the resignation came completely unexpectedly. It is also interesting that at first the actual reports from the news agencies dominate, but soon comments and jokes about the resignation show up as well.
A similar picture emerges from the following plot, which compares the three main candidates. At first, Wulff does not play a major role, and the retweets are mostly about von der Leyen. Only after the official announcement does Wulff appear, and von der Leyen disappears almost at the same time. It is also interesting how Gauck shows up a few hours after Wulff, as a reaction to his nomination.
Looking at the retweets, it also becomes clear that the Internet community does not think highly of von der Leyen. Her proposals for restricting Internet access were not well received and have still not been forgotten.
I have to admit that I have always found the Bayesian vs. frequentist divide quite silly, at least from a pragmatist point of view. Instead of taking the alternative viewpoint as an inspiration for rethinking one's own approach, modern statistics seems more or less stuck, with the Bayesians probably being the more stubborn of the two. At least in my own experience, Bayesians have been more likely to dismiss the frequentist approach outright than the other way round.
Recently I came across two papers that discuss aspects of the history of modern statistics and which I found very helpful for understanding how we arrived at the current situation. The first one is Sandy Zabell's "R. A. Fisher on the History of Inverse Probability", and the other is "When Did Bayesian Inference Become 'Bayesian'?" by Stephen Fienberg.
I had always silently assumed that the Bayesian and frequentist points of view had somehow been around from the very beginning, and that neither of the two approaches had managed to become the predominant one. I found it quite plausible that such things happen, probably because the unifying "true" point of view hasn't been discovered yet.
However, it turns out that the Bayesian point of view was actually the predominant one until the 1920s. In fact, the first ideas on statistical inference, attributed to Laplace and Bayes, were Bayesian in nature. The term "Bayesian" didn't even exist back then; instead, the application of Bayes' theorem was known as "inverse probability", in analogy to the inversion of a function.
Then, at the beginning of the 20th century, the frequentist point of view emerged, in particular through the works of R. A. Fisher, Neyman, Pearson, and others who laid the foundations of modern classical statistics. The main point of critique was that Bayesian inference requires the specification of a prior distribution, which introduces a subjective element: the inference depends not only on the data but also on your own assumptions. In his 1922 paper "On the Mathematical Foundations of Theoretical Statistics", Fisher proposed alternative criteria for sound statistical inference such as consistency, efficiency, and sufficiency.
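In modern notation, the "inversion" is simply Bayes' theorem, which turns the sampling distribution $p(x \mid \theta)$ into a distribution over the parameter:

$$p(\theta \mid x) \;=\; \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta')\, p(\theta')\, \mathrm{d}\theta'} \;\propto\; p(x \mid \theta)\, p(\theta).$$

The prior $p(\theta)$ on the right-hand side is exactly the subjective ingredient the frequentists objected to.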
These works had a tremendous impact at the time, pushing Bayesian inference into the background as everyone jumped on the new frequentist bandwagon. Only in the 1950s did the Bayesian approach reemerge and slowly start to attract researchers again.
In summary, my understanding that both approaches had always been equal rivals was wrong. Instead, the Bayesian approach was almost eclipsed by the frequentist one and only started to recover some 50-60 years ago. This means that your professor might have studied with someone who still remembers the days when Bayesian inference was trying to make a comeback and get its share of the cake.
Or, put differently, the reason that frequentists usually don't care which label they go by might be that, historically, frequentists simply conquered the whole field when they arrived, while Bayesians still work under the somewhat traumatic impression of having been eclipsed by an alternative approach.
With which I’m not implying that this is still the case. On the contrary, I think Bayesian and frequentist approaches have long entered the state of being on equal footing, at least in the area of machine learning. Therefore, I think it is probably time that both sides start to realize that they actually have more in common than it seems. That is something I’d like to cover in another post.
Since the annual NIPS deadline is approaching quickly (June 3), I'd like to share a piece of advice I had to learn the hard way. I'm sure you've all experienced something like this in one way or another. This post is based on a talk I gave at our group retreat in 2008.
Many paper projects start out innocently enough because someone has an idea for a new method. It might be an extension of an existing method, a combination of existing methods, or something entirely new.
So initially you’ll have to spend some time actually implementing the method. You play around a bit with some toy data, and everything looks very fine.
Of course, a real paper needs to have an experimental section, so you’ll start to go shopping for a “real” data set to test your method on.
More often than not, the actual data set will be from an exotic source, but you probably won’t care as you need all of the remaining time to parse the data and run experiments.
And unless you’re extremely lucky, you’ll find that your method works just as well as everything else. Moreover, if you had taken the application underlying the data set seriously, you would have probably done something completely different anyway.
The paper can go a number of ways from here, including toying with your data until you get some "significant" improvement, but this situation is hardly ideal, of course. Either your method or the experimental data becomes a moving target, which can be highly frustrating.
So what is the lesson to be learned here? I think the main problem is that the question of whether you are solving a relevant problem came last. Usually, one has some vague idea of why a certain extension of a method constitutes an improvement, but this idea is rarely spelled out explicitly.
So, on the bottom line: ask first whether you are solving a relevant problem, fix the data set and the evaluation criteria before you start developing your method, and compare against sensible baselines from the beginning rather than as an afterthought.
This wouldn’t be a typical post if I wouldn’t add some comments regarding how the right tool set can help you sticking to these procedures. You should automate as many manual tasks as possible. Ideally, you should have one script to run a method on a benchmark data set, and another script which evaluates the results, probably even outputs a LaTeX table.
There is a close connection here to the role of testing in agile software development. For example, in test-driven development and behavior-driven development, you start by writing code that describes what your software is supposed to do, in the form of executable tests. Starting from there, you develop your actual code until all of the tests pass (at least in theory).
Translated to machine learning, this means that you first define the problem you need to solve in terms of:

- the data set(s) you want to work on,
- the evaluation procedure and error measure that define success, and
- the baseline methods you want to compare against.
With this infrastructure in place, you can then proceed to implement and work on your own method, having made sure that you won't lose sight of solving the actual problem.
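Continuing the sketch above, the "test-first" version of this could look roughly like the following pytest-style check; `DummyClassifier` is the fixed baseline, and the k-nearest-neighbor classifier is merely a stand-in for whatever method you are actually developing:

```python
# Sketch of the "test-first" idea for machine learning: data set, evaluation,
# and baseline are fixed up front; you then work on your method until this
# check passes. The k-NN classifier is a placeholder for your own method.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.dummy import DummyClassifier
from sklearn.neighbors import KNeighborsClassifier

def score(clf):
    """The agreed-upon evaluation: mean 5-fold cross-validated accuracy."""
    X, y = load_digits(return_X_y=True)
    return cross_val_score(clf, X, y, cv=5).mean()

def test_beats_baseline():
    baseline = score(DummyClassifier(strategy="most_frequent"))
    mine = score(KNeighborsClassifier())  # replace with your own method
    assert mine > baseline, f"method ({mine:.3f}) does not beat baseline ({baseline:.3f})"
```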
As always, I’d like to point you to mloss.org for a large collection of open source machine learning methods which can form the baseline. See also this post for a list of sites with machine learning data sets. Finally, we’re currently working on a site very similar to mloss where you can also submit data sets which we will shortly release in an alpha stage.