A research paper published this past week suggests that, in an analysis of 435 U.S. House of Representatives elections in 2010, simply counting how often a candidate’s name was mentioned on Twitter could predict the election outcome.
But here’s the crazy thing: it didn’t matter what was said about the candidate, whether positive, negative or neutral. What mattered was how often the candidate’s name was tweeted, whether by supporters, by foes, or just at random.
The researchers could predict the outcome in 404 out of 435 races, an astounding prediction rate of about 92.9 percent.
But there’s a catch… and it’s a big one.
What the paper and its co-author Indiana University sociologist Fabio Rojas failed to take into account (or at least acknowledge in all their press on the topic) was simple statistical probability. In 19 of the past 23 elections, 90 percent of incumbents ended up winning re-election.
So the researchers improved on our existing prediction baseline, which is simply betting on the incumbent, by a little under 3 percentage points. Not exactly newsworthy when put into that sort of perspective.
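To make that comparison concrete, here is a rough back-of-the-envelope sketch in Python. It uses the figures quoted above (404 correct calls out of 435 races) and assumes, purely for illustration, that a naive "always pick the incumbent" rule is right about 90 percent of the time. The variable names are mine, not the paper's, and the sketch ignores open seats and other details a real analysis would need to handle.

```python
# Figures quoted in this article: 404 correct calls out of 435 House races.
correct_calls = 404
total_races = 435

# Assumption for illustration: a naive "always pick the incumbent" rule
# is right about 90 percent of the time.
baseline_accuracy = 0.90

# Accuracy of the Twitter-mention model as a fraction of all races.
twitter_accuracy = correct_calls / total_races

# Gain over the naive baseline, in percentage points.
improvement_points = (twitter_accuracy - baseline_accuracy) * 100

# Races the naive rule would be expected to call correctly under the
# same assumption, and how many more the Twitter model got right.
baseline_correct = baseline_accuracy * total_races
extra_races = correct_calls - baseline_correct

print(f"Twitter model accuracy: {twitter_accuracy:.1%}")
print(f"Improvement over baseline: {improvement_points:.1f} percentage points")
print(f"Extra races called correctly: {extra_races:.1f}")
```

Under that assumption, the gain works out to just under 3 percentage points, or roughly a dozen races out of 435.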
Incumbents, with their existing name recognition in their districts and their well-oiled campaign fundraising machines, usually have little difficulty getting re-elected.
I’m not the first to come up with this alternative explanation for the researchers’ results, as Stuart Rothenberg noted:
Few reputable political scientists will accept Rojas’s assertion that, “New research in computer science, sociology and political science shows that data extracted from social media platforms yield accurate measurement of public opinion.”
“No one I know is saying anything close to this,” Michael S. Lewis-Beck, the F. Wendell Miller Distinguished Professor of Political Science at the University of Iowa and a well-known authority on both American politics and election forecasting, wrote in an email to me this week.
It’s kind of disturbing that Rojas, a social scientist, doesn’t appear to even discuss alternative explanations for his findings in his article (linked below). Scientists have the responsibility to present their findings to the public in a way that also discloses the limitations of their research, which includes mentioning why these preliminary results may not hold up when tested in more hotly contested or non-national political races.
The real test of this prediction system will be when it’s used in races where incumbency isn’t such a reliable predictor. Perhaps the Senate, or more local races (such as governorships). This may be good research, but where’s the temperance in the discussion of the results? Where’s the sanity check about the rationality of the findings? Where’s the follow-up study demonstrating that these results actually generalize to other elections?
Until then, Twitter might be a quick way to gauge the temperature of an election. Or it may not.
But I think the last thing Twitter wants is to be known as a social media outlet where the content of what you post doesn’t really matter.
Read Fabio Rojas’ article: How Twitter can help predict an election
Read Stu Rothenberg’s rebuttal: Twitter Can’t Yet Predict Elections