In a previous post, I discussed the possibility of forecasting election results through social media, particularly in countries with large territories and populations such as Mexico.
A couple of weeks have gone by since the election, and the dust has settled a little (although allegations of fraud and demands for the invalidation of the election results are still pending a court ruling that will come on September 6 at the latest). In the meantime, here is some preliminary data:
Here we can see that most polls over-estimated the difference between first and second place, predicting a gap roughly double the one in the official results. The only traditional pollster to come close to the result was María de las Heras, but her survey was commissioned by a marginal media outlet, whereas the others were used, according to critics, as a way to bombard voters with the perception of an inevitable victory for the candidate of the former state party PRI.
Indeed, if we average the last polling exercises by these 8 traditional pollsters, we see a difference of 14 points between first and second place. Now let’s take a look at the results from the Urna Abierta Project, which gathered a sample of more than 40,000 participants. The sample was then processed to make it representative by gender, region, income level and age, with adjustments for phenomena such as the corporatist vote and election turnout:
The difference in this poll is less than one percentage point, but here the winning candidate is the one from the left-wing PRD. Both candidates would get about 35% of the vote.
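To give an idea of what making a sample "representative" can involve, here is a minimal sketch of post-stratification weighting on a single variable. This is not the Urna Abierta methodology, which I have not seen in detail; the function name `poststratify`, the age groups, the population shares and the responses are all hypothetical, made up purely for illustration.

```python
from collections import defaultdict

def poststratify(responses, population_shares):
    """Reweight responses so each stratum counts in proportion to its
    share of the population rather than its share of the sample.

    responses: list of (stratum, choice) pairs
    population_shares: dict mapping stratum -> share of the population
    (both are hypothetical inputs for this sketch)
    """
    n = len(responses)
    sample_counts = defaultdict(int)
    for stratum, _ in responses:
        sample_counts[stratum] += 1

    weighted = defaultdict(float)
    for stratum, choice in responses:
        sample_share = sample_counts[stratum] / n
        # Over-represented strata get a weight below 1, under-represented above 1.
        weighted[choice] += population_shares[stratum] / sample_share

    total = sum(weighted.values())
    return {choice: w / total for choice, w in weighted.items()}

# Made-up toy sample: young respondents are over-represented (8 of 10),
# as one might expect in an online poll.
toy_responses = (
    [("young", "A")] * 6 + [("young", "B")] * 2 + [("old", "B")] * 2
)
toy_population = {"young": 0.5, "old": 0.5}  # hypothetical shares

estimate = poststratify(toy_responses, toy_population)
print(estimate)
```

In this toy case candidate A leads 60 to 40 in the raw sample but trails after reweighting, which shows how much the final figure depends on the weights chosen.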
Now let’s take a look at the official result:
The official tally gives the PRI candidate a victory with 38.21% of the vote, followed by the leftist candidate with 31.59%, a gap of 6.62 points. There is a suspiciously large difference between what nearly all mainstream pollsters told the public and the actual election result. How could they get it so wrong? Unfortunately, most of them do not share their methodology, so their errors cannot be investigated.
As a benchmark, the last seven polls conducted before the French presidential election unanimously predicted the victory of François Hollande over Nicolas Sarkozy, although on average they over-estimated the difference between them by 2 percentage points. Compare that to the Mexican case, where the 14-point average lead is more than 7 points wider than the official margin, and talk about a margin of error!
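For anyone who wants to check the arithmetic, the comparison can be sketched in a few lines. The figures are the ones quoted in this post; the variable names are my own.

```python
# Official tally quoted in this post (percent of the vote).
official_pri = 38.21
official_prd = 31.59

# Average lead across the last polls of the 8 traditional pollsters (points).
poll_avg_gap = 14.0

# Average over-estimate of the Hollande-Sarkozy gap in the last seven French polls.
french_overestimate = 2.0

official_gap = round(official_pri - official_prd, 2)
mexican_overestimate = round(poll_avg_gap - official_gap, 2)
ratio = poll_avg_gap / official_gap

print(f"Official gap: {official_gap} points")
print(f"Mexican polls over-estimated the gap by {mexican_overestimate} points")
print(f"Predicted gap is {ratio:.2f} times the official gap")
print(f"French polls over-estimated the gap by {french_overestimate} points")
```

Note that the like-for-like comparison with the French case is the over-estimate of the gap, not the predicted gap itself.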
What we can ask is: why did the Facebook poll fail to predict both the winner and the difference between first and second place? I have e-mailed the academic behind the project for his take on this failure and will update this post if he replies.
What is curious is that almost everyone managed to get the voting intention for third place right, which makes us wonder whether political sympathies or economic interests played an undue role in the presentation of polling results to the public.