As the drama surrounding the US election starts to wind down, attention turns to the pre-election forecasts. They became a point of interest as the results started to come in from every state. So, did the election polls really get it wrong, again?
Who forecasts the US election?
When it comes to US political forecasting, there are two main models: the FiveThirtyEight model led by Nate Silver, and the POTUS model led by Andrew Gelman and G. Elliott Morris for The Economist. FiveThirtyEight predicted a Biden win with 89% probability; POTUS put it at 97%.
I’d never followed a US election before. Watching at home in Australia, I hadn’t realised how much comes into play in an American presidential election. Between the electoral college voting system and the state-by-state differences in vote counting, it felt more like a media spectacle than a political movement. As the initial states started reporting their results, a Biden win looked all but impossible. Yet, at the end of it all, Biden came through with the help of postal votes.
How can we apply data science to the US election?
It’s tempting to grade forecasters on a simple pass/fail basis, judging them right or wrong on the balance of probability alone, but that throws away the uncertainty a probabilistic forecast is designed to express. For anyone interested in learning more about interpreting probability and understanding uncertainty in forecasts, look up Andrew Gelman. His blog is a masterclass in forecast calibration and statistical reasoning from a Bayesian perspective.
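To make the contrast concrete, here is a minimal Python sketch (not either outlet's actual methodology) that grades the two headline probabilities quoted above both ways: as a crude pass/fail verdict and with a proper scoring rule, the Brier score.

```python
# Minimal sketch: grading probabilistic forecasts with a proper scoring rule
# rather than a pass/fail check. Probabilities are the headline figures
# quoted above; outcome = 1 means the forecast's favoured candidate won.

forecasts = {
    "FiveThirtyEight": 0.89,          # stated P(Biden win)
    "The Economist (POTUS)": 0.97,
}
outcome = 1  # Biden won

for name, p in forecasts.items():
    called_correctly = (p > 0.5) == bool(outcome)   # the crude pass/fail view
    brier = (p - outcome) ** 2                      # proper score: lower is better
    verdict = "pass" if called_correctly else "fail"
    print(f"{name}: pass/fail = {verdict}, Brier score = {brier:.4f}")
```

On a single outcome both models simply "pass" and the lower Brier score goes to the more confident forecast; calibration only becomes meaningful when many forecasts are scored together, such as all of the state-level calls.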
Are election forecasts reliable?
The post-election analyses from both FiveThirtyEight and The Economist reveal something we all already know but choose to ignore: forecasts are biased. The same was true of the 2016 election.
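Bias here means a systematic lean in one direction across states, not just random error. The sketch below uses invented state margins, purely for illustration, to show how a post-election analysis might separate the two.

```python
import numpy as np

# Invented forecast vs actual two-party margins (Dem minus Rep, in points)
# for a handful of states. Illustrative numbers only, not either model's output.
forecast_margin = np.array([8.5, 1.2, -1.0, 2.5, 4.0])
actual_margin   = np.array([7.1, -0.3, -3.9, 1.2, 2.8])

signed_error = forecast_margin - actual_margin
print(f"Mean signed error (bias): {signed_error.mean():+.2f} points")        # systematic lean
print(f"Mean absolute error:      {np.abs(signed_error).mean():.2f} points")  # overall miss
```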
The post-analysis of results is crucial for future forecasting. In industry applications of data science this step is often glossed over; models should be tested not only out-of-sample (a sketch of such a check follows the list below) but also against their most fundamental elements, including:
- model specification;
- data input; and
- interpretability for the end-user.
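As a rough illustration of the out-of-sample point, the sketch below backtests a deliberately naive "model" (a three-week rolling average) on invented weekly poll margins, so that every prediction is scored only on data the model has not yet seen.

```python
import numpy as np

# Invented weekly national poll margins (Dem minus Rep, in points),
# used purely to illustrate the mechanics of an out-of-sample check.
poll_margin = np.array([6.1, 6.4, 7.0, 6.8, 7.3, 7.9, 8.1, 7.6, 7.2, 6.9])

window = 3  # the "model": predict next week as the mean of the last 3 weeks
errors = []
for t in range(window, len(poll_margin)):
    prediction = poll_margin[t - window:t].mean()    # fit on the past only
    errors.append(abs(prediction - poll_margin[t]))  # score on unseen data

print(f"Mean out-of-sample absolute error: {np.mean(errors):.2f} points")
```

The same loop applies to each of the three elements above: change the specification or the input data, re-run the backtest, and check whether the out-of-sample error actually improves.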
Final thoughts
In my experience, it is the reviewing of these three fundamental elements that separates models from one another: those that assess the fundamentals and those that don’t. Models tested against the fundamentals are models that can be built upon – a living forecast, if you will. Living forecasts are built, re-built, scrapped and changed again, and the results are then presented in a way that people understand and therefore care about. It will be interesting to see what forecasters learn, if anything, from the 2020 election. Will future elections use living forecast models to produce a less biased, more accurate prediction of the winner? We will have to wait another four years to find out.
James is our Principal Data Scientist, working on Modo25’s technology platform, BOSCO™. Want to find out how you stack up against your competitors, or plan and predict where to spend your marketing budget? Ask BOSCO™.