- Until about ten years ago most polls used telephone samples, based on the assumption that most households had a landline telephone. So-called random digit dialling overcame the problem of unlisted numbers, and because numbers mapped to geographical areas, a survey could be readily targeted. Today this is no longer a workable assumption, with around 40% of adults being “mobile only” and many people having multiple phones.
- Response rates to surveys have declined, increasing the biases that arise when certain groups of people avoid surveys. This could be due to a number of reasons, including concerns about security and confidentiality, and survey fatigue among people inundated with spam who no longer want to participate.
- Technology has led to new survey methods such as internet panels and robocalls. Relatively little is known about the accuracy of these methods.
- Cost competition tends to drive polling companies to use whatever is currently the most economical method of carrying out a survey. This naturally pushes companies toward similar methodologies and hence similar biases.
Was there “herding”?
The suggestion that the polls were too similar in their predictions, a phenomenon often termed herding, is particularly serious. It implies that some companies purposely skewed their results to match the rest. For example, Lonergan Research publicly admitted that it withheld its results because its prediction of a Liberal victory differed significantly from the polls published by other market researchers.
The evidence for herding is based upon the degree of variation expected when a survey is conducted using simple random sampling and analysed with simple estimation methods. Indeed, there is some reason to believe the polling companies do use simple sampling and simple estimates, since the published “margin of error” for many surveys appears to be based upon this.
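The conventional published margin of error is the half-width of a 95% confidence interval for a proportion under simple random sampling. As a minimal sketch (the 1,000-respondent poll and 50/50 split below are illustrative, not figures from the source):

```python
import math

def srs_margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a
    proportion p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A typical poll of 1,000 respondents reporting a 50/50 split:
moe = srs_margin_of_error(0.5, 1000)
print(f"margin of error: +/-{moe:.1%}")  # about +/-3.1%
```

If the polls actually used more efficient designs than simple random sampling, their true sampling variability would be smaller than this formula suggests, which matters for judging whether the polls were "too similar".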
Over the years, better methods of sampling and estimation have been developed, including:
- Stratified sampling - The population is partitioned into sub-populations (strata) and a different sampling rate is applied to each; this well-known method can improve estimates.
For example, it might be reasonable to assume that party allegiance varies from state to state, so treating the states as strata and then having different sampling rates in each state may improve sampling efficiency.
- Scaling - Simple estimates might calculate the proportion in the sample and simply scale that up to the population (or the proportion in each stratum sample, in the case of stratified sampling). More sophisticated scaling might instead be based upon respondents’ past voting behaviour and the population results at the last election.
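The stratified estimate described above can be sketched in a few lines. The states, population shares and counts below are hypothetical numbers chosen only to show the mechanics: each stratum's sample proportion is weighted by its population share, correcting for unequal sampling rates across strata.

```python
# Hypothetical state-level data (illustrative only):
# state: (population_share, sample_size, sample_yes)
strata = {
    "NSW":   (0.32, 400, 208),
    "VIC":   (0.26, 250, 120),
    "QLD":   (0.20, 200,  92),
    "Other": (0.22, 150,  81),
}

# Naive estimate: pool the whole sample, ignoring the strata.
total_n = sum(n for _, n, _ in strata.values())
total_yes = sum(y for _, _, y in strata.values())
naive = total_yes / total_n

# Stratified estimate: weight each stratum's sample proportion
# by that stratum's share of the population.
stratified = sum(share * (y / n) for share, n, y in strata.values())

print(f"naive: {naive:.3f}, stratified: {stratified:.3f}")
```

The two estimates differ whenever sampling rates vary across strata and the strata genuinely differ in opinion; the same weighting idea underlies scaling by past voting behaviour.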
We do not know whether the polls used such methods, which would legitimately reduce the variation between them, so the purported herding cannot be said to be proved. It is unclear what methods the polling companies used at the recent election, and without greater public disclosure of the methods employed, it is difficult to assess in detail where things may have gone awry.
What does this tell us?
The polling companies all do general market and social research, with polling being only the most visible part of their work. There is rightly concern that the flaws exposed in their polling affect their other surveys as well, but without an election result as the corresponding truth to reveal the problems.
Anyone relying upon survey data should consider how the data was collected. The way the data is to be used will determine what aspects of the data collection are most important.
Data Analysis Australia has always taken a conservative approach to surveys, aiming to achieve good sampling and good estimation. This has led to our work focusing on projects where quality is paramount, such as in legal matters. This does not mean we do not use the latest technology; rather, we do not let technology dominate at the expense of quality.