- Until about ten years ago most polls used telephone samples, based on the assumption that most households had a landline telephone. So-called random digit dialling overcame the problem of unlisted numbers, and because numbers mapped to geographical areas, a survey could be readily targeted. Today this is no longer a workable assumption, with around 40% of adults being “mobile only” and many people having multiple phones.
- Response rates to surveys have declined, increasing the biases that arise when certain groups of people avoid surveys. This could be due to a number of reasons, including concerns about security and confidentiality, and survey fatigue among people inundated with spam who no longer want to participate.
- Technology has led to new survey methods such as internet panels and robocalls. Relatively little is known about the accuracy of these methods.
- Cost competition tends to drive polling companies to use whatever is currently the most economical method of carrying out a survey. This naturally pushes companies toward similar methodologies and hence similar biases.
Was there “herding”?
The suggestion that the polls were too similar in their predictions, a phenomenon often termed herding, is particularly serious. It implies that some companies purposely skewed their results to match the rest. For example, Lonergan Research publicly admitted that it withheld its results because its prediction of a Liberal victory differed significantly from the polls published by other market researchers.
The evidence for herding is based upon the degree of variation expected when a survey is conducted using simple random sampling and analysed with simple estimation methods. Indeed, there is some reason to believe the polling companies do use simple sampling and simple estimates, since the published “margin of error” for many surveys appears to be based upon this.
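The conventional published margin of error is the half-width of a 95% confidence interval for a proportion under simple random sampling. As a minimal sketch (the 1,000-respondent poll and 50/50 split below are illustrative, not figures from the source):

```python
import math

def srs_margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a
    proportion p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A typical poll of 1,000 respondents reporting a 50/50 split:
moe = srs_margin_of_error(0.5, 1000)
print(f"margin of error: +/-{moe:.1%}")  # about +/-3.1%
```

If the polls actually used more efficient designs than simple random sampling, their true sampling variability would be smaller than this formula suggests, which matters for judging whether the polls were "too similar".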
Over the years, better methods of sampling and estimation have been developed, including:
- Stratified sampling - The population is partitioned into sub-populations (strata) and a different sampling rate is applied to each; this well-known method can improve estimates.
For example, it might be reasonable to assume that party allegiance varies from state to state, so treating the states as strata and then having different sampling rates in each state may improve sampling efficiency.
- Scaling - Simple estimates might calculate the proportion in the sample and simply scale that up to the population (or the proportion in each stratum sample, in the case of stratified sampling). More sophisticated scaling might instead be based upon respondents’ past voting behaviour and the population results at the last election.
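The stratified estimate described above can be sketched in a few lines. The states, population shares and counts below are hypothetical numbers chosen only to show the mechanics: each stratum's sample proportion is weighted by its population share, correcting for unequal sampling rates across strata.

```python
# Hypothetical state-level data (illustrative only):
# state: (population_share, sample_size, sample_yes)
strata = {
    "NSW":   (0.32, 400, 208),
    "VIC":   (0.26, 250, 120),
    "QLD":   (0.20, 200,  92),
    "Other": (0.22, 150,  81),
}

# Naive estimate: pool the whole sample, ignoring the strata.
total_n = sum(n for _, n, _ in strata.values())
total_yes = sum(y for _, _, y in strata.values())
naive = total_yes / total_n

# Stratified estimate: weight each stratum's sample proportion
# by that stratum's share of the population.
stratified = sum(share * (y / n) for share, n, y in strata.values())

print(f"naive: {naive:.3f}, stratified: {stratified:.3f}")
```

The two estimates differ whenever sampling rates vary across strata and the strata genuinely differ in opinion; the same weighting idea underlies scaling by past voting behaviour.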
We do not know whether the polls used such methods, which would legitimately reduce the variation between them, so the purported herding cannot be said to be proved. It is unclear what methods the polling companies used at the recent election, and without greater public disclosure of the methods employed, it is difficult to assess in detail where things may have gone awry.
What does this tell us?
The polling companies all do general market and social research, with polling being only the most visible part of their work. There is rightly concern that the flaws exposed in their polling affect their other surveys as well, but without an election result as the corresponding truth to reveal the problems.
Anyone relying upon survey data should consider how the data was collected. The way the data is to be used will determine what aspects of the data collection are most important.
Data Analysis Australia has always taken a conservative approach to surveys, aiming to achieve good sampling and good estimation. This has led to our work focusing on projects where quality is paramount, such as in legal matters. This does not mean we do not use the latest technology; rather, we do not let technology dominate at the expense of quality.