At Data Analysis Australia, one of the most common queries we receive from current and prospective clients is "what size sample do I need?". A common misconception is that 400 is the magic number. However, it is not that simple: one size does not fit all applications. In fact, 400 is rarely the right answer. Not surprisingly, the answer depends on the details of the question, so understanding the question is the best starting point.
Survey results are often used to find an answer to a question or to help make an informed decision. Sometimes this is expressed in terms of estimating a number such as the proportion of shoppers who might buy a product, the proportion of customers who are satisfied with a service, the average turnover of companies or the gold grade in a deposit. Major decisions can be based on such survey estimates, and reliable decisions clearly need reliable estimates.
However, any results that come from a survey will be subject to some degree of error. This error can be separated into two types:
- Sampling error. This type of error is caused by surveying only some of the population rather than all of it. If you repeated the survey but randomly chose a different group of units to include in the sample, you would expect to receive a slightly different answer simply by virtue of surveying these different units. Both answers are equally valid, but both carry a degree of uncertainty.
There is a well-developed statistical theory that helps us understand this type of error. The error can usually be quantified, often before the survey is run, which is what allows the theory to guide the choice of the most appropriate sample size and sampling design.
- Non-sampling error. This refers to all other sources of error. Examples include leading questions, communication error, ambiguous questions, data entry errors, poorly defined populations, non-response and deliberately false answers.
Usually this type of error can't be quantified, but steps can be taken to minimise its effects. Having a good and clear questionnaire is the first step. It is good practice to have questionnaires tested before the survey begins, so that these sources of error can be identified and fixed.
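Returning to sampling error, the following is a minimal sketch of how it can be quantified before a survey is run. It assumes simple random sampling from a large population, a proportion as the quantity of interest, a 95% confidence level and the usual normal approximation. None of these assumptions come from the text above, and the function names are illustrative only. A real design with stratification, clustering or a small population would need different calculations.

```python
import math
import random

Z_95 = 1.96  # normal critical value for 95% confidence (assumed level)


def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate half-width of the confidence interval for a proportion.

    Assumes simple random sampling from a large population.
    p = 0.5 is the worst case (widest interval).
    """
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(target_margin, p=0.5, z=Z_95):
    """Smallest n whose margin of error does not exceed target_margin."""
    return math.ceil(z ** 2 * p * (1 - p) / target_margin ** 2)


def simulate_survey(n, true_p, rng):
    """Estimate a proportion from n independent yes/no responses."""
    hits = sum(1 for _ in range(n) if rng.random() < true_p)
    return hits / n


if __name__ == "__main__":
    # Two surveys of the same hypothetical population (true proportion 30%)
    # give different estimates purely because different units were sampled.
    rng = random.Random(2024)
    first = simulate_survey(400, 0.30, rng)
    second = simulate_survey(400, 0.30, rng)
    print(f"survey 1: {first:.3f}, survey 2: {second:.3f}")

    print(f"margin of error at n=400: {margin_of_error(400):.3f}")
    print(f"n needed for a 3-point margin: {required_sample_size(0.03)}")
```

Under these assumptions, a sample of 400 gives a worst-case margin of about 4.9 percentage points, while a 3-point margin needs 1,068 responses. The required size moves with the precision that the question demands, which is one reason a single number cannot suit every survey.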
It is important to consider both types of error when designing a survey. Any benefit achieved from reducing the size of one type of error can very easily be wasted if the other type of error is larger.
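To illustrate how a gain on one type of error can be swamped by the other, the sketch below combines the two using the standard decomposition of mean squared error into squared bias plus sampling variance. Treating non-sampling error as a known, fixed bias is purely an assumption made for illustration, since in practice it usually cannot be quantified. The 5-point bias, the sample sizes and the function names are hypothetical.

```python
import math


def sampling_standard_error(n, p=0.5):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


def root_mean_squared_error(n, bias, p=0.5):
    """Total error when a fixed bias is added to the sampling error.

    Uses MSE = bias^2 + variance; 'bias' stands in for non-sampling
    error and is an illustrative assumption, not a measured quantity.
    """
    return math.sqrt(bias ** 2 + sampling_standard_error(n, p) ** 2)


if __name__ == "__main__":
    hypothetical_bias = 0.05  # e.g. a leading question shifting answers by 5 points
    for n in (400, 1600):
        se = sampling_standard_error(n)
        rmse = root_mean_squared_error(n, hypothetical_bias)
        print(f"n={n}: sampling SE={se:.4f}, total RMSE={rmse:.4f}")
```

In this hypothetical case, quadrupling the sample from 400 to 1,600 halves the sampling standard error (from 0.025 to 0.0125) but reduces the total error by less than 10%, because the assumed bias dominates.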