Surveys for Measuring Change

Many surveys are one-off exercises that simply aim to measure something at a single point in time. However, sometimes the aim is to measure and understand change, which leads to far more complex survey design issues. Often the change is associated with some form of intervention or change in policy, but sometimes the aim is to measure a trend. The need to measure change typically leads to one of three very different approaches to conducting a survey:

  • Retrospective. Survey respondents are asked about their current and previous status or whether their status has changed. In the case of a survey measuring the effect of an intervention or policy change, the survey will be carried out after the intervention or policy change has taken place. The key aspect of this type of design is that the change is measured using a single survey.
  • Double cross-sectional. Two similar surveys are conducted - one before the intervention or policy change and one after it. Different samples are used for the two surveys.
  • Longitudinal. Two surveys are conducted - one before the intervention or policy change and one after it. The key difference from a double cross-sectional design is that the same sample is used for the two surveys.

Each of these approaches has strengths and weaknesses and the best choice will depend upon the context and purpose of the individual survey. For comparison purposes, we consider the situation of measuring change before and after an intervention throughout the remainder of this article.

Retrospective Designs

The greatest attraction of a retrospective design is usually cost. Only one survey is conducted and often it can be done relatively simply and hence at a low cost.

The great weakness is that it can sometimes be very misleading, depending upon what is being measured. For example, if a survey concerning a road safety intervention asked drivers "Do you drive more safely than you did a year ago?" then there is a tendency for the answer to be "yes", since people do not like to admit that they are less conscious of safety issues. Similarly, if the survey was measuring a change in attitudes, it may not be reasonable to expect that people can accurately remember, or be willing to say, what they thought a year ago. These problems severely limit how retrospective surveys are used.

So when can a retrospective survey work? Generally, a retrospective survey works best in situations where the item being measured is objective rather than subjective and where respondents are not required to recall information that they may have forgotten. Ideally, respondents may even have kept a record of their previous and current status so that both can be recorded with the same level of objectivity and accuracy. For example, with financial matters, this might mean referring to bills or statements.

Double Cross-Sectional Designs

Two separate surveys using similar or identical questionnaires and identical sampling methods can measure change very reliably, even if each survey has some biases. Biases that affect both surveys in the same way will largely cancel out when looking at the differences, provided that care is taken to keep the surveys as similar as possible.

This type of survey is usually substantially more expensive than a retrospective design and the impact on cost is often more serious than first understood. Not only is there the cost of forming two samples and conducting two surveys, but in order to measure changes with the same degree of reliability as obtaining a one-off measure at a single point in time, each of the surveys needs to be of a larger size.

The surveys need to be larger because of the way that standard errors combine when measuring a change: the variance of the difference between two independent estimates is the sum of their variances. The effect of this is that each survey will often need to be double the size of a single survey to achieve the same level of accuracy.

For example, suppose that to achieve an accuracy of 5% on one survey, a sample of size 1,000 is required. Then to measure a change with the same accuracy of 5%, each survey will need a sample of size 2,000 (that is, a total sample size of 4,000 spread across the two surveys).
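The doubling in this example can be checked with a short calculation. The sketch below is illustrative only: it assumes simple random sampling, two independent samples of equal size, the same standard deviation in both surveys and a 95% confidence multiplier of 1.96. The function names and default values are hypothetical and are not taken from any particular survey.

```python
import math

def margin_single(n, sd=1.0, z=1.96):
    """Margin of error for one survey estimate (a mean) from n respondents."""
    return z * sd / math.sqrt(n)

def margin_change(n_each, sd=1.0, z=1.96):
    """Margin of error for the difference between two independent surveys.

    Each survey has n_each respondents. The variances of the two estimates
    add, so the variance of the difference is 2 * sd**2 / n_each.
    """
    return z * sd * math.sqrt(2.0 / n_each)

n_single = 1000                      # size of the one-off survey in the example
target = margin_single(n_single)     # accuracy achieved by the one-off survey

# Two surveys of the original size: the change is less accurate than the target.
print(margin_change(n_single) / target)

# Two surveys of double the size: the change matches the target accuracy.
print(margin_change(2 * n_single) / target)
```

With two surveys of 1,000 the margin of error on the change is larger than the single-survey margin by a factor of the square root of two (roughly 1.41). Doubling each survey to 2,000 brings the ratio back to one.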

Longitudinal Designs

A longitudinal design also has two surveys but the second survey uses the same sample as the first, rather than a separate sample. This allows responses to be compared on an individual basis which can add a great deal of flexibility to the analysis. It may also allow the same degree of accuracy to be obtained with a much smaller sample size than that required in a double cross-sectional design. Because the sample is chosen before the intervention, this is sometimes called a prospective design, in contrast to the retrospective design discussed above.

There are two primary reasons why a longitudinal design might be worth doing:

  • It is possible to characterise the types of people who change their responses between the surveys, possibly due to the intervention. Importantly, changes in both directions can be observed, whereas in a double cross-sectional design only the overall change can be measured.
  • It can often lead to substantial savings in the sample size required. This is because there is often a positive association or correlation between the responses of an individual in the two surveys - when looking at the differences this common component cancels out. This sample size benefit is the greatest when there are many people who do not change.
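The sample size saving described in the second point can be sketched numerically. The code below assumes that responses have the same variance in both surveys, that the correlation between an individual's two responses is a known value rho, and that there is no attrition. Under those assumptions the variance of the mean change is 2 * sd**2 * (1 - rho) / n, compared with 2 * sd**2 / n for two independent samples. The function names are hypothetical.

```python
import math

def paired_change_variance(n, rho, sd=1.0):
    """Variance of the mean change when the same n people answer both surveys.

    rho is the assumed correlation between an individual's two responses.
    """
    return 2.0 * sd ** 2 * (1.0 - rho) / n

def panel_size_for_same_accuracy(n_cross_each, rho):
    """Longitudinal sample size giving the same accuracy on the change as a
    double cross-sectional design with n_cross_each respondents per survey."""
    if not -1.0 <= rho < 1.0:
        raise ValueError("rho must be at least -1 and less than 1")
    # Round before taking the ceiling to avoid floating-point overshoot.
    return math.ceil(round(n_cross_each * (1.0 - rho), 9))

# Compare against two independent surveys of 2,000 respondents each.
for rho in (0.0, 0.3, 0.5, 0.8):
    print(rho, panel_size_for_same_accuracy(2000, rho))
```

When rho is zero there is no saving, and the saving grows as rho approaches one, which is the case where many people do not change. Under these assumptions a correlation above 0.5 more than halves the required sample.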

These benefits do not come without a cost. Often it takes considerable effort (and expense) to track down the sample for the second survey, and it would usually be impossible to resurvey each and every respondent. The difficulties in tracking down respondents include people who have moved or are on holidays. Other respondents may simply have changed their mind about being in the survey and refuse to participate in the second survey. This 'attrition' of the sample can lead to biases in the results; however, there are statistical techniques that can be applied to reduce the effect of these biases.
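The article does not say which techniques are meant. One common option, shown here purely as an illustration, is a weighting-class adjustment: respondents who remain in the second survey are weighted up so that each group regains its share of the first-survey sample. The sketch assumes that, within a group, those who drop out resemble those who stay. The group labels and function name are hypothetical.

```python
from collections import Counter

def attrition_weights(groups, responded):
    """Weighting-class adjustment for attrition between two surveys.

    groups    -- group label of each first-survey respondent
    responded -- True if that respondent also took part in the second survey
    Returns one weight per respondent: 0.0 for those lost to attrition, and
    (group size in first survey) / (group size retained) for those who stayed.
    """
    totals = Counter(groups)
    retained = Counter(g for g, r in zip(groups, responded) if r)
    weights = []
    for g, r in zip(groups, responded):
        weights.append(totals[g] / retained[g] if r else 0.0)
    return weights

# Hypothetical example: renters drop out more often than owners.
groups = ["renter"] * 4 + ["owner"] * 4
responded = [True, True, False, False] + [True] * 4
weights = attrition_weights(groups, responded)
print(weights)
```

Provided every group keeps at least one respondent, the weights sum to the original sample size, so the weighted second-survey sample has the same group mix as the first. This does not remove bias from differences within a group between those who stay and those who leave.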

There is also a more subtle cost which impacts on the survey results. It is possible that the experience of the first survey may affect respondents' behaviour and thoughts, thus affecting their responses to the second survey. This may create biases in the survey results which are not easily removed.

Which Design is Best?

Overall there is no right or wrong design as the most appropriate design depends largely upon the context, including what the specific purposes of the research are. Data Analysis Australia has at various stages used all three designs, sometimes using a combination of the designs for the same research project.

Data Analysis Australia assisted in the design of the Household, Income and Labour Dynamics in Australia (HILDA) Survey which is conducted by the Melbourne Institute of Applied Economic and Social Research. This survey measures changes in households over time, giving more detail than is provided by regular labour force surveys that are essentially cross-sectional. A longitudinal design is used, tracking households over many years.

One of the purposes of the Perth and Regions Travel Survey (PARTS) is to measure changes in activity and travel patterns. Here one of the concerns is with measuring changes at the aggregate level over many years. The design is effectively cross-sectional, with continuous sampling over time.

However, when measuring the effect of a highly specific intervention due to the TravelSmart program, a modification to PARTS was introduced to incorporate a longitudinal design. In fact, not only was the sample of households the same before and after the TravelSmart intervention, but each household was also surveyed on the same day of the week each time and at the same time of year. This illustrates the central concept of many longitudinal designs - keeping everything as constant as possible except for the intervention so that its effects can be properly measured. The benefit in this survey was that the sample size could be reduced to less than half the size that would have been required for a double cross-sectional design.

A survey to investigate the effect of Application Service Provider (ASP) technology in schools used a mix of cross-sectional, longitudinal and retrospective designs. The retrospective components were interesting in two respects. First, there was a need to assess some changes that had taken place before the initial survey. This could only be done retrospectively since it was impossible to define the sample beforehand. Second, some retrospective questions were included in the second survey to ask about perceived changes since the first survey. These perceptions of change could then be contrasted with the changes recorded by the longitudinal component.