The Perth And Regions Travel Survey (PARTS) was a four-year study of day-to-day travel patterns of residents in the wider Perth Metropolitan Area. PARTS was conducted by Data Analysis Australia on behalf of the Department for Planning and Infrastructure and Main Roads WA. The survey has implications far beyond transport planning – the depth of information collected can be used for strategic planning in land use, public and private transport networks and locational planning of facilities and services for industries such as health, justice and recreation.
A major challenge in designing the study was how to collect geographically representative, high-quality data effectively and efficiently from households across the wider Perth metropolitan region over four years.
The Data Analysis Australia Approach
We saw this problem as a set of smaller hurdles:
- Compiling a list of all residential addresses or households in the study area from which to draw the sample.
- Testing the procedures and quality of the address list through a pilot survey.
- Designing a sampling regime that ensured geographic coverage across the study area over the four-year study period.
- Optimising data quality through intensive follow-up of respondents and a comprehensive series of manual and automated logic checks throughout data entry and processing.
Compiling a List of Addresses
PARTS was required to collect information from a representative sample of households from the Perth Metropolitan Region and the Shires of Mandurah and Murray. If an exhaustive list of all households within the region had existed, selecting such a sample would have been a simple process. However, although this is not widely realised, no such list is publicly available. As a result, there were three options:
- Select the sample in a way that did not make use of a list;
- Select the sample from an incomplete list (and accept the biases that would arise from this); or
- Develop a sampling list.
Data Analysis Australia chose the last of these options. Our approach was to create a sampling list that was initially too large (because it included both residential and non-residential records) and then refine it into an accurate sampling list through the right operational procedures. This was possible because, whilst complete lists of households do not exist, complete lists of all land parcels do. The difficulty is that these lists do not reliably identify each land parcel as residential or non-residential. The key to the approach was that, instead of retaining only land parcels specifically identified as residential, we also retained land parcels that were potentially residential and removed them only if they were later proven to be non-residential.
Our initial list came from the Western Australian Property Street Address (PSA) file, a government database of all land holdings. As the land title system used in Western Australia ensures that this database is complete in its coverage, and every household must be on some piece of land, it follows that all residential parcels are included on the list, along with all non-residential parcels. This database was then augmented with information from the Water Corporation's Land Use Codes file, to identify land parcels as residential or not wherever possible, and with data on occupied private dwellings from the Australian Bureau of Statistics' 2001 Census of Population and Housing.
This approach minimised wasted effort associated with attempting to sample non-residential addresses, as land parcels that were indisputably non-residential (such as parks and shopping centres) were excluded from the sampling list.
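The retention rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for PARTS: the `Parcel` type, the `parcel_id` and `land_use` fields and the two land-use labels are hypothetical, and the real Land Use Codes are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical land-use labels, invented for this sketch.
RESIDENTIAL = "residential"
NON_RESIDENTIAL = "non_residential"


@dataclass
class Parcel:
    parcel_id: str
    land_use: Optional[str]  # None when no land-use code could be matched


def build_frame(parcels):
    """Keep every parcel that has not been proven non-residential."""
    return [p for p in parcels if p.land_use != NON_RESIDENTIAL]


parcels = [
    Parcel("A1", RESIDENTIAL),
    Parcel("A2", NON_RESIDENTIAL),  # e.g. a park or shopping centre
    Parcel("A3", None),             # unclassified, so kept as potentially residential
]
frame = build_frame(parcels)
print([p.parcel_id for p in frame])
```

The point of the rule is its direction: a parcel is dropped only on positive evidence that it is non-residential, so an unclassified parcel stays on the list and coverage is not lost.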
The data augmentation was also used to identify where there were multiple dwellings on a single land parcel. Extra records were added, where necessary, so that each dwelling on a single land parcel had a unique record. For example, if the original PSA file contained only a single record for a land parcel that was identified by the Land Use Codes as containing a block of twenty flats, an additional nineteen records would have been added. This process helped to ensure that flats, units and apartments were not under-sampled, which can be a problem in household surveys.
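The record expansion can be illustrated with a short Python sketch. The input layout (pairs of parcel identifier and dwelling count), the field names and the treatment of a missing count as a single potential dwelling are all assumptions made for the example, not details taken from the PSA or Land Use Codes files.

```python
def expand_dwellings(parcels):
    """Return one record per dwelling from (parcel_id, dwelling_count) pairs."""
    records = []
    for parcel_id, dwelling_count in parcels:
        # Assumption: a missing or zero count is treated as one potential dwelling.
        count = max(1, dwelling_count or 1)
        for unit in range(1, count + 1):
            records.append({"parcel_id": parcel_id, "dwelling_no": unit})
    return records


# Hypothetical parcels: a single house, a block of twenty flats, an unknown count.
psa = [("P100", 1), ("P200", 20), ("P300", None)]
frame = expand_dwellings(psa)
print(len(frame))
```

With one record per dwelling, a simple random draw from the list gives each flat the same chance of selection as a detached house, rather than giving the whole block the chance of a single dwelling.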
The pilot study conducted before the main survey showed that a mailout/mailback survey would not produce results of acceptably high quality. Response rates were lower than anticipated, and the incompleteness of some addresses in the sampling frame made it difficult or even impossible for Australia Post to deliver the questionnaire packs and reminder letters correctly and consistently. The PSA file was simply not accurate enough for a mailout, since the detail required for legally defining land (which does not use street addresses) differs from that required for correct delivery of mail. However, Data Analysis Australia still needed to use the file, since it was the only sampling frame that could provide 100% coverage.
The solution was to change the way the addresses were used. A personal delivery/collection methodology was chosen, using our own fieldwork staff and rigorous rules for dealing with incomplete or incorrect addresses. Whilst the majority of households could be readily identified from the information available in the PSA file, a key advantage of personal delivery was that fieldworkers could be given extra information to help them locate the harder cases, such as households on corner blocks whose address could use either street name. In particular, maps based upon land boundary data were created for each fieldworker, highlighting the dwellings to be sampled, so that even if the address itself did not exist, the intended block of land could still be identified. These maps were critical in overcoming problems caused by imperfect or erroneous street addresses, and so in ensuring the reliability of the sampling.