The recent Census completed by our Australian readers attracted more publicity than usual because of problems with its online implementation. Questions were also raised beforehand about how the data would be used, including that it would be “integrated” with other data and used “longitudinally”. It is worth considering what these terms mean and how they affect us.
The Census itself collects only a limited amount of data on each person – primarily their age, gender, family status, education and work. The power of the Census comes not from the depth of the data about each person, but from its breadth – everyone is expected to respond. Statisticians are constantly exploring ways of providing better breadth, and combining the Census with other data is one way of doing this. The terms “data integration” and “data linkage” refer to achieving this by matching individual records in one data set to records in another, often using identifying information. This raises both technical and ethical issues, but the statistical benefits can be large enough to make resolving those issues worthwhile.
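For readers who like to see the mechanics, the matching step can be sketched in a few lines of Python. This is only an illustration of exact matching on identifying fields, not a description of how any statistical agency actually links records. The two data sets, the field names and the choice of name, date of birth and postcode as the matching key are all invented for the example.

```python
# Hypothetical records: every name, field and value here is invented.
census = [
    {"name": "Ana Lee", "dob": "1980-02-14", "postcode": "2600", "employed": True},
    {"name": "Bo Chan", "dob": "1975-07-01", "postcode": "3000", "employed": False},
    {"name": "Cy Jones", "dob": "1990-11-30", "postcode": "4000", "employed": True},
]
health = [
    {"name": "ana  LEE", "dob": "1980-02-14", "postcode": "2600", "gp_visits": 3},
    {"name": "Bo Chan", "dob": "1975-07-01", "postcode": "3000", "gp_visits": 7},
    {"name": "Di Park", "dob": "1966-05-09", "postcode": "5000", "gp_visits": 1},
]


def link_key(record):
    """Build a matching key from identifying fields, ignoring case and extra spaces."""
    name = " ".join(record["name"].lower().split())
    return (name, record["dob"], record["postcode"])


def link(left, right):
    """Exact linkage: pair each left record with the right record sharing its key."""
    index = {link_key(r): r for r in right}
    linked, unlinked = [], []
    for rec in left:
        match = index.get(link_key(rec))
        if match is None:
            unlinked.append(rec)
        else:
            # Keep only the analysis variables; the identifiers are dropped.
            linked.append({"employed": rec["employed"], "gp_visits": match["gp_visits"]})
    return linked, unlinked


linked, unlinked = link(census, health)
```

In this toy example two of the three Census records find a partner and one does not. The linked output carries no names or birth dates, which hints at one way the ethical concerns can be eased, while the unmatched record is a reminder that linkage is rarely complete.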
Traditionally, Census data for an individual or household has not been connected from one Census to the next and has been available only for “cross-sectional” analysis, so changes can be measured only at an aggregate level. By linking the records of individuals between Censuses, however, a “longitudinal” data set is created, giving measures of change at the individual level and not just in population levels. It is then possible, for example, not just to measure how employment rates have changed but also to break this change down into people moving out of employment and others moving into employment.
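The employment example can be made concrete with a small Python sketch. It assumes the records have already been linked, so that each person carries the same identifier in both Censuses. The six people and their employment statuses are invented for illustration.

```python
from collections import Counter

# Hypothetical linked data: person id -> employed? at each Census.
census_a = {1: True, 2: True, 3: False, 4: False, 5: True, 6: False}
census_b = {1: True, 2: False, 3: True, 4: False, 5: True, 6: True}


def gross_flows(before, after):
    """Count people in each (status before, status after) combination."""
    flows = Counter()
    # Only people present in both Censuses can contribute to a flow.
    for person in before.keys() & after.keys():
        flows[(before[person], after[person])] += 1
    return flows


flows = gross_flows(census_a, census_b)
moved_out = flows[(True, False)]   # employed before, not employed after
moved_in = flows[(False, True)]    # not employed before, employed after
net_change = moved_in - moved_out

# The cross-sectional view sees only the two totals.
employed_a = sum(census_a.values())
employed_b = sum(census_b.values())
```

A cross-sectional comparison would report only that the number employed rose from three to four. The linked data shows that this net gain of one is made up of two people moving into employment and one moving out.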