Data Analysis Australia
STRATEGIC INFORMATION CONSULTANTS
Copyright © 2013 Data Analysis Australia
Data Analysis Australia has created a series of short papers titled Analytical Ideas, which discuss developments that are relevant to our clients and provide an insight into our approach to solving problems. A range of topics has already been covered and new papers will be added on a regular basis.
Sampling and Weighting – A Better Practice Guide for Survey Practitioners - As part of its commitment to encouraging the use of scientific methods, the Australian Market and Social Research Society (AMSRS) recently commissioned Data Analysis Australia to develop the Society's newest professional development resource, Sampling Design and Weighting for Australian Household/Consumer and Business Surveys - A Better Practice Guide. The Guide covers formal sampling design concepts and terminology, sample sizes, methods of selecting a sample, and weighting of sample data to reflect the full population, with a focus on the practical implementation of surveys.
Linear Mixed Models, REML and the Utilities - Utilities such as electricity, gas and water collect huge amounts of customer billing and consumption data. Data Analysis Australia has been involved in many projects to help the utilities understand usage patterns and trends over time, and to predict future requirements. Large amounts of data, many variables and correlation structure within the data make this a complex task. One way to allow for correlations within the data is to incorporate random terms into the statistical models, resulting in linear mixed effects models instead of fixed effects models. A preferred estimation method for fitting linear mixed models is Residual Maximum Likelihood (REML), sometimes called Restricted Maximum Likelihood. Data Analysis Australia has the expertise to recognise where linear mixed models should be used and to carry out and interpret REML analyses, leading to improved statistical models and better analytical results for clients. See the Analytical Ideas article for a description of what linear mixed models are and how their use can benefit our clients.
Experimental Design and the Football Draw - When decisions require comparisons - for example between sporting teams or players, medical treatments, farming practices, industrial processes, recipe modifications, routes to work, or fuel types - a controlled experiment is often the ideal way to obtain reliable data. But how can we ensure that all are compared fairly, without favour? How can we have confidence that, if repeated, we would get a similar result? The specialist field of experimental design has techniques to address these issues, and staff at Data Analysis Australia have training and experience in applying them. The Analytical Ideas article describes some of the relevant concepts.
Excel Tool Design - An important element of the work done at Data Analysis Australia is translating the resulting information into a form that is useful to the client. This is particularly important with large or complex data or where sophisticated statistical methodologies have been used. Although written reports are the most common format, we increasingly find that an Excel tool is an effective means of communicating and translating analysis results. Custom-designed, these tools can be readily understood and used by both the client and their target audience. Most importantly, the tools allow our clients to interact with the results and change the parameters at the click of a button, while out of view sits the statistical methodology that Data Analysis Australia has employed. The full Analytical Ideas article describes some of the tools designed for our clients, what sits behind these tools, and how they provide efficient and useful results.
Precision in Recording and Reporting Data - To what level of precision should data be recorded, analysed and reported? It has long been common practice to round observations at the time of recording, but this approach can lead to the loss of valuable information. On the other hand, reporting figures to multiple decimal places can give a false impression of the accuracy of the measurement. The problem is that the recording and the reporting of data have become confused. The two need to be carefully separated, with a view to how the uncertainty in the data should be handled. Data Analysis Australia discusses why this has happened historically and how it can be addressed today, using the reporting of arsenic levels in soil as an example.
Weather Statistics: Are Averages Useful? - On 1 December 2010, the ABC reported that, according to the Bureau of Meteorology (BOM), Australia had a bumper spring for rainfall, 160mm on average across the land - about 100mm more than last year. Fantastic news... so why isn't everyone happy about this? The BOM report also included a statement that the South West of Western Australia not only had its warmest spring on record but also its driest year on record. So what information, if any, is the average rainfall for Australia giving us? It appears that the average rainfall figure masks more important and interesting information. This article discusses the spring rainfall averages and possible ways of calculating rainfall data so that it is more meaningful.
What Size Sample Do I Need? - Our latest Analytical Ideas article discusses the issues that need to be considered when selecting an appropriate sample size. We are often asked by our clients - "What size sample do I need?". There are a number of sampling methods that can be used, but a thorough understanding of our clients' requirements plus in-depth knowledge of sampling methods is required to get the best value. Data Analysis Australia can help our clients choose the most appropriate sampling method, which can often lead to a reduction in sample size while still maintaining the required level of statistical significance. This article outlines a number of sampling methods, including simple random, stratified, cluster and acceptance sampling, and explains why 400 is not always the magic number.
Statistics in the Media - Statistics - meaning numbers from various sources - are often quoted in the media as if they have a clear interpretation, usually one supporting a particular point of view. Professional statisticians realise that such statistics are rarely as simple as they may at first appear. Checking where the data comes from, how it was collected and what it really measures often leads to a very different view of what the statistics mean. Researching sources and understanding their reliability is something that professional journalists are expected to do all the time, so it is reasonable to ask why they do not do it with numbers. This Analytical Ideas article explores two real examples where the story behind the numbers reveals a very different truth.
Databases and Statisticians - Statisticians work with data. Database professionals manage data. It would seem that the two groups would have much to talk about and should be seen assisting each other. At Data Analysis Australia we have both statisticians and computer scientists working together, but our experience suggests that our approach is not common. In this Analytical Ideas article, John Henstridge gives his perspective on this, based on over thirty years' involvement in statistical computing. He reviews how statisticians and computer scientists reacted in different ways to the development of computers since the 1950s, leading to different ways (and different vocabularies) of approaching the same problems. He suggests that each side needs to listen to the other to avoid wasteful reinvention of ideas.
Statistical Graphics - Charts, diagrams and graphs have been used to present statistical data and results for centuries. The purpose is to present a finding or a summary of information in a manner where it can be readily understood. However, they also have a dark side, where graphics are used to present a particular view or a biased interpretation of the data.
Political Polls - Confidence in political polls is a belief that the results will be close to the actual election. The recent election has shown that not all polls deserve our confidence. This raises the question: why do some polls perform better than others?
Questionnaire Validity - Due to the changing nature of the workforce, many organisations are required to undertake some forward planning to minimise any potential impacts relating to loss of staff. Statisticians are required to use a variety of investigative techniques to ease these impacts, including analysis of specifically designed surveys and forecasting models.
Cricket Scoring - The adjustment of cricket scores when play is interrupted is a statistical or mathematical problem. The standard in use today - the Duckworth-Lewis Method - was developed by two statisticians. This article explains how a simple mathematical model is used together with empirical data to solve this problem.
Response Rates - Survey results can be misleading! One way that many people are lulled into a false sense of security is when surveys quote a high "response rate". But what exactly is a response rate? And how does it influence the quality of the survey results?
Lawyers and Statisticians are often seen as opposites. However, in this Analytical Ideas article John Henstridge presents the argument that they have much in common, a conclusion reached through extensive experience as an expert witness in court.
Data Mining is an area commonly talked about today but rarely defined. Another paper in our Analytical Ideas series presents the statistician's view of Data Mining and argues that statisticians are uniquely placed to utilise these techniques.
Other papers in this series are: