Today's organisations are data-centric. Information capture is becoming easier and storage limits have effectively disappeared. Databases are at the heart of the organisation and often contain critical information on customers, sales, operations and their history. Without this information the organisation may stop functioning - it is a critical asset. But unlike other, supposedly more real, assets, this one does not appear on the balance sheet.
One reason is that the value of information to an organisation depends upon its relevance and quality. And the unfortunate experience of many organisations when faced with challenges is that the quality is found to be not good enough or the right data has not been collected. This is sometimes discovered when a major upgrade or redesign is carried out - the data simply does not match the specifications so it is difficult or impossible to incorporate into the new system. Sometimes the data is needed in an emergency and vital parts are found to be missing.
An example is a simple customer name and address file of the kind likely to be found in a Customer Relationship Management (CRM) system. This information degrades over time as people move and the updates are not made. Names and addresses are often recorded in a free text form with little or no structural checking. This works well most of the time - just think of letters you have received even though they carried incorrect addresses - since the human postman can often interpret technically wrong details and use other information to overcome the problems. However, computerised systems are usually less forgiving.
Data Analysis Australia has recognised that a probabilistic or statistical approach is best for assessing just how good data is, for capturing it more effectively and for making sense of it. It is also necessary to have a strategic approach to the IT systems and processes to ensure that they assist rather than hinder the maintenance of quality data. For this purpose we have formed an alliance with Iain Massey and Associates, combining the best of analysis with the best of strategic IT planning to provide an information audit service. An example of this collaborative work is a recent review of the "Names Database" at the heart of the CRM system used by the Insurance Commission of WA. The project required not only a review of the data itself but also consideration of the data capture process and the way the data was used.
A similar approach was used when Data Analysis Australia worked with telecommunications consultants Gibson Quai on the audit of the Integrated Public Number Database (IPND) for the Australian Communications and Media Authority. Here a major issue was the sheer size of the problem - comparing over 40 million records in the IPND against a reference set of over 30 million. A cluster of high-speed computers together with clever algorithms was used to do this in reasonable time.
Data Analysis Australia also uses these approaches internally. An example is found in the data capture systems of the Perth and Regions Travel Survey. Large volumes of imperfect address information are collected in self-completion surveys and have to be converted to a reliable geocoded form. We developed smart probabilistic algorithms that compare an address being captured with every possible Perth address - almost one million - and determine the best fit using a Bayesian probability criterion that allows for incompleteness, spelling and other errors. This happens in real time as data entry takes place, again requiring smart, efficient algorithms. A quality index is also given so that if the data is ambiguous it can be flagged for immediate follow up.
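The general idea of probabilistic address matching can be illustrated with a small Python sketch. This is not the algorithm used in the survey system, whose details are not given here; it is a minimal illustration under stated assumptions. Every candidate address is given an equal prior, spelling errors are penalised by edit distance, and a token with no close counterpart is treated as missing at a fixed cost. The function names, the two cost parameters and the 0.8 threshold are all hypothetical choices made for the example, and the brute-force scan would need indexing and other shortcuts to cope with a million addresses in real time.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (a transposition counts as two edits)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def token_cost(tokens, others, error_cost, missing_cost):
    """Penalty for explaining each token by its closest token in `others`."""
    total = 0.0
    for tok in tokens:
        if others:
            nearest = min(edit_distance(tok, o) for o in others)
            # Capped: beyond a few edits the token is treated as missing.
            total += min(error_cost * nearest, missing_cost)
        else:
            total += missing_cost
    return total


def log_likelihood(observed, candidate, error_cost=1.0, missing_cost=3.0):
    """Assumed error model: log P(observed | candidate), up to a constant."""
    obs = observed.upper().split()
    cand = candidate.upper().split()
    return -(token_cost(cand, obs, error_cost, missing_cost)
             + token_cost(obs, cand, error_cost, missing_cost))


def best_match(observed, candidates, threshold=0.8):
    """Return (best candidate, quality index, needs_follow_up).

    With a uniform prior the posterior is the normalised likelihood;
    the quality index is the posterior probability of the best candidate.
    """
    scores = [log_likelihood(observed, c) for c in candidates]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # subtract max for stability
    best = max(range(len(candidates)), key=weights.__getitem__)
    quality = weights[best] / sum(weights)
    return candidates[best], quality, quality < threshold
```

Calling `best_match("12 Smiht St Perth", reference_addresses)` on a list of reference addresses returns the most probable address together with its quality index. An entry that fits two reference addresses equally well receives a quality index of about 0.5 and is flagged for follow up.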
The lesson from these experiences is that, as organisations rely more upon large computerised databases, it is essential that they focus on exactly what they are storing in them. By themselves computers do not solve the problems; they merely change them.
For further information on this and particularly on our information audit service, please contact Data Analysis Australia at daa(at)daa.com.au or phone 08 9468 2533.