
Big Data and Machine Learning

In many situations the problem is not a lack of data but an apparent excess of it. Even then, the question remains: "what does it all mean?" Interpreting large volumes of data and finding the gems of information is the challenge. At Data Analysis Australia we have the IT and statistical capability to store and process such data sets and to provide useful and valuable information to the client.

Data Analysis Australia approaches this from a statistical viewpoint, but one informed by modern computer science, so that the two combine to fully dissect the data and give the client real insight into "what the data means". This gives several advantages over approaches that are essentially driven by computer technology:

  • Statistics provide the most effective methods of understanding relationships in data. While these are often standard to statisticians, their application to large datasets requires special algorithms and an understanding of how the data is stored.

  • Statistics provide measures of significance for what is found. This is critical since data mining methods may throw up many chance patterns and it is important to discard those that are not real.

  • Statistics provide a methodology for exploration using subsets of the data, often saving enormous amounts of computer time that might be prohibitive on many operational systems.
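
The last point above can be illustrated with a minimal sketch. It assumes a simple random sample and a synthetic, normally distributed "large dataset"; the function name `estimate_mean_from_sample` and all the numbers are hypothetical and are not taken from any Data Analysis Australia project. The idea is that a small subset can give an estimate of a population quantity together with a standard error that says how far it is likely to be from the full-data answer.

```python
import math
import random
import statistics


def estimate_mean_from_sample(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns (estimate, standard_error). The standard error ignores the
    finite population correction, which is negligible when the sample
    is a small fraction of the data.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, std_error


# Synthetic stand-in for a large dataset (an assumption for illustration).
data_rng = random.Random(42)
population = [data_rng.gauss(100.0, 15.0) for _ in range(200_000)]

# Explore using 1% of the records instead of all of them.
estimate, std_error = estimate_mean_from_sample(population, 2_000, seed=1)
full_mean = statistics.fmean(population)

print(f"sample estimate: {estimate:.2f} (standard error {std_error:.2f})")
print(f"full-data mean:  {full_mean:.2f}")
```

In this sketch the subset touches 1% of the records, and the standard error indicates whether that precision is adequate before committing computer time to the full dataset.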

Systems at Data Analysis Australia are optimised for handling large datasets: we have the capacity to handle databases measured in hundreds of gigabytes, and our high-bandwidth network means that computation is rarely a limiting factor. A range of software tools is available so that the best can be chosen for each task or step in an analysis.

Data Analysis Australia is frequently consulted by organisations that need to make the most effective use of their data. Since our focus is on information content and what information is required to drive decisions, our expertise complements that of more traditional IT companies.


The Conscience of Algorithms

How does AI impact people?

Much has been written over the years on how Artificial Intelligence (AI) and Machine Learning (ML) will impact our lives, often in polarised terms. On one hand, computers and intelligent robotics can perform dangerous jobs and tasks (saving lives), act as virtual assistants, and provide automated transportation, to name just a few. On the flip side, some argue that automation will result in a heavy loss of “old” economy jobs, with much of the skilled workforce becoming obsolete and needing to retrain, or even that AI will be mankind’s ultimate downfall. The reality is that automation and intelligent machines are already here. While the world is approaching the decision-making processes of driverless vehicles with caution, other algorithms are already out there, shaping our experiences as individuals and as a society. It’s time to ask what their impact is. Should there be regulation of the algorithm?

Data Mining

Statistics, or Something Else?

Statisticians have very mixed views on data mining. At one stage it was a term of derision. To mine data suggested digging into data so much that something was bound to be found, without regard as to whether it was there purely by chance. The implication was that the mining proceeded until what was found fitted preconceived ideas. Formal statistics had been developed precisely to overcome such dangers and every undergraduate student of statistics was taught to define hypotheses to be tested before looking at the data.
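
The danger described above, that enough digging is bound to find something by chance, can be shown with a small simulation. This is a sketch under stated assumptions, not a method attributed to the source: every variable is pure random noise, the p-values use the Fisher z approximation for a correlation coefficient, and the Bonferroni correction stands in for "defining hypotheses before looking at the data". All function names here are hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def p_value(r, n):
    """Two-sided p-value for r = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def count_chance_findings(n_obs=50, n_vars=200, alpha=0.05, seed=0):
    """Correlate a noise target with many noise variables.

    Returns (naive, corrected): the number of 'significant' variables at
    level alpha, and the number surviving a Bonferroni correction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    p_values = []
    for _ in range(n_vars):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values.append(p_value(pearson_r(target, candidate), n_obs))
    naive = sum(p < alpha for p in p_values)
    corrected = sum(p < alpha / n_vars for p in p_values)
    return naive, corrected


naive_hits, corrected_hits = count_chance_findings()
print(f"'significant' at 0.05 with no correction: {naive_hits} of 200")
print(f"still significant after Bonferroni:       {corrected_hits} of 200")
```

Since no real relationship exists in this data, every uncorrected "finding" is a chance pattern; about 5% of the variables are expected to pass the naive threshold, while few or none should survive the correction.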


Related Case Studies


Profiles of Power

The Problem: Horizon Power needed to understand typical electricity consumption patterns of residential customers to assess the effect of potential changes in customer behaviour.

The Data Analysis Australia Approach: Simplify a large data set so that different groups of customers could be identified and understood based purely on their electricity consumption, then use a modelling approach to estimate the year-long pattern for each customer group.

The Result: A set of typical consumption patterns for different types of residential customers that can be used for ongoing ‘what if’ investigations of many different types of customer behaviour change, assisting Horizon Power in its short- and long-term strategic planning.


Statistical Data Science



Analyse, visualise, and model data using the latest statistical and data science techniques

Surveys



Develop, carry out and analyse surveys to understand perception and find business insights

Forecasting and Prediction



Discover trends and predict the future with data

Spatial Analysis and Mapping



Learn how data is spatially correlated to inform strategy

Business and Risk Analysis



Understand how risk and uncertainty can be minimised in business decisions

Big Data and Machine Learning



Uncover trends and relationships to gain valuable insights from big data

Simulation and Optimisation



Determine the optimal way to operate in the future

Mining Analytics



Improve processes and uncover new insights

Interactive Dashboards



Make informed decisions from real-time data with intuitive visuals and information