Machine Learning Group

Eibe Frank, Geoff Holmes, Michael Mayo, Tony Smith, Ian Witten

Machine learning is concerned with automatically extracting useful information from data. The aim is to identify patterns that can be used to understand the domain from which the data was collected and to make predictions. Consider the task of constructing a spam filter for email messages. Instead of laboriously handcrafting a set of filtering rules, we can use machine learning to extract patterns that differentiate spam from legitimate mail ("ham"), based solely on a collection of messages that have each been labelled as spam or ham, and then use those patterns in the filter. As a second example, consider the task of understanding customer preferences by mining supermarket checkout data for interesting patterns. There are efficient algorithms that can automatically extract such patterns in the form of simple if-then rules. Given the vast amounts of raw data now available in electronic form, machine learning techniques have a wide range of applications.
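To make the spam filter example concrete, the following is a minimal sketch of learning from labelled messages. It uses a naive Bayes classifier with Laplace smoothing, chosen here only because it is short; it is not presented as the group's method or as WEKA code. The training messages and the function names train and classify are invented for illustration.

```python
import math
from collections import Counter

# Toy labelled messages, invented purely for illustration.
TRAINING = [
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(messages):
    """Count word occurrences per class and the number of messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts


def classify(text, word_counts, class_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: fraction of training messages carrying this label.
        score = math.log(class_counts[label] / total_messages)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            score += math.log(
                (word_counts[label][word] + 1) / (n_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, class_counts = train(TRAINING)
print(classify("win a cash prize", word_counts, class_counts))
print(classify("agenda for the team meeting", word_counts, class_counts))
```

With this toy data the first message is labelled spam and the second ham. No filtering rule was written by hand: the word statistics that separate the two classes come entirely from the labelled examples.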

Our team works on new, more efficient algorithms for machine learning and data mining, as well as on new applications. We are well known for a software "workbench" called the Waikato Environment for Knowledge Analysis (WEKA), which contains a large number of machine learning techniques. WEKA is widely used for research, teaching, and commercial applications of machine learning. It has been used to help determine what information dairy farmers use in deciding which cows to keep in their herds, and it has been applied to bioinformatics problems such as gene interaction discovery. Other applications include mining supermarket transaction data for high-profit product associations, predicting the levels of chemicals such as nitrogen and carbon in soils to aid farmers' fertilizer decisions, and processing natural language to extract keywords from documents.

Further details on the group can be found at: