Fitting a mixture model to three-mode three-way data with missing information

Lynette A Hunt and Kaye E Basford. Fitting a mixture model to three-mode three-way data with missing information. Journal of Classification, 18(2):209-226, 2001.
[ bib | http ]
When the data consist of certain attributes measured on the same set of items in different situations, they would be described as a three-mode three-way array. A mixture likelihood approach can be implemented to cluster the items (i.e., one of the modes) on the basis of both of the other modes simultaneously (i.e., the attributes measured in different situations). In this paper, it is shown that this approach can be extended to handle three-mode three-way arrays where some of the data values are missing at random in the sense of Little and Rubin (1987). The methodology is illustrated by clustering the genotypes in a three-way soybean data set where various attributes were measured on genotypes grown in several environments.


Tag Insertion Complexity

Stuart Yeates, Ian H. Witten, and David Bainbridge. Tag insertion complexity. In Data Compression Conference, pages 243-252. IEEE Computer Society, 2001.
[ bib ]

Applications of machine learning in information retrieval

Sally Jo Cunningham, James Littin, and Ian H. Witten. Applications of machine learning in information retrieval. In M. E. Williams, editor, Annual Review of Information Science and Technology, pages 341-419. American Society for Information Science and Technology, 2001.
[ bib ]

Determining Progression in Glaucoma Using Visual Fields

Andrew Turpin, Eibe Frank, Mark Hall, Ian H. Witten, and Chris A. Johnson. Determining progression in glaucoma using visual fields. In Proc 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Hong Kong, China, pages 136-147. Springer, 2001.
[ bib | .ps | .pdf ]
The standardized visual field assessment, which measures visual function in 76 locations of the central visual area, is an important diagnostic tool in the treatment of the eye disease glaucoma. It helps determine whether the disease is stable or progressing towards blindness, with important implications for treatment. Automatic techniques to classify patients based on this assessment have had limited success, primarily due to the high variability of individual visual field measurements. The purpose of this paper is to describe the problem of visual field classification to the data mining community, and assess the success of data mining techniques on it. Preliminary results show that machine learning methods rival existing techniques for predicting whether glaucoma is progressing - though we have not yet been able to demonstrate improvements that are statistically significant. It is likely that further improvement is possible, and we encourage others to work on this important practical data mining problem.


Optimizing the Induction of Alternating Decision Trees

Bernhard Pfahringer, Geoffrey Holmes, and Richard Kirkby. Optimizing the induction of alternating decision trees. In D. Cheung, G.J. Williams, and Q. Li, editors, Proc 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 477-487. Springer, April 2001.
[ bib | .ps | .pdf ]
The alternating decision tree brings comprehensibility to the performance enhancing capabilities of boosting. A single interpretable tree is induced wherein knowledge is distributed across the nodes and multiple paths are traversed to form predictions. The complexity of the algorithm is quadratic in the number of boosting iterations and this makes it unsuitable for larger knowledge discovery in database tasks. In this paper we explore various heuristic methods for reducing this complexity while maintaining the performance characteristics of the original algorithm. In experiments using standard, artificial and knowledge discovery datasets we show that a range of heuristic methods with log linear complexity are capable of achieving similar performance to the original method. Of these methods, the random walk heuristic is seen to outperform all others as the number of boosting iterations increases. The average case complexity of this method is linear.


Prediction of Ordinal Classes Using Regression Trees

Stefan Kramer, Gerhard Widmer, Bernhard Pfahringer, and Michael de Groeve. Prediction of ordinal classes using regression trees. Fundam. Inform., 47(1-2):1-13, 2001.
[ bib | .ps | .pdf ]
This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes using classification and regression trees. We start with S-CART, a tree induction algorithm, and study various ways of transforming it into a learner for ordinal classification tasks. These algorithm variants are compared on a number of benchmark data sets to verify the relative strengths and weaknesses of the strategies and to study the trade-off between optimal categorical classification accuracy (hit rate) and minimum distance-based error. Preliminary results indicate that this is a promising avenue towards algorithms that combine aspects of classification and regression.


A Simple Approach to Ordinal Classification

Eibe Frank and Mark Hall. A simple approach to ordinal classification. Technical Report 01/05, Department of Computer Science, University of Waikato, 2001.
[ bib | .ps | .pdf ]
This is an updated version of a paper with the same title that appeared at the European Conference on Machine Learning 2001, Freiburg, Germany. Springer-Verlag, pp. 145-156.

Machine learning methods for classification problems commonly assume that the class values are unordered. However, in many practical applications the class values do exhibit a natural order - for example, when learning how to grade. The standard approach to ordinal classification converts the class value into a numeric quantity and applies a regression learner to the transformed data, translating the output back into a discrete class value in a post-processing step. A disadvantage of this method is that it can only be applied in conjunction with a regression scheme. In this paper we present a simple method that enables standard classification algorithms to make use of ordering information in class attributes. By applying it in conjunction with a decision tree learner we show that it outperforms the naive approach, which treats the class values as an unordered set. Compared to special-purpose algorithms for ordinal classification our method has the advantage that it can be applied without any modification to the underlying learning scheme.
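
The "standard approach" mentioned in the abstract above (encode the ordered classes as numbers, fit a regression learner, then map the output back to a class) can be sketched roughly as follows. This is only an illustration of that baseline, not the method proposed in the paper; the grade labels, the single numeric feature, and the use of one-variable least squares as the regression scheme are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical ordered class values, lowest to highest (not from the paper).
GRADES = ["poor", "fair", "good", "excellent"]

def fit_line(xs, ys):
    """Ordinary least squares on one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def train(xs, labels):
    """Convert each class label to its rank, then fit the regression."""
    ranks = [GRADES.index(label) for label in labels]
    return fit_line(xs, ranks)

def predict(model, x):
    """Post-processing step: round the numeric output to the nearest rank
    and clamp it to the valid range before mapping back to a label."""
    slope, intercept = model
    rank = round(slope * x + intercept)
    rank = min(max(rank, 0), len(GRADES) - 1)
    return GRADES[rank]

xs = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["poor", "poor", "fair", "fair",
          "good", "good", "excellent", "excellent"]
model = train(xs, labels)
```

The sketch also makes the stated disadvantage visible: the whole pipeline depends on having a regression scheme to plug in, whereas the paper's method is meant to work with standard classification algorithms.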


A Simple Approach to Ordinal Classification

Eibe Frank and Mark Hall. A simple approach to ordinal classification. In Proc 12th European Conference on Machine Learning, Freiburg, Germany, pages 145-156. Springer, 2001. Note: there is a small bug in the description of the algorithm. Please consult [7] instead.
[ bib ]

Interactive machine learning: letting users build classifiers

Malcolm Ware, Eibe Frank, Geoffrey Holmes, Mark Hall, and Ian H. Witten. Interactive machine learning: letting users build classifiers. Int. J. Hum.-Comput. Stud., 55(3):281-292, 2001.
[ bib | .ps | .pdf ]
According to standard procedure, building a classifier using machine learning is a fully automated process that follows the preparation of training data by a domain expert. In contrast, interactive machine learning engages users in actually generating the classifier themselves. This offers a natural way of integrating background knowledge into the modeling stage - so long as interactive tools can be designed that support efficient and effective communication. This paper shows that appropriate techniques can empower users to create models that compete with classifiers built by state-of-the-art learning algorithms. It demonstrates that users - even users who are not domain experts - can often construct good classifiers, without any help from a learning algorithm, using a simple two-dimensional visual interface. Experiments on real data demonstrate that, not surprisingly, success hinges on the domain: if a few attributes can support good predictions, users generate accurate classifiers, whereas domains with many high-order attribute interactions favor standard machine learning techniques. We also present an artificial example where domain knowledge allows an expert user to create a much more accurate model than automatic learning algorithms. These results indicate that our system has the potential to produce highly accurate classifiers in the hands of a domain expert who has a strong interest in the domain and therefore some insights into how to partition the data. Moreover, small expert-defined models offer the additional advantage that they will generally be more intelligible than those generated by automatic techniques.


(The Futility of) Trying to Predict Carcinogenicity of Chemical Compounds

B. Pfahringer. (The futility of) trying to predict carcinogenicity of chemical compounds. In The Predictive Toxicology Challenge Workshop, Twelfth European Conference on Machine Learning (ECML2001), Freiburg, Germany, 2001.
[ bib | .ps | .pdf ]
This paper describes my submission to one of the sub-problems formulated for the Predictive Toxicology Challenge 2001. The challenge is to predict the carcinogenicity of chemicals based on structural information only. I have only tackled such predictions for bioassays involving male rats. As we currently do not know the true predictions for the test set, all we can say is that one of the models supplied by us seems to be optimal over some subrange of the ROC spectrum. The successful model uses a voting approach based on most of the sets of structural features made available by various other contestants as well as the organizers in an earlier phase of the Challenge. The WEKA Machine Learning workbench served as the core learning utility. Based on a preliminary examination of our submission we conclude that reliable prediction of carcinogenicity is still a far away goal.


Wrapping Boosters against Noise

Bernhard Pfahringer, Geoffrey Holmes, and Gabi Schmidberger. Wrapping boosters against noise. In Proc 14th Australian Joint Conference on Artificial Intelligence, pages 402-413. Springer, 2001.
[ bib | .pdf | .ps ]
Wrappers have recently been used to obtain parameter optimizations for learning algorithms. In this paper we investigate the use of a wrapper for estimating the correct number of boosting ensembles in the presence of class noise. Contrary to the naive approach that would be quadratic in the number of boosting iterations, the incremental algorithm described is linear.

Additionally, directly using the k-sized ensembles generated during k-fold cross-validation search for prediction usually results in further improvements in classification performance. This improvement can be attributed to the reduction of variance due to averaging k ensembles instead of using only one ensemble. Consequently, cross-validation in the way we use it here, termed wrapping, can be viewed as yet another ensemble learner similar in spirit to bagging but also somewhat related to stacking.
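
The idea of keeping the k models produced during k-fold cross-validation and averaging their predictions can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's algorithm: the base learner here is a hypothetical one-feature threshold stump standing in for a boosted ensemble, the fold assignment is a simple index-modulo split, and the incremental search for the number of boosting iterations is not shown.

```python
from statistics import mean

def train_stump(points):
    """Hypothetical base learner (stand-in for a boosted ensemble):
    a threshold halfway between the means of the two classes."""
    neg = [x for x, y in points if y == 0]
    pos = [x for x, y in points if y == 1]
    return (mean(neg) + mean(pos)) / 2

def stump_score(threshold, x):
    """Score 1.0 for the positive class, 0.0 for the negative class."""
    return 1.0 if x > threshold else 0.0

def wrap(points, k):
    """Train one model per cross-validation fold and keep all k of them.
    Assumes every training split still contains both classes."""
    models = []
    for fold in range(k):
        train = [p for i, p in enumerate(points) if i % k != fold]
        models.append(train_stump(train))
    return models

def predict(models, x):
    """Average the k models' scores instead of relying on a single model."""
    avg = mean(stump_score(t, x) for t in models)
    return 1 if avg >= 0.5 else 0

points = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
          (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
models = wrap(points, k=4)
```

Because each of the k models is trained on a slightly different subset of the data, averaging them resembles bagging, which is the connection the abstract draws.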


Investigation of association models to describe consumer purchase patterns

Mark Andrew Hall, N. J. Kusabs, D. Gillgren, and A. F. Bollen. Investigation of association models to describe consumer purchase patterns. In Proc International Symposium on Applications of Modeling as an Innovative Technology in the Agri-Food-Chain, pages 167-173. ISHS, 2001.
[ bib ]

Food Process Modelling

G. Holmes and T. Smith. Food Process Modelling, chapter Data mining. Woodhead Publishing Ltd, Cambridge, UK, 2001.
[ bib ]