Graduate students

These are my graduate students. I'm proud of them. I try to keep them very busy. :-)

Shaoqun Wu is seeking automatic ways to construct lexical acquisition systems from a collection of text that teachers or learners provide. She is looking at how to identify important and typical lexical items for language learning in a given corpus, using human language, artificial intelligence, and information retrieval technologies. She is exploring the pedagogical value of these lexical items and using them to construct a computer-assisted learning system that facilitates lexical acquisition.

You can play with some of the language games she has developed in our Flexible Language Acquisition (FLAX) project.

Michael Walmsley is designing second language learning activities that utilise time spent reading online to refresh and extend your language skills. He is aiming to combine research in second language acquisition and computational linguistics to develop systems for automatically creating vocabulary lists and second-language reading texts from document collections. The lists and texts must be tailored specifically to the interests, goals and abilities of individual learners.

Software for Japanese and Spanish language learners will be developed to evaluate the systems and activities in longitudinal user studies.

Anna Huang is interested in document clustering. She is investigating algorithms for interactive clustering, and the scatter-gather approach for facilitating the browsing of digital library collections. She is also interested in improving the efficiency of clustering, particularly incremental clustering, by using keyphrases extracted from free text as descriptors of document content. She would like to investigate the potential theoretical and practical performance improvements from using automatically extracted keyphrases as the basic document representation.

Craig Schock is interested in improving the development and maintenance of medium and large-scale software systems. Software systems are under constant pressure to change, so they must remain flexible. The evolution of a software system is heavily dependent on its structure. Network theory has been used to analyze complex systems in a variety of fields and has shown that specific structural characteristics contribute greatly to the evolvability of a system.

Craig's goal is to validate network theory as a viable mechanism for evaluating the evolvability of software systems.

Veronica Liesaputra is working on our Realistic Book project for her PhD. Her goals are (a) to produce a three-dimensional book model that is natural and interactive; (b) to represent Wikipedia as a huge three-dimensional book; and (c) to produce a three-dimensional visualization of a personal digital library.

She has produced a lightweight implementation of realistic books that provides a quick, easy-to-use, and responsive page-turning mechanism and supports hyperlinks and animated media. You can see sample books or make your own from an HTML or PDF file.

Daniel McEnnis's task is simple to describe: separate the world's music into music a given listener will like and music that listener will not like. The complexity of music recommendation systems has increased rapidly in recent years, drawing upon different sources of information: content analysis, web-mining, social tagging, etc. However, the tools to scientifically evaluate such integrated systems are not readily available; nor are the base algorithms.

He has produced the Relational Analysis Toolkit, which provides a large library of graph-analysis routines within a framework that seamlessly integrates both flat and graph-based algorithms.

Olena Medelyan is interested in natural language processing techniques applied to information retrieval, information extraction and text mining. Her PhD research is on automatic indexing with controlled vocabularies.

The central hypothesis of her thesis is that

  • with access to domain and general semantic knowledge, computers will index better than humans.

Olena has produced the latest version of the KEA algorithm for keyphrase extraction.

The goal of David Milne's PhD is to develop a framework for automatically generating highly accurate, concise thesauri for domain-specific document collections. He hypothesizes that (a) for any document set, automatically created thesauri suit users' searching needs better than manually defined ones, and (b) appropriately crafted thesauri can be integrated into the searching process to improve retrieval without placing an undesirable cognitive load on the user.

David's Koru is an example of interactive query expansion: a highly responsive web application based on AJAX. He has also produced the Wikipedia Miner toolkit.

Rob Akscyn is proposing a radical shift in software tools for knowledge workers: from highly fragmented applications using flat data models with pull-down-intensive interfaces to lattice-structured hypermedia, rich with knowledge schemas, in concert with extreme direct-manipulation user interfaces, digital library search technology, and personal agent assistants.

The central hypothesis of his thesis is that the productivity of knowledge work will be significantly improved by transitioning from current computer-based tools to this new knowledge work paradigm.

Kathryn Hempstalk's PhD topic is Continuous Typist Recognition Using Machine Learning. She is trying to figure out to what extent typists can be identified by the patterns they exhibit while typing at a keyboard. She is focused on a setting where the user is continuously monitored, rather than on password hardening, where only the login identification is monitored.

Kathryn produced the Digital Invisible Ink Toolkit, a Java steganography tool that can hide any sort of file inside a digital image, and the Digital Image Resizer Toy, an implementation of Avidan and Shamir's algorithm for "content aware" image resizing.

Past graduate students

Here are a few of my past graduate students, and where they are now (when known). Most don't seem to have web pages; perhaps I kept them too busy!

Kathryn Hempstalk
2009 PhD Continuous typist verification using machine learning
Livestock Improvement Centre, Hamilton, New Zealand

Lin-Yi Chou
2006 PhD Improving the performance of hierarchical hidden Markov models on information extraction

David Milne
2006 MSc From phrase browsing to interactive query expansion
Computer Science Dept, University of Waikato

Shaoqun Wu
2006 MSc A language learning digital library
Computer Science Dept, University of Waikato

Angela Mlynarski
2006 MSc Automatic text summarization in digital libraries
University of Lethbridge, Canada

Imene Jaballah
2005 MCMS Digital libraries for personal information management

Kathy Don
2002 MSc Efficient phrase hierarchy inference
Computer Science Dept, University of Waikato

Yong Wang
2001 PhD A new approach to fitting linear models in high-dimensional spaces
Department of Statistics, University of Auckland

YingYing Wen
2001 MPhil Text mining using HMM and PPM
School of Computer Science and Software Engineering, Monash University

Eibe Frank
2000 PhD Pruning decision trees and lists
Computer Science Dept, University of Waikato

Hong Chen
2000 MSc A new architecture for digital libraries

Gordon Paynter
2000 PhD Automating iterative tasks with programming by demonstration
National Library of New Zealand

Tony Smith
2000 PhD N-gram models of agreement in language
1993 MSc Language inference from function words
Computer Science Dept, University of Waikato

Stuart Inglis
1999 PhD Lossless document image compression
Managing director of ReelTwo

Zane Bray
1999 MSc Using language models for generic entity extraction

Jamie Littin
1996 MCMS Learning relational ripple-down rules
Information Technology Services, University of Waikato

Craig Nevill-Manning
1996 PhD Inferring sequential structure
Engineering Director at Google

Matt Humphrey
1996 PhD A graphical notation for the design of information visualisations
Information Visualization Technology, Stafford, VA

Brent Martin
1995 MSc Instance-based learning: nearest neighbour with generalisation
Department of Computer Science, University of Canterbury

David Maulsby
1994 PhD Instructible agents
1988 MSc Inducing procedures interactively: adventures with Metamouse
24C Group Inc., Calgary, Canada

Thong Phan
1994 PhD Function induction
1989 MSc The equal-value search: accelerating search in function induction

Abdul Saheed
1993 MCMS Processing textual images
Walker Architects, New Zealand

Anja Haman
1992 MSc Deformation-based modeling
atlargemedia, Canada

Brent Krawchuk
1992 MSc Inductive theorem generation

Darrell Conklin
1990 MSc Prediction and entropy of music
School of Informatics, City University, London

Antonija Mitrovic
1990 MSc Interactive induction of procedures
Department of Computer Science, University of Canterbury

Dan Mo
1989 MSc Learning text editing procedures from examples

John Darragh
1988 MSc Adaptive predictive text generation and the Reactive Keyboard

Saul Greenberg
1988 PhD Tool use, re-use, and organization in command-driven interfaces
1984 MSc User modeling in interactive computer systems

Mike Bonham
1985 MSc Viewing and formatting documents on-line

Adrian Zissos
1985 MSc Generating advice by monitoring user behaviour

Roy Masrani
1985 MSc Conceptual analysis in Prolog

Rod Cuff
1982 PhD Database query using menus and natural language fragments
1979 MSc Database query systems for the casual user

John Yardley
1981 PhD Automatic construction of word vocabularies for connected speech recognition

John Foster
1980 MSc A C cross-compiler for the 8086
Department of Electrical and Computer Engineering, University of Essex

Franklin Ha
1980 MSc Low bit-rate facsimile transmission of handwriting

John Abbess
1978 MSc A microprocessor-based speech synthesis by rule system

Angela Corbett
1974 MSc A telephone enquiry service using synthetic speech

Stephen Crocker
1974 MSc A personal computer terminal using packet switching