Department of
Computer Science
Tari Rorohiko

Computing and Mathematical Sciences

Seminars Presented in 2006

A light at the end of the funnel? Data analysis in Proteomics mass spectrometry experiments

Marc Kirchner
University of Heidelberg
Monday, 4th December 2006
Proteomics is the study of the structure, function, dynamics and relationships of all proteins present in an organism. Often considered the next logical step after Genomics, it faces highly complex, challenging situations due to ever-present chemical interactions in the proteome. In this field, mass spectrometry has become one of the most important tools because of its unmatched sensitivity and potential for high-throughput analysis. New developments in analyzer instrumentation and chemical preparation techniques have led to a rapid increase in the amount of data generated in each experiment, and we currently face a lack of suitable automated analysis methods specifically tailored to mass spectrometry data.

In general, data sets are huge, high-dimensional, sparse, and hold a multitude of technical challenges, including, but not limited to, normalization, feature extraction, correspondence estimation, significance analysis (biomarker detection), quality assessment and classification, and quantitation. Here, methods from statistical learning, pattern recognition, and data mining are natural candidates for tackling many of the problems at hand.

The talk gives an introduction to the biological relevance of MS data and to the types of data behind MS analysis, followed by an overview of our own work in the field so far. The main focus is on feature extraction (peak picking), unsupervised alignment solutions, and (tentative) quantitative approaches.


Experiences around the use of Situated Digital Displays to Support Coordination and Notions of Community

Keith Cheverst
Lancaster University, UK
Monday, 27th November 2006
In this talk I will describe our experiences with the design, deployment and evaluation of situated display based systems that support coordination and notions of community. I'll discuss the Hermes office door display system, which supports coordination between staff, and between staff and students, ostensibly through the display of awareness-related messages such as 'Sorry, running late - stuck in bad traffic' or 'gone for lunch...'. I'll also discuss the 'Hermes at Home' system, which supports awareness (again through messaging) between members of a home. Person(s) 'away' from the home can send messages via a web portal to an 'information appliance' style display situated in the home, while people at home can scribble messages on the touch-sensitive display of the 'always on' unit for reception by the person(s) away from the home. The system has been conceived as a technology probe and serves as a tool for investigating related issues such as awareness and intimacy between home inhabitants.

If there is time, I'll also present the findings of our initial studies into the adoption of a communal photo display (also conceived as a technology probe) to explore notions of community in a rural village in the North of England - the photo display is currently situated in the local post-office and you can view its content in real-time here:

The research has been undertaken as part of the EPSRC-funded CASIDE project; for more details please visit: http://www.caside.lancs.ac.uk


Two Digital projects at the National Library of New Zealand

Gordon Paynter
National Library of New Zealand
Tuesday, 21st November 2006
Gordon Paynter, who earned his BCMS and PhD from the Department of Computer Science and worked for the Digital Library project, returns to talk about two projects he has worked on in his first year at the National Library of New Zealand.

The Web Curator Tool Project
More and more of New Zealand's documentary heritage is only available online. Users find this content valuable and convenient, but its impermanence, lack of clear ownership, and dynamic nature pose significant challenges to any who attempt to acquire and preserve it. To solve these problems, the National Library and The British Library initiated a project to design and build an open-source Web Curator Tool to support selective web harvesting, where a selector identifies part or all of a website for collection, harvests it, reviews it, and archives it. This presentation describes selective web harvesting, the project, and the tool itself.

Using OCR and Greenstone to build a searchable newspaper collection
Papers Past is one of the largest digitised historic newspaper collections in the world, comprising over 1 million pages from around New Zealand. Although it is one of the National Library's more popular products, the five-year-old website is more than a little cumbersome, and it lacks one crucial feature that our users have come to expect: full-text search. The current collection of newspapers has been digitised from microfilm as "page images", which means there is no text version of each page to support search. To address this shortcoming, the Library has launched a project to explore using Optical Character Recognition (OCR) to produce a useful text version. This part of the talk will discuss our experience with OCR, and how DL Consulting Ltd (also NZDL alumni) used the Greenstone Digital Library software to build a new user interface for the collection.


Informing design through observations of convivial family life

Steve Howard
The University of Melbourne, Australia
Friday, 3rd November 2006
Interactive technologies are becoming ubiquitous in many people's lives. However, the domestic space in general, and families (married couples, young children, grandparents...) in particular, are under-represented in the research and design literatures of HCI and CSCW. This is despite evidence of the need to address a growing distance between increasingly distributed families.

In this seminar I describe two empirically-based design-oriented projects, 'Mediating Intimacy' and 'Intergenerational Play'. I will describe our approach (using both accepted and novel cultural probes), our findings, and present some early ideas for technology to support conviviality amongst family members.


Service orientation in electronic learning: achievements and lessons

Gottfried Vossen
University of Muenster, Germany
Tuesday, 17th October 2006
E-learning has now hit the mainstream of education world-wide. E-learning systems typically provide a closed-shop operation for a given community of users. We have contrasted this approach with service orientation, a paradigm that has recently gained great popularity in software development and application engineering. In service orientation, complex applications are decomposed into individual services that each provide some fixed functionality. Clients can discover services via public repositories, and can subscribe to them through providers. In electronic learning, service orientation starts from a process-oriented view of e-learning, in which the various components of an e-learning system are decomposed into individual functionalities, each of which is then amenable to realization as an atomic or complex service. Our LearnServe prototype has demonstrated to what extent this is feasible. It has also introduced us to several service issues that are currently under formal investigation.


Filtering and routing in general-purpose publish/subscribe systems

Sven Bittner
Department of Computer Science, The University of Waikato
Tuesday, 26th September 2006
The publish/subscribe communication paradigm can be used in a variety of application scenarios, ranging from low-level monitoring of distributed systems to high-level applications for electronic commerce. Familiar examples include newsgroups, RSS feeds, and Google Alerts.

Currently, most general-purpose publish/subscribe systems only allow subscriptions to be defined with the help of very restricted conjunctive definition languages. It is typically argued that every (more complex) subscription can be converted into several conjunctive expressions. This statement is generally correct, and the approach is successfully applied in the context of database management systems (DBMSs). However, publish/subscribe systems need to solve a different problem than DBMSs: a large number of subscriptions (continuous queries) has to be constantly evaluated and matched against incoming data (event messages). Therefore, we believe it is questionable whether this conversion approach is suitable in the publish/subscribe context.
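The conversion in question can be sketched as a rewrite into disjunctive normal form. The following toy (an illustration only, not the systems discussed in the talk; the tuple-based predicate representation is an assumption) shows how one general boolean subscription becomes several purely conjunctive ones, and hints at the blow-up in subscription count that the talk is concerned about:

```python
# A subscription predicate as a nested tuple: ("and", ...), ("or", ...),
# or an atomic condition given as a string such as "price < 100".
def to_dnf(expr):
    """Convert a predicate into a list of conjunctions (each a list of atoms)."""
    if isinstance(expr, str):                    # atomic condition
        return [[expr]]
    op, *args = expr
    branches = [to_dnf(a) for a in args]
    if op == "or":                               # union of the argument DNFs
        return [conj for b in branches for conj in b]
    if op == "and":                              # cross product of the argument DNFs
        result = [[]]
        for b in branches:
            result = [c1 + c2 for c1 in result for c2 in b]
        return result
    raise ValueError("unknown operator: %s" % op)

# One subscription with a disjunction becomes two conjunctive subscriptions:
# to_dnf(("and", "topic = news", ("or", "lang = en", "lang = de")))
```

Nested "and"/"or" terms multiply out, so a single rich subscription can explode into many conjunctive ones that all have to be matched against every incoming event.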

In this talk, I will outline the described conversion problem in more detail and give an overview of the main focus of my PhD project. The talk aims at a general audience. I will start by introducing the overall approach of publish/subscribe systems, then sketch the research I have undertaken and the findings I have obtained so far. I will therefore focus on general concepts and results rather than elaborating on technical details.


Cultural learning practices as a basis for the implementation and design of learning management systems

John Brine
Graduate School for Information Systems, University of Aizu, Japan
Tuesday, 12th September 2006
Understanding cultural learning practices can help to design LMS-based language learning environments. A learning management system (LMS) was implemented in six second-year technical reading classes in a Japanese computer science university. The LMS was used to build and support group work strategies similar to those that Japanese university students are familiar with from their prior public school education. Thus, we attempted to implement an LMS consistent with cultural learning practices. Recognition of the cultural learning practices in the design and implementation of learning management systems is intended to meet student needs and prepare students for more autonomous language development in advanced education and the workplace.


Towards a digital library for language learning

Shaoqun Wu
Department of Computer Science, The University of Waikato
Tuesday, 22nd August 2006
Digital libraries have untapped potential for supporting language teaching and learning. Although the Internet at large is widely used for language education, it has critical disadvantages that can be overcome in a more controlled environment. This article describes a language learning digital library, and a new metadata set that characterizes linguistic features commonly taught in class as well as textual attributes used for selection of suitable exercise material. On the system is built a set of eight learning activities that together offer a classroom and self-study environment with a rich variety of interactive exercises, which are automatically generated from digital library content. The system has been evaluated by usability experts, language teachers, and students.


Geographical data analysis using CCmaps and 3D parallel coordinates plots

Junji Nakano
Institute of Statistical Mathematics and the Graduate University for Advanced Studies in Tokyo, Japan
Friday, 7th July 2006


Digital library components: beyond plugins

Hussein Suleman
Department of Computer Science, University of Cape Town, Rondebosch
Tuesday, 27th June 2006

Components for managing information (often called digital library components) have been with us for some time now, ranging from proprietary extensions for popular software systems to general-purpose tools with arguably clean APIs. In recent years, a number of efforts have attempted to push the boundaries of system designs based on components for managing information. These excursions have ranged from visual interfaces to specification languages, and most recently to mobility and scalability of components and systems.

This talk will present an array of recent projects that built on the Open Digital Libraries initiative, presenting the multi-faceted advantages of components in terms of their support for common problems in system development such as: interface/portal design, system architecture, software configuration management, software distribution, scalability and portability.

The development of the component model over time will highlight how different features support the various use cases where this technology has been applied, and illustrate the utility of such features for future research and production component models and digital library or information management systems.


What I did on study leave: HCI & Digital Libraries at UQ, UCT and UIUC

Dave Nichols
Department of Computer Science, The University of Waikato
Tuesday, 6th June 2006
In 2005 my study leave took me around the world via Brisbane, Cape Town and Urbana-Champaign, Illinois. In this talk I'll describe the research I encountered and how it could influence our work, including prototyping with backpackers, yurt raising, smart bridges, web-based interface design, and librarians learning about Greenstone.


A conceptual overview of formal methods or FM without the pain of the mathematics

David Streader
Computer Science Department, The University of Waikato
Tuesday, 30th May 2006


Keyphrase indexing with controlled vocabularies: will computers outperform humans?

Olena Medelyan
Freiburg University Hospital, Germany
Tuesday, 23rd May 2006
Keyphrases are widely used in information retrieval as a brief but precise summary of documents. They are usually selected by professional human indexers. The more consistent the indexers are with each other, the higher the retrieval efficiency.

1. We describe an experiment where six professionals assigned keyphrases from a controlled vocabulary to the same documents, and evaluate their indexing consistency. Interesting patterns discovered in this experiment helped in developing an automatic approach for this task.

2. The keyphrase extraction algorithm KEA++ extracts phrases from the documents and maps them onto index terms from a domain-specific thesaurus. A machine learning scheme determines the most significant phrases based on their statistical, syntactic and semantic properties. The evaluation reveals that KEA++ is almost as consistent with the indexers as they are with each other.

3. It is important that a keyphrase set covers all main topics of a document. Currently I am improving KEA++ by using lexical chains, which are sequences of semantically related terms that reflect the discourse structure of the text. One of the tasks is to derive an efficient weighting technique to select the most significant chains in a document.

By incorporating more semantic information from controlled vocabularies and thesauri into KEA++ we intend to produce reliable and objectively correct keyphrase sets. In contrast to humans, the algorithm’s indexing will stay consistent over the whole document collection.
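The feature-based candidate scoring in item 2 can be sketched roughly as follows. This is a toy standing in for KEA++'s learned model: TF-IDF and first-occurrence position are genuine KEA-style features, but the fixed way they are combined here, and the function and parameter names, are assumptions for illustration only.

```python
import math
from collections import Counter

def score_candidates(doc_tokens, doc_freq, n_docs):
    """Toy keyphrase scorer: rank candidate terms by a TF-IDF weight and by
    how early they first appear (earlier is better), two features that
    KEA-style systems feed into a learned model."""
    tf = Counter(doc_tokens)
    scores = {}
    for term, count in tf.items():
        # TF-IDF: frequent in this document, rare across the collection
        tfidf = (count / len(doc_tokens)) * math.log(n_docs / (1 + doc_freq.get(term, 0)))
        # Relative position of the first occurrence, in [0, 1)
        first_pos = doc_tokens.index(term) / len(doc_tokens)
        scores[term] = tfidf * (1.0 - first_pos)   # fixed combination (an assumption)
    return sorted(scores, key=scores.get, reverse=True)
```

In KEA-style systems a Naive Bayes learner weighs such features from training data; the hand-fixed product above merely shows what the features measure.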


The 2DR-tree: a two-dimensional spatial index

Wendy Osborn
Southern Alberta Digital Library, University of Lethbridge, Canada
Tuesday, 16th May 2006
A spatial index is used to search for and retrieve objects that exist in multi-dimensional space. A limitation of most spatial indices is their one-dimensional structure, which requires data from n-dimensional space to be forced to fit it. This leads to inefficient searching, both within individual nodes and across the structure as a whole.

The 2DR-tree overcomes this problem for data in two-dimensional space by fitting the data as given. By using a two-dimensional node structure, all relationships between objects are preserved. This allows different searching strategies (such as binary and greedy) to be supported. In addition, a validity test is employed during the insertion and deletion operations to ensure that all relationships between objects remain intact.

The 2DR-tree insertion, deletion, search and validation strategies will be presented. The results of a performance evaluation and some future research directions will also be presented.


Marsden Fund - lessons learned

Computer Science Staff
Department of Computer Science, The University of Waikato
Tuesday, 10th May 2006
This seminar is planned as a meeting of everyone who was involved in Marsden proposals, to exchange experiences. The idea is not to have a big moan about the situation - we can do this if people want - but to try to gain insight from the responses to the different proposals. Please ensure that at least one person from every proposal can attend. Further input from people who were involved in the committees would be most welcome.


An overview of the Toilers: an ad hoc networks research group

Tracy Camp
Colorado School of Mines, Colorado
Tuesday, 2nd May 2006
The Toilers are a unique group of staff, graduate students, and undergraduates who research ad hoc networks, specifically wireless sensor networks (WSNs) and mobile ad hoc networks (MANETs). These types of networks are defined by a lack of a fixed infrastructure, multi-hop communication, unreliable wireless links, and decisions made based on local knowledge. Overcoming these challenges presents several open research questions, such as energy-efficient routing, in-network processing, adaptive behavior, and security. Applications of WSNs and MANETs are diverse, and include environmental monitoring, structural monitoring, search and rescue, and tracking.

This talk will present highlights of past successes, current research challenges, and future directions of the Toilers. In the past, the Toilers have invented, implemented, and compared MANET protocols. We have shared simulation code developed by Toiler members with more than 559 researchers at 307 research labs/universities in 43 countries during the previous four years.

Presently, the Toilers are exploring projects which have an interdisciplinary theme. In conjunction with civil, environmental, and electrical engineering, we are developing WSNs to improve the geoconstruction process. Two distinct projects aim to break down the barriers between the network and link layers, and between the link and physical layers.

Looking forward, the Toilers plan to advance theory and practice. For example, we are defining effective practices to improve the confidence in simulation results of ad hoc networks. This talk will present an overview of the Toilers past, present, and future projects, while tying them to the themes of ad hoc network research.


Recent activities in the U.S. to reverse the incredible shrinking pipeline

Tracy Camp
Colorado School of Mines, Colorado
Monday, 1st May 2006
The number of Bachelor degrees awarded in Computer Science in the United States reached an all-time high in 2002-03 (57,439), and the trend of women earning a decreasing percentage of the Bachelor degrees awarded in CS appeared to have subsided. However, recent data suggests that in the near future the number of degrees awarded in CS will plummet, and one alarming prediction is that U.S. universities will graduate fewer than half of the candidates needed for IT jobs in the U.S. by 2012.

What impact will this abrupt change in CS departments have on the participation of women? Will the incredible shrinking pipeline continue to exist? For that matter, what is the incredible shrinking pipeline, and why does it exist in CS and not other science/ engineering fields? And, finally, does the incredible shrinking pipeline exist outside the United States?

I will answer these questions in this presentation. I will also discuss the recent activities of several organizations in the United States that exist to ensure women participate in IT, and I will detail one successful example of a university that dramatically reversed the incredible shrinking pipeline trend. Lastly, I will give suggestions on what you and your university might consider implementing to increase the participation of both men and women in computing.


Building the Library of Babel: Or, how to digitise the world's most interesting books so no-one can use them

George Buchanan
Future Interaction Technology Laboratory, University of Wales, Swansea
Tuesday, 11th April 2006
Vast resources are now being deployed to digitise volumes of historic material in the universities, museums, archives and libraries of the world. The intended readers of this material are often university academics and amateur historians. The digitisation of documents and creation of searchable indexes is being carried out with little regard to the needs and skills of these target users. The structure of books and volumes is seldom transferred into the digital representation, and mark-up of the content is usually ad-hoc and inconsistent.

Computer science could make a critical contribution to the real success of these projects - be it from human-computer interaction, software engineering, machine learning or information retrieval. Conversely, the digitised material is interesting in itself, but can readily be used to explore key challenges in computer science.

This seminar will give a few examples of the catastrophic problems of current projects, what computer science can offer, and what it can gain.


Socially aware software engineering for the developing world

William Tucker
Department of Computer Science, University of the Western Cape, South Africa
Tuesday, 4th April 2006
While the social effects of Information and Communication Technology (ICT) have received much attention, there is very little work on targeted methodologies for developing ICT applications and content in a developing-world environment. This talk describes a methodology called Socially Aware Software Engineering that we are formulating based on first-hand experience building ICT solutions in South Africa. Our method is based on a classical user-centred approach from Human-Computer Interaction, combined with aspects of Participatory Design and cyclical software engineering practices. These approaches are wrapped into an iterative Action Research paradigm in order to directly include the community-based users of our systems. The talk will outline two case studies based on our evolving method - a tele-health project in the rural Eastern Cape, and a Deaf Telephony project with a disadvantaged Deaf community in Cape Town - and show how our methodology has emerged from these experiences.


Congestion control

Ian McDonald
Department of Computer Science, The University of Waikato
Tuesday, 28th March 2006
Ian McDonald is currently undertaking a PhD at Waikato on congestion control, having previously spent 15 years in the workforce. This seminar will look at congestion control for networks. It will discuss TCP congestion control, the advancements for Linux in this area, and research being undertaken in WAND on a variant called TCP Nice. The Datagram Congestion Control Protocol (DCCP) will be discussed, along with the rationale for its existence and details of its implementation. Ian will then outline his current research into congestion control for real-time media applications and will discuss the interaction between the transport layer and the application layer.


What I did on my sabbatical: a computer science travelogue

Ian Witten
Department of Computer Science, The University of Waikato
Tuesday, 21st March 2006
During 2005 I embarked upon a grueling six months of globe-trotting, with extended periods in Italy, France, the US, and Africa. This is an illustrated account of some of the places I saw and some of the things I learned — more a travelogue than a seminar. I will talk briefly about a book I'm writing (with two Italians) on the difficult and contentious issues raised by centralized searching on the web. I'll mention biometric security and the talking face problem. I'll say a little (regrettably very little) about what's happening at Google New York. I'll explain why the International Criminal Tribunal for Rwanda is interested in Greenstone and what toasters have to do with Linux. But mostly I'll show a few pictures. Don't expect anything deep.


The history of CADiZ

Ian Toyn
Department of Computer Science, University of York, UK
Tuesday, 7th March 2006
CADiZ is a set of tools for manipulating Z specifications. It has been developed, on and off, since 1989. This talk will give a broad overview of the features of the CADiZ tool set. Formal methods persons may be interested in the range of features, while others may be interested in how those features are articulated through a simple (unconventional) user interface. Some features will be demonstrated, and hence some Z will appear, but the meaning of that Z will be irrelevant to the points being made.


Model-based testing techniques and application in several industrial areas

Mark Utting
Department of Computer Science, The University of Waikato
Tuesday, 28th February 2006
Model-based testing is a breakthrough innovation in the field of software testing because it completely automates the validation test process. Model-based testing tools automatically generate test cases from an unambiguous model of the software product, such as a precise UML model. This ensures a repeatable and scientific basis for product testing, ensures coverage of all the behaviors of the product, and allows tests to be linked directly to requirements. Intensive research on model-based testing over the last 5-10 years has demonstrated the feasibility and cost-effectiveness of this approach, and has produced a variety of prototype and commercial tools. This talk will make it possible for software designers, developers and test engineers to clearly understand the basic concepts of model-based testing, its cost-effectiveness, and how it can be used in large projects.
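The core idea - deriving test cases automatically from a behavioural model - can be sketched with a toy finite-state-machine "transition tour" generator. This is an illustrative sketch under assumed names and a dict-based FSM encoding, not any particular tool from the talk:

```python
from collections import deque

def transition_tours(fsm, start):
    """Generate test sequences from an FSM model so that every transition is
    exercised at least once: for each transition, take a shortest action path
    from the start state to its source state, then fire the transition.
    `fsm` maps each state to a list of (action, target_state) pairs."""
    def shortest_paths(src):                     # BFS over the model graph
        seen, queue = {src: []}, deque([src])
        while queue:
            s = queue.popleft()
            for action, target in fsm.get(s, []):
                if target not in seen:
                    seen[target] = seen[s] + [action]
                    queue.append(target)
        return seen

    paths_from_start = shortest_paths(start)
    tests = []
    for state, edges in fsm.items():
        for action, target in edges:
            if state in paths_from_start:        # skip unreachable transitions
                tests.append(paths_from_start[state] + [action])
    return tests

# A hypothetical vending-machine model yields one test per transition:
# transition_tours({"idle": [("coin", "paid")],
#                   "paid": [("choose", "vending"), ("refund", "idle")],
#                   "vending": [("done", "idle")]}, "idle")
```

Real model-based testing tools work from richer models (guards, data, UML state machines) and stronger coverage criteria, but the principle is the same: the test suite is computed from the model rather than written by hand.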


Data mining in analytical customer relationship management

Johannes Ruhland
Wirtschaftswissenschaftliche Fakultät, Friedrich-Schiller-Universität Jena
Tuesday, 21st February 2006
The talk reports on case studies in the context of analytical Customer Relationship Management (aCRM) providing a test bed for several data mining applications.

One case study is about a charitable organisation, where an exact and differentiated long-term evaluation of its existing base of funding partners is central to its sustained “profitability”. Accurate estimates are achieved through segmentation and Markov modelling.

Another study reports on Germany's largest magazine publisher and its customer base. We have employed segmentation tree algorithms to plan retention and acquisition measures. At present, we are exploring the potential of support vector machines (SVMs). Preliminary results show that a sufficiently large improvement over existing methods can only be achieved by applying an Evolutionary Algorithm upfront.

We shall also give sketches of further applications.

