Department of Computer Science
Tari Rorohiko

Computing and Mathematical Sciences

Seminars Presented in 2001


Suffix Vector: A Space-Efficient Suffix Tree Representation

Mr Krisztian Monostori
School of Computer Science and Software Engineering, Monash University
Monday, 17 December 2001
Suffix trees are very versatile data structures that are used in many areas of computing, including string matching, DNA sequence matching, and file compression. One of the main arguments against suffix trees is that they are greedy for space. Numerous attempts have been made to improve the space-efficiency of suffix trees, but most have resulted in representations tailored to specific problems.
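
The space complaint can be made concrete with a toy experiment (my own illustration, not part of the talk): a naive suffix trie stores one node per distinct substring, so it grows roughly quadratically with text length, and this redundancy is exactly what compact representations such as suffix trees and suffix vectors attack.

```python
# Illustration only: count the nodes of a naive (uncompressed) suffix
# trie to see the blow-up that motivates compact suffix tree
# representations. This is NOT the suffix vector data structure.

def suffix_trie_node_count(text):
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:                  # insert suffix text[i:]
            node = node.setdefault(ch, {})
    def count(node):
        return 1 + sum(count(child) for child in node.values())
    return count(root)

for text in ["banana", "mississippi", "abracadabra" * 3]:
    print(f"{len(text):3d} characters -> {suffix_trie_node_count(text):5d} trie nodes")
```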

The suffix vector representation is a general representation that can be used in any application area where a suffix tree is applicable. The suffix vector is more space-efficient than any other suffix tree representation of the same versatility.

Our data structure eliminates the redundancies present in other suffix tree representations, which not only saves space but also speeds up certain algorithms, because they do not have to analyse redundant parts of the tree.

Suffix trees and suffix vectors have been used efficiently in our document comparison prototype system, MatchDetectReveal (MDR). A short demonstration of this system will also be presented.

 

Bayes Nets (in Weka)

Dr Remco Bouckaert
Xtal Mountain Information Technology, New Zealand
Tuesday, 4 December 2001
Bayesian networks, also known as causal networks or graphical models, are a powerful formalism for handling uncertainty, with a firm foundation in probability theory. Applications range from expert systems, speech recognition, economic modelling, and encoding/decoding to Clippy, the irritating help facility in Microsoft Office.

This talk starts with an introduction to Bayes nets and their applications and shows how different network architectures generalize technologies like naive Bayes classifiers, hidden Markov models, and dependency networks.

The main focus of the talk will be on learning Bayes nets from data, a topic that becomes more and more relevant as the streams of data generated nowadays keep growing. There are various ways of learning Bayes nets: some are based on conditional independence tests, others on classical Bayesian approaches and their variations. The most important aspects will be highlighted, and details of some implementations in Weka pointed out.
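
To give a flavour of what "learning from data" means here, the sketch below estimates the conditional probability tables of a Bayes net by smoothed counting, for a small invented dataset and a hand-fixed structure; structure learning (by conditional independence tests or Bayesian scoring, as mentioned above) would sit on top of this step. This is a minimal illustration in Python, not the Weka implementation discussed in the talk.

```python
from collections import Counter

# Toy dataset of (cloudy, sprinkler, wet_grass) observations; invented.
data = [("y", "n", "y"), ("y", "n", "y"), ("n", "y", "y"),
        ("n", "n", "n"), ("y", "n", "n"), ("n", "y", "y")]

# Hand-fixed structure: cloudy -> sprinkler, (cloudy, sprinkler) -> wet_grass.
# Parameter learning for a fixed structure is just smoothed counting.
def cpt(child, parents, values=("y", "n"), alpha=1.0):
    joint, parent = Counter(), Counter()
    for row in data:
        key = tuple(row[i] for i in parents)
        joint[key + (row[child],)] += 1
        parent[key] += 1
    # Laplace-smoothed estimate of P(child = v | parents = key).
    return {key + (v,): (joint[key + (v,)] + alpha)
                        / (parent[key] + alpha * len(values))
            for key in parent for v in values}

print(cpt(1, [0]))     # P(sprinkler | cloudy)
print(cpt(2, [0, 1]))  # P(wet_grass | cloudy, sprinkler)
```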

 

Autonomous Mobile Robotics

Dr Dale Carnegie
Physics & Electronic Engineering Department, The University of Waikato
Tuesday, 13 November 2001
Since 1993 the Department of Physics and Electronic Engineering has produced a number of mobile robots, many of which are fully autonomous in their functioning. Autonomy, the removal of human involvement from the machine's functioning, presents numerous challenges, specifically in the integration of mechanics, electronics and software. Our early devices were limited to performing quite simple tasks by the capability of their embedded controllers. Our latest generation of robots contains full Pentium III-equivalent computer boards, which permit sophisticated investigation into imaging, artificial intelligence, neural networks, pattern identification, and sensor and automation technology.

This seminar will present an anthology of our robotic devices, particularly our mobile security device (MARVIN), the submersible Remote Operating Vehicle, the multi-terrain vehicle, our prototype hexapod walking robot, and our pair of cooperating robots. The expectation is that many opportunities exist for inter-departmental collaboration on a variety of related problems.

 

Situation Awareness in Emergency Ambulance Command and Control: Implications for Human-Systems Interaction

Dr William Wong
Department of Information Science, University of Otago
Tuesday, 6 November 2001
In this seminar, I will describe the nature of situation awareness and its importance to the real-time command and control of emergency ambulances in the London Ambulance Service. One key finding is that dispatchers have to constantly collate and integrate information from different sources, in different modalities, and over a period of time. This information hub strategy places significant cognitive demands on the dispatchers and raises a number of challenges for the design of human-systems interaction for command and control systems.

 

The LIDS Project: The application of Large Interactive Display Surfaces

Professor Mark Apperley, Mr Bill Rogers and Ms Beryl Plimmer
Department of Computer Science, The University of Waikato
Tuesday, 30 October 2001
The LIDS project is concerned with exploring the application of low-cost, large interactive display surface technology. Whiteboard-like display surfaces have been constructed using rear projection, with Mimio digitisers to capture pen action. Software is being developed to exploit these displays for teaching, meeting support, and personal information management. The notion of a "whiteboard metaphor" is being developed.

The seminar will present and demonstrate three aspects of the development: LLC (Lightweight Lecture Capture) which allows presentations using standard PowerPoint to be captured for re-delivery; FreeForm, software to support and capture the early design of forms in Visual Basic; and a package to allow handwritten input of mathematical expressions.

 

Using ideas from biology to solve the visual self-motion estimation problem

Dr John Perrone
Psychology Department, The University of Waikato
Tuesday, 16 October 2001
The successful manoeuvring of a person or vehicle through a cluttered environment requires information about possible obstacles (the layout) as well as the instantaneous motion of the observer and craft (heading direction and rotation). The latter 'self-motion' information is required to help decide whether any corrective motor inputs are needed for collision avoidance or for a change in the desired direction of travel. This navigational ability underlies many aspects of human behaviour (walking, running, driving, flying) as well as many machine-based applications (autonomous vehicles, robotics). The two-dimensional image motion that occurs on the back of the eye seems to be the main source of information used by humans and animals to determine their self-motion through the world.

We have developed a model that can account for many of the physiological and psychophysical aspects of visual self-motion estimation. It uses networks of direction- and speed-tuned input motion sensors to form detectors tuned to particular heading and rotation combinations. It can estimate heading from digital image sequences derived from a single moving camera and can generate a surface map of the scene immediately ahead. I will discuss the construction of this model as well as our attempts to build computer models of the 2-D motion sensors that make up the input stage. The heading detectors and the 2-D input sensors are based on the properties of motion-sensitive neurons found in the primate brain.
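
The template idea behind such heading detectors can be illustrated with a toy computation: for pure translation, image motion radiates from the focus of expansion (FOE), so a candidate heading can be scored by how well observed flow directions agree with the radial pattern centred on its FOE. The sketch below is an invented toy, far simpler than the model described in the talk.

```python
import numpy as np

# Toy "template" heading estimator. Each candidate heading (FOE) gets a
# detector that sums the cosine agreement between the observed flow and
# the radial template centred on that FOE. Invented illustration only.

rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(200, 2))            # image positions
true_foe = np.array([0.3, -0.1])
# Flow for pure translation: radial from the FOE, scaled by depth noise.
flow = (pts - true_foe) * rng.uniform(0.5, 2.0, size=(200, 1))

def detector_response(foe):
    radial = pts - foe
    num = (flow * radial).sum(axis=1)
    den = np.linalg.norm(flow, axis=1) * np.linalg.norm(radial, axis=1) + 1e-9
    return (num / den).sum()                       # cosine agreement

candidates = [np.array([x, y]) for x in np.linspace(-1, 1, 21)
                               for y in np.linspace(-1, 1, 21)]
best = max(candidates, key=detector_response)
print("estimated heading (FOE):", best)           # close to (0.3, -0.1)
```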

 

(The Futility of) Trying to Predict Carcinogenicity of Chemical Compounds

Dr Bernhard Pfahringer
Department of Computer Science, The University of Waikato
Tuesday, 2 October 2001
The Predictive Toxicology Challenge 2001, an attempt to predict carcinogenicity purely from structural information about chemical compounds, took place recently in conjunction with the European Conference on Machine Learning in Freiburg, Germany. In this talk I will give some background on the task and describe my submission to one of the sub-problems posed for the challenge. The WEKA machine learning workbench served as the core learning utility. Based on a preliminary examination of my submission, we can conclude that reliable prediction of carcinogenicity is, unfortunately, still a faraway goal.

 

IPMP: The IP (Internet Protocol) Measurement Protocol

Dr Tony McGregor
Department of Computer Science, The University of Waikato
Tuesday, 18 September 2001
Active measurement is an important measurement technique. Unfortunately, the protocols currently used to make active measurements have mostly not been designed for measurement. This seminar will introduce IPMP, the IP Measurement Protocol, which we have designed to support active measurement and to address some of the limitations of existing protocols. IPMP supports the measurement of network path and delay in a single packet exchange; the exchange of information about clocks, which allows one-way delay measurements to be made without an external time source; a router-friendly packet format, allowing delay measurement to routers; and DoS protection. The protocol also makes kernel-level timestamps easy to provide. IPMP has been implemented in the FreeBSD and Linux kernels and is currently in use in the NLANR AMP system, where it is being used to measure delay on approximately 15,000 paths, mostly on the vBNS and Abilene networks. This seminar introduces IPMP, its key features, and how it provides them. It will begin with a light-hearted tutorial review of Internet routing and existing active measurement techniques.
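
To make the single-packet-exchange idea concrete, here is a toy packet builder in which each forwarding point appends an address-and-timestamp record, so one echo exchange yields per-hop delays without an external time source at the receiver. The field layout below is entirely invented for illustration and is not the actual IPMP wire format.

```python
import struct, time

# Hypothetical layout, for illustration only (NOT the real IPMP format):
# a fixed header followed by "path records" appended en route.
HEADER = struct.Struct("!BBHI")   # version, flags, total length, sequence
RECORD = struct.Struct("!IQ")     # IPv4 address, nanosecond timestamp

def build_echo(seq):
    return HEADER.pack(1, 0, HEADER.size, seq)

def append_record(packet, addr, ts_ns):
    version, flags, length, seq = HEADER.unpack_from(packet)
    packet += RECORD.pack(addr, ts_ns)
    # Rewrite the header so the length field stays consistent.
    return HEADER.pack(version, flags, len(packet), seq) + packet[HEADER.size:]

pkt = build_echo(42)
pkt = append_record(pkt, 0x0A000001, time.time_ns())   # "router" 10.0.0.1
pkt = append_record(pkt, 0x0A000002, time.time_ns())   # "router" 10.0.0.2
print(len(pkt), "bytes,", (len(pkt) - HEADER.size) // RECORD.size, "path records")
```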

 

On the Utility of Global Representations in a Cognitive Map

Dr Margaret Jefferies
Department of Computer Science, The University of Waikato
Tuesday, 11 September 2001
There are two dominant theories for how an agent should represent the spatial map (termed a cognitive map) it constructs for the places it visits. In one, a single global coordinate system is used to represent the whole traversed environment; in the other, each individual space visited is represented by its own local coordinate system. In the latter, the individual spaces are connected in the way they are experienced, forming a topological map of the agent's environment. Global representations are popular because with them the agent can detect that it is re-entering a part of the environment simply from its location in the global cognitive map. However, complex error-correction procedures are required to keep the agent's location in the physical environment aligned with its location in its cognitive map. Errors are less likely to accumulate in topological representations, but the agent cannot easily recognise that it is revisiting a place it has been to before if the place is approached from a different side. In this seminar I describe an approach to mapping the agent's spatial environment that combines a global and a topological map, exploiting the advantages of each representation. I show how a small global memory, which provides a window onto the most recently visited places stored in the topological map, can be used in the recognition process. I also relate this work to similar processes found in the path integration system of animals.
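
The combination described above can be sketched as a data structure: local spaces linked topologically as they are experienced, plus a small global window over the most recently visited places that supports recognition. Everything below (the names, the distance test, the window size) is an invented illustration, not the system described in the talk.

```python
from collections import deque

class LocalSpace:
    """One visited space, with its own local coordinate frame."""
    def __init__(self, name):
        self.name = name
        self.neighbours = {}              # exit label -> LocalSpace

class HybridMap:
    def __init__(self, window=5):
        self.spaces = {}
        # Small global memory: only the most recent places carry a
        # position in a shared frame, so alignment error stays bounded.
        self.recent = deque(maxlen=window)   # (space, global_xy)

    def visit(self, name, via=None, exit_label=None, global_xy=(0.0, 0.0)):
        space = self.spaces.setdefault(name, LocalSpace(name))
        if via is not None:
            via.neighbours[exit_label] = space   # topological link
        self.recent.append((space, global_xy))
        return space

    def maybe_revisit(self, global_xy, tol=1.0):
        """Use the global window to recognise re-entry from a new side."""
        for space, xy in self.recent:
            if abs(xy[0] - global_xy[0]) + abs(xy[1] - global_xy[1]) < tol:
                return space
        return None

m = HybridMap()
hall = m.visit("hall")
room = m.visit("room", via=hall, exit_label="north door", global_xy=(0.0, 4.0))
print(m.maybe_revisit((0.2, 3.9)).name)   # recognised as "room"
```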

 

Not Globalisation: against software hegemony

Dr Bob Barbour and Colleagues
Department of Computer Science, The University of Waikato
Tuesday, 4 September 2001
Internationalisation of software products is still in the 'very hard' basket. Enormous effort is currently being put into establishing a boundary between an internationalised application and its locally produced, culturally dependent localisation component.

Observation of current practices has led to disturbing realisations that something is fundamentally wrong. (Alvin Yeo talks about issues arising from his DPhil research.)

Software engineering theory takes for granted the interchangeability of user interfaces at the level of sounds and marks. Failure to address issues at this level has led to increasing technological imperialism. Software practice at an international level has enshrined these outcomes in methodologies referred to as i18n (internationalisation) and L10n (localisation). (Bob Barbour talks about the relationship between models, software, cultures and people.)

Keith Hopper identifies relationships among these aspects of computer science and discusses possible solutions. Solutions require both further research and the incorporation of appropriate pedagogy, together with tools and techniques, at all levels of software engineering education.

 

The relationship between Hidden Markov Models and PPM compression

Stuart Yeates
Department of Computer Science, The University of Waikato
Wednesday, 29 August 2001
We have been using PPM compression models to solve the generalised tag insertion problem for some time, but have only just started to look at the relationship between PPM and HMMs.

Establishing the relationship between PPM models and HMMs will potentially allow us to leverage 30 years of HMM research in our work on PPM models. In particular, we look at re-estimation and entropy-based node merging and splitting.
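
For readers unfamiliar with PPM, the sketch below builds order-k context counts and falls back to shorter contexts for unseen symbols, roughly in the spirit of PPMC; each context can be viewed as a state in a Markov model, which is the kind of correspondence at issue here. It is a simplified illustration (it omits the proper product of escape probabilities), not the system used in our work.

```python
from collections import defaultdict, Counter

class TinyPPM:
    """Roughly PPMC-style prediction: the escape weight in a context is
    the number of distinct symbols seen there. Illustration only."""
    def __init__(self, order=2):
        self.order = order
        self.counts = defaultdict(Counter)   # context string -> symbol counts

    def train(self, text):
        for i, ch in enumerate(text):
            for k in range(self.order + 1):
                if i >= k:
                    self.counts[text[i - k:i]][ch] += 1

    def prob(self, context, symbol):
        # Fall back from the longest matching context to shorter ones.
        # (A full PPM would multiply escape probabilities along the way.)
        for k in range(min(self.order, len(context)), -1, -1):
            c = self.counts[context[len(context) - k:]]
            if symbol in c:
                return c[symbol] / (sum(c.values()) + len(c))
        return 1e-6   # crude floor for never-seen symbols

m = TinyPPM(order=2)
m.train("the theatre and the theory")
print(m.prob("th", "e"))   # high: 'e' always follows "th" in this text
```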

 

A Simple Approach to Ordinal Classification

Dr Mark Hall
Department of Computer Science, The University of Waikato
Tuesday, 28 August 2001
Machine learning methods for classification problems commonly assume that the class values are unordered. However, in many practical applications the class values do exhibit a natural order—for example, when learning how to grade. The standard approach to ordinal classification converts the class value into a numeric quantity and applies a regression learner to the transformed data, translating the output back into a discrete class value in a post-processing step. A disadvantage of this method is that it can only be applied in conjunction with a regression scheme.
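
The standard approach just described is easy to make concrete: map the ordered class labels to integers, fit any regression scheme, and round predictions back to the nearest class. The sketch below uses plain least squares on invented data; it shows the naive baseline, not the method presented in the talk.

```python
import numpy as np

classes = ["fail", "pass", "merit", "distinction"]   # ordered class values
to_num = {c: i for i, c in enumerate(classes)}

# Invented training data: one feature (an exam score) per instance.
X = np.array([10.0, 35.0, 50.0, 62.0, 78.0, 93.0])
y = np.array([to_num[c] for c in
              ["fail", "fail", "pass", "pass", "merit", "distinction"]])

# Fit ordinary least squares on the numeric class targets.
A = np.column_stack([X, np.ones_like(X)])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(score):
    raw = float(np.array([score, 1.0]) @ w)
    # Post-process: round and clamp back to a discrete class value.
    return classes[min(max(int(round(raw)), 0), len(classes) - 1)]

print(predict(85.0))
```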

This talk presents a simple method that enables standard classification algorithms to make use of ordering information in class attributes. By applying it in conjunction with a decision tree learner we show that it outperforms the naive approach, which treats the class values as an unordered set. Compared to special-purpose algorithms for ordinal classification our method has the advantage that it can be applied without any modification to the underlying learning scheme.

 

Knowledge and Data in Computational Biological Discovery

Dr Pat Langley
Institute for the Study of Learning and Expertise, Palo Alto, California, and Stanford University
Thursday, 23 August 2001
The growing amount of biological data has led to the increased use of computational discovery methods to understand and interpret these data. However, most work has relied on knowledge-lean techniques like clustering and classification learning, and it has relied on formalisms developed in AI or statistics, so that the results seldom make direct contact with current theories of biological processes. In this talk, I describe an approach to computational discovery that incorporates knowledge in the form of an existing process model, utilizes data to refine this model, and casts the result in terms familiar to biologists. I illustrate this approach to biological knowledge discovery in two domains. One involves improving a quantitative model of the Earth's ecosystem using environmental data from satellites and ground stations. The other effort focuses on constructing metabolic and regulatory models for simple organisms using temporal data from DNA microarrays. Initial results suggest that this method of combining data with knowledge can improve predictive ability while ensuring that the revised models remain communicable to human biologists.

This talk describes joint work with V. Brooks, J. Cross, T. Grenager, S. Klooster, A. Pohorille, C. Potter, S. Sage, K. Saito, M. Schwabacher, J. Shrager, and A. Torregrosa.

 

Collaborative and Multi-Paradigm Programming for Children

Tim Wright
Department of Computer Science, University of Canterbury
Monday, 20 August 2001
There has been much research into programming environments for children, with computer scientists building many varied environments to support programming. There has also been much research into how to build environments that support collaborative learning. Despite these technical achievements, there has been a lack of investigation into the fundamental activities of programming, and a surprising lack of empirical evaluation of the effect of computer-supported collaboration on learning.

This is a talk of two halves. First, we present a task-based analysis of learner programming environments. The decomposition shows that many environments do not support all of the fundamental activities of programming well. Second, we report on an empirical evaluation of computer-supported collaborative learning. The evaluation found that collaboration can affect performance, but did not significantly affect learning. We will also talk about our ongoing and future work integrating these areas.

 

Architectural Support for Synchronization

Professor James R. Goodman
Department of Computer Science, University of Wisconsin - Madison
Friday, 17 August 2001
The challenge for the computer architect is to turn ever more transistors into better computers. But architectural innovation can sometimes make programming easier as well as improving performance. I will describe some recent developments that improve the performance of parallel applications by exploiting hardware already largely present in modern processors. Surprisingly, these methods reduce the burden on the programmer, making it easier to write correct, efficient parallel programs.

 

Search behavior in a research-oriented digital library

Dr Sally Jo Cunningham
Department of Computer Science, The University of Waikato
Tuesday, 14 August 2001
This talk presents results from a transaction log analysis of ResearchIndex, a digital library for computer science researchers. ResearchIndex is an important information resource for members of this target group, and the collection sees significant use worldwide. Queries from over six months of usage were analyzed, to determine patterns in query construction and search session behavior. Where appropriate, these results are compared to earlier studies of search behavior in two other computing digital libraries.

 

Happy Birthday NZDL

Dr David Bainbridge and Colleagues
Department of Computer Science, The University of Waikato
Tuesday, 31 July 2001
To mark the New Zealand Digital Library Project's sixth birthday, the group will give a presentation reviewing its past, present and future. Best known through its open source, freely available Greenstone software (see www.nzdl.org for more details) the group's strategy is to back up research—where possible—with software tools that demonstrate in tangible ways the usefulness of the research. Work spans numerous fields including compression and indexing, computer supported collaborative work, data and text mining, distributed protocols, ethnography, human computer interaction, image processing, information retrieval, machine learning, and musicology. As part of the talk sample work drawn from these areas will be presented.

 

Music Information Retrieval: Thoughts about Future Directions

Dr J. Stephen Downie
Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Wednesday, 25 July 2001
Interest in Music Information Retrieval (MIR) has been exploding over the last three years. During this time I have had the great fortune to come into contact with many of the research teams currently exploring MIR research, development and evaluation. These teams represent backgrounds ranging from musicology, law, and librarianship to audio engineering, computer science and the neural sciences. Each team brings to the MIR problem space its own paradigms, goals and research values. This talk will informally explore this wide range of viewpoints with an eye toward discerning what the future of MIR research might look like in both the near and the long term.

 

Some classification problems in forensic science

Dr James M. Curran
Department of Statistics, The University of Waikato
Tuesday, 24 July 2001
In recent times there has been considerable public attention focused on forensic evidence. Where once everyone expected fingerprint evidence, now the expectation is that DNA evidence will be found. When forensic evidence is found and taken to trial, the questions of relevance and weight arise. That is, "Is the evidence relevant to the case?" and "If it is relevant, is it strong evidence or weak evidence?" To answer these questions, experts, including statisticians, are often asked to testify.

In this talk I will discuss some of the case work I have been involved in, and some problems that I think are of particular interest to computer scientists.

Warning: There will be a small number of extremely graphic photographs presented. These may disturb some people.

 

An analysis of Operation Refinement for Z Specifications

Moshe Deutsch
Department of Computer Science, University of Essex
Tuesday, 17 July 2001
Operation refinement, which underlies data refinement for Z specifications, lays a substantial foundation for program development using the transformational software process model. We investigate various approaches to operation refinement, using the Z logic and mathematical framework due to Henson and Reeves. This includes an investigation of the relationship between proof-theoretic and model-theoretic refinement theories, as well as a deeper investigation motivated by the monotonicity properties of the various refinement theories for Z.

 

A gentle introduction to refinement

Martin Henson
Department of Computer Science, University of Essex
Tuesday, 10 July 2001
Refinement is a relationship between an abstract specification and a more concrete specification, leading ultimately to something concrete enough to be executed. There are various ways in which refinement can be precisely characterised, all of which are more or less intuitive. I will review (some of) these intuitions and indicate how the different notions of refinement they lead to are, in fact, equivalent.
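
For a taste of one such characterisation, here is the textbook formulation for operations described by pre- and postconditions (a standard result, not anything specific to this talk): C refines A when C is applicable wherever A is, and every result C can produce is acceptable to A.

```latex
% Classical operation refinement conditions: C refines A (A \sqsubseteq C)
% iff C applies wherever A does, and C's outcomes are acceptable to A.
\begin{align*}
  \text{(applicability)} &\quad \mathrm{pre}\,A \implies \mathrm{pre}\,C\\
  \text{(correctness)}   &\quad \mathrm{pre}\,A \land \mathrm{post}\,C \implies \mathrm{post}\,A
\end{align*}
```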

 

Active Directory and the SCMS Windows 2000 Project

Harry Johnston, Dax Bunce and Roger Thomas
SCMS TSG, University of Waikato
Tuesday, 3 July 2001
This seminar will give an overview of the SCMS Windows 2000 migration project – where we started from, where we are, and where we are going – as well as a brief overview of the Active Directory technology itself.

 

The Z-lambda project

Dr Steve Reeves
Department of Computer Science, The University of Waikato
Tuesday, 26 June 2001
In this overview seminar, I will be talking about the work of the Z-lambda project, based in the Formal Methods lab here at the University of Waikato. This project has three main aspects: the use of micro-charts for modelling reactive systems; the extension of Z to include refinement, so that Z models can be transformed into implementations while preserving correctness; and the reflection of these extensions to Z in the micro-charts language (via the Z semantics for micro-charts that we have developed). The talk will describe a mixture of work already completed, our current work, and our plans for the future.

 

Mobile Internet Usability

George Buchanan
School of Computing Science, Middlesex University
Tuesday, 5 June 2001
Recent years have seen a rapid increase in the use of small-screen devices, and recently it has been suggested that it is possible to access Internet services, particularly the Web, using even the very small screens of mobile phones.

At Middlesex University, we have undertaken a number of studies of the behaviour of users when faced with the challenge of using information systems on small displays. Other groups have studied the same problem, starting in the mid-1980s.

The history and outcomes of earlier work and our latest studies show a developing pattern in the challenges, problems and successes of small-screen usability, especially in the current context of the internet.

 

Pace regression: An illustrative introduction

Dr Yong Wang
Department of Computer Science, University of Waikato
Tuesday, 29 May 2001
Pace regression is a new approach to fitting linear models, proposed in my recently completed PhD project. It addresses how to estimate parameter values for the purpose of optimal prediction, with the model dimensionality determined as a natural by-product. In this talk, I will give an illustrative introduction to pace regression, focusing on the basic ideas behind the mathematical details; present some experimental results to indicate how it performs in practice; and discuss briefly its general implications for empirical modelling.

 

Customised Information Delivery: The Meeting of Information Retrieval, Virtual Documents and Discourse Modelling

Dr Francois Paradis
CSIRO Mathematical and Information Sciences, Melbourne, Australia
Thursday, 24 May 2001
One of the strengths of electronic information lies in the ability to adapt content according to a user or context. There are numerous examples of such customisation on the Web, in the form of virtual documents and adaptive hypertexts. In this talk I will discuss two projects in which we take a multi-disciplinary approach to customisation, drawing on the fields of information retrieval, virtual documents and discourse modelling.

In the first project, Taylor, our aim is to use document structure to deliver passages (rather than document pointers) in answer to a query, and to combine these passages to form a coherent answer. A prototype has been built around this idea, and some of our industrial partners have started to show interest in this technology (notably Lonely Planet, the travel guide company). I will give a brief demonstration of Taylor with a collection of Java-related documents.

The second project, Tiddler, has the more ambitious aim of building answers using discourse models and user profiles. There are several research questions relevant to this problem: how to model users (acquisition, scrutability of the model, etc.), how discourse will improve delivery, how to evaluate, and so on. We are now finishing a first implementation and hope to conduct experiments soon.

 

Spatial Hypertext for Digital Libraries

George Buchanan
School of Computing Science, Middlesex University
Tuesday, 22 May 2001
Spatial Hypertext was introduced in the mid-1980s as a highly user-centred medium for creating organised structures for texts. As such, spatial hypertext systems would seem good candidates as reader tools in digital libraries.

However, the classic examples of Spatial Hypertext, such as VIKI, have been at best loosely connected to information systems such as databases or the internet, and have required little or no user feedback. They therefore cannot be adopted blindly into a digital library workflow, nor into a digital library infrastructure.

 

Mondrian, An Experimental Functional Language For OO Environments

Dr Nigel Perry
IIST, Massey University
Thursday, 17 May 2001
The talk will give an overview of Mondrian, a functional language designed to interwork well with OO environments such as the JVM and .NET. To fit the OO model, Mondrian uses classes and subtyping for data types, rather than the sum types traditional in functional languages. From functional languages it provides parametric polymorphism, non-strict evaluation, monadic I/O, and so on. Mondrian code can be called directly from OO languages such as Java and C#. Brief mention will also be made of our work on Haskell and of future directions.
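
The design point about data types can be illustrated independently of Mondrian's syntax: a functional sum type becomes an abstract class with one subclass per constructor, which is what makes the types directly consumable from OO languages. The sketch below uses Python purely for illustration; it is not Mondrian code.

```python
# A functional sum type, List = Nil | Cons, recast as classes plus
# subtyping, the style Mondrian adopts for OO interworking.

class List:                          # the "sum type" itself
    pass

class Nil(List):                     # constructor with no fields
    pass

class Cons(List):                    # constructor with two fields
    def __init__(self, head, tail):
        self.head, self.tail = head, tail

def length(xs):                      # "pattern matching" via type tests
    return 0 if isinstance(xs, Nil) else 1 + length(xs.tail)

print(length(Cons(1, Cons(2, Nil()))))   # -> 2
```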

 

Learning structure from sequences, with applications in a digital library

Professor Ian H. Witten
Department of Computer Science, The University of Waikato
Tuesday, 15 May 2001
The services that digital libraries provide to users can be greatly enhanced by automatically gleaning certain kinds of information from the full text of the documents they contain. This seminar will review work done by the Machine Learning and Digital Libraries groups at Waikato over the past few years that uses novel techniques of machine learning (broadly interpreted) to extract information from plain text. Three areas of research will be described: hierarchical phrase browsing, including efficient methods for inferring a phrase hierarchy from a large corpus of text; text mining using adaptive compression techniques, including a new approach to word segmentation, generic entity extraction, and acronym extraction; and keyphrase extraction and its application in a digital library.

 

.africa - avoiding the potholes

Dr Gary Marsden
Department of Computer Science, University of Cape Town
Tuesday, 8 May 2001
Africa has always been a continent with a huge, but unrealised, potential. African leaders, however, are touting this as the "African Century" and are looking to Information Technology to bring about an African Renaissance. Within the CS department at the University of Cape Town, we are investigating the human and technological problems which stand in the way of this vision. In this talk, we will present some of the work we have carried out and some of the unexpected results which have led us to believe the digital divide may be a little shallower in places.

 

Human Evaluation of Kea, an Automatic Keyphrasing System

Dr Steve Jones
Department of Computer Science, The University of Waikato
Tuesday, 1 May 2001
This talk describes an evaluation of the Kea automatic keyphrase extraction algorithm. Tools that automatically identify keyphrases are desirable because document keyphrases have numerous applications in digital library systems, but are costly and time-consuming to assign manually. Keyphrase extraction algorithms are usually evaluated by comparison to author-specified keywords, but this methodology has several well-known shortcomings. The results presented here are based on subjective evaluations of the quality and appropriateness of keyphrases by human assessors, and make a number of contributions. First, they validate previous evaluations of Kea that rely on author keywords. Second, they show that Kea's performance is comparable to that of similar systems that have been evaluated by human assessors. Finally, they justify the use of author keyphrases as a performance metric by showing that authors generally choose good keywords.

 

CustomObjects: a model-oriented end-user programming environment

Dr Robert Aish
Bentley Systems
Wednesday, 11 April 2001
Computer-based design tools are intended to be open-ended systems with which innovative designers can construct geometric and engineering models of new artifacts and environments. The user interfaces of these design tools are often based on preconceptions about the tasks and workflow of specific design and engineering disciplines, and consequently may expose only a fraction of the geometric and computational possibilities inherent in the underlying foundation software. This more extensive range of possibilities is available only to those users who are prepared to create their own end-user application software, and who have the necessary skills to do so. However, the resource constraints imposed within end-user organisations usually mean that this type of software can only be developed under intense time pressure in the course of normal engineering projects. The functionality and usability of the end-user programming tools is therefore a critical factor in unlocking the full potential and value of the total system. Given that the creative design process is by its very nature "extensible", it seems more appropriate to consider a computer-based design tool as a (visual) programming system embedded within a high-performance geometric modelling and visualisation framework.

This presentation explores some of the design issues involved in the development of a model-oriented end-user programming environment, and describes such a system, called Custom Objects. The system combines direct-manipulation design methods, based on feature modelling and constraints, with visual and traditional programming techniques.

 

Using feedback to improve Optical Music Recognition

John McPherson
Department of Computer Science, The University of Waikato
Tuesday, 10 April 2001
While the steps taken in automatically processing and recognising sheet music are well understood, they are normally applied in a discrete, fixed sequence. However, the later stages of the process reveal information that could be used to improve the accuracy of the earlier steps. This seminar gives a brief introduction to optical music recognition and discusses how feedback might be used to improve the quality of the results.

 

Memory Hierarchies as a Metaphor for Academic Library Collections

Stuart Yeates
Department of Computer Science, The University of Waikato
Tuesday, 10 April 2001
Research libraries and their collections are a cornerstone of the academic tradition, representing 2000 years of development of Western civilisation; they make written history widely accessible at low cost. Computer memories are a range of physical devices used for storing digital information; they have undergone much formal study over 40 years and are well understood. I draw parallels between the organisation of research collections and computer memories, in particular examining their hierarchical structure, and examine the implications for digital libraries.

 

Mining Dates from Historical Documents

Dana McKay
Department of Computer Science, The University of Waikato
Tuesday, 10 April 2001
The essential quality of information in a digital library is accessibility. Full-text search is not enough for some collections; more can be done. Historical collections, for example, contain dates, and it would be useful for historians to be able to search by them. However, these dates may occur anywhere within the text of historical documents, and to be searched they must be extracted from the documents and integrated into the collection index. Doing this manually is very expensive; described here is a system that does it automatically. The system was implemented within the Greenstone framework used by the New Zealand Digital Library, and involved the use of some carefully designed heuristics.
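
As a flavour of the kind of heuristic such a system might use, the sketch below pulls day-month-year dates and bare years out of free text; the patterns and the accepted year range are invented for illustration and are far simpler than the heuristics described in the talk.

```python
import re

MONTH = (r"(January|February|March|April|May|June|July|August"
         r"|September|October|November|December)")
# e.g. "14 August 1891"
FULL_DATE = re.compile(r"\b(\d{1,2})\s+" + MONTH + r"\s+(\d{3,4})\b")
# Heuristic: treat 1000-1999 as plausible bare years in this toy.
BARE_YEAR = re.compile(r"\b(1[0-9]{3})\b")

def extract_dates(text):
    dates = [(int(d), m, int(y)) for d, m, y in FULL_DATE.findall(text)]
    years = sorted({int(y) for y in BARE_YEAR.findall(text)})
    # A real system must reconcile overlaps: 1891 above is found twice.
    return dates, years

text = "The settlers arrived on 14 August 1891; by 1905 the town had grown."
print(extract_dates(text))
```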

 

