
Department of Computer Science
Tari Rorohiko

Computing and Mathematical Sciences

Seminars Presented in 2002


Colliding Fronts: Using XML and the TEI to build a full Text Digital Library at the New Zealand Electronic Text Centre

Ms Elizabeth Styron
New Zealand Electronic Text Centre, Victoria University
Tuesday, 17 December 2002
The New Zealand Electronic Text Centre (NZETC) at Victoria University of Wellington enjoys the benefits of a diverse academic community and skill set to help it build an ever-growing online collection of New Zealand materials. To deliver its full-text, fully searchable documents in multiple formats, the NZETC combines the power of XML with the rich metadata guidelines of the Text Encoding Initiative (TEI). The University of Virginia Etext Center, which helped found the NZETC, mastered the process of delivering full texts online independent of proprietary software, with a clear view toward repurposing the materials in future as technologies emerge. The New Zealand Electronic Text Centre continues that tradition with a heightened focus on exploiting the flexibility of XML and TEI and a greater emphasis on building a regional collection.

About Elizabeth Styron: Elizabeth graduated from Duke University with a degree in English which included studies in computer programming. She holds a Master of Fine Arts from the University of Virginia, where she was an assistant director and programmer analyst at the University of Virginia Electronic Text Center. She managed XML compliance checking for the Center in 1999 and went on to oversee several XML-based humanities computing projects. As the National Endowment for the Humanities Fellow at the Center, Elizabeth managed XML work for the 600-volume Early American Fiction digitisation project (available through ProQuest). After helping to convert and launch Virginia's free E-book collection (7 million books downloaded to date), Elizabeth travelled to New Zealand as a 2001 United States Fulbright Fellow. Last year she founded the NZETC at Victoria University with a mission to digitise and generate full-text XML versions of New Zealand archival materials for multiple-format delivery on the Internet. The NZETC works closely with several New Zealand institutions on their conversion and digitisation concerns, and Elizabeth lectures frequently on digital library activities here and abroad.

 

Reo Maori kei te Ipurangi 2002 - Findings of a recent Maori Language Web Survey

Mr Te Taka Keegan
Department of Computer Science, The University of Waikato
Thursday, 12 December 2002
Recent statistics suggest that as many as 63.5% of people on-line use a language other than English. In New Zealand, more than half (52.7%) of the population currently access the Internet. What, then, does the Internet, or more specifically the World Wide Web, have available in the Maori language? In this seminar I will present findings from an extensive search and analysis of web sites and pages in the Maori language, including the number of sites and pages, the purposes and themes of sites, levels of language support, who is making an outstanding contribution, and who isn't but perhaps should be.

 

Spatial Memory in 2D and 3D Physical and Virtual Environments

Dr Andy Cockburn
Department of Computer Science, University of Canterbury
Tuesday, 3 December 2002
User interfaces can improve task performance by exploiting the powerful human capabilities for spatial cognition. This opportunity has been demonstrated by many prior experiments. It is tempting to believe that providing greater spatial flexibility by moving from flat 2D to 3D user interfaces will further enhance user performance. This seminar describes a series of experiments that investigate the effectiveness of spatial memory in real-world physical models and in equivalent computer-based virtual systems. The different models vary the user's freedom to use depth and perspective in spatial arrangements of images representing web pages. Results show that the subjects' performance deteriorated in both the physical and virtual systems as their freedom to locate items in the third dimension increased. Subjective measures reinforce the performance measures, indicating that users found interfaces with higher dimensions more 'cluttered' and less efficient.

 

Grounding Linguistic Structure in a model of Sensorimotor Cognition

Dr Alistair Knott
Department of Computer Science, University of Otago
Friday, 22 November 2002
In this talk, I'll explore the idea that the syntactic structure of a sentence can be thought of as a description of the perceptual and motor processes through which a human agent interacts with the world. According to this idea, for instance, the syntactic structure of the sentence 'There is a cat in the garden' can be derived from a psychological account of the perceptual process by which an observer comes to learn that there is a cat in the garden. There is considerable psychological evidence to suggest that a perceptual process such as this one must be decomposed into a number of independent and interacting sub-processes of attention-direction and categorisation: I will argue that this decomposition can be related very directly to the compositional structure of sentences, particularly if we adopt a conception of phrase structure broadly within the tradition of generative grammar.

 

Greenstone in Practice

Professor Ian Witten
Department of Computer Science, The University of Waikato
Wednesday, 20 November 2002
Greenstone has been used to make many digital library collections. Some were created within the NZDL as demonstration collections. However, the use of Greenstone internationally is growing rapidly, and several web sites show collections created by external users. Most contain unusual and interesting material, and present it in novel and imaginative ways. This paper briefly reviews a selection of Greenstone digital library sites to give a feeling for what public digital libraries are being used for. Examples are given from different countries: China, Germany, India, Russia, UK, US; different kinds of library: historical, educational, cultural, research; different sorts of source material: text, document images, pictures.

 

Formal Object-oriented Specification in Standard Z

Mr Shaochun Wang
Department of Computer Science, The University of Waikato
Tuesday, 19 November 2002
Z is one of the more popular formal specification languages used to help achieve correct software. A formal specification in Z is unambiguous and analysable, and can be proven to fulfil its requirements. Object-oriented techniques are successful in the production of large, complex software systems and offer a conceptual consistency across all stages of software development. We have developed an approach that specifies an object type as a given set and inheritance as object type subsetting, so that an object-oriented design can be formally specified in standard Z. This seminar will explain the philosophy and style of forming a formal object-oriented specification in standard Z.
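
To give a flavour of the approach, the sketch below (with invented class names, not the talk's own example) shows how object types can be introduced as given sets and inheritance expressed as subsetting in ordinary Z.

```latex
% A minimal sketch with invented classes Vehicle and Car, not the talk's specification.
% Object identities come from a given set OBJECT; each object type denotes a
% subset of OBJECT, and inheritance is expressed as subsetting of object types.
\[
  [\,OBJECT\,] \qquad
  Vehicle \subseteq OBJECT \qquad
  Car \subseteq Vehicle
\]
% Any operation specified over Vehicle therefore also applies to every Car,
% which recovers inheritance inside standard (non-object-oriented) Z.
```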

 

Complete Refinement Rules for Microcharts

Mr Greg Reeve
Department of Computer Science, The University of Waikato
Tuesday, 12 November 2002
mCharts (MicroCharts) is a specification language for reactive systems. Specifications are based on finite state automata with the addition of hierarchical and compositional structuring mechanisms. A notion of refinement exists for mCharts along with rules for calculating possible refinements. Although these rules are sound, i.e. any refinement constructed by the rules is a valid refinement, they are not complete, i.e. given two arbitrary mCharts specifications it is not, in general, possible to show whether or not one is a refinement of the other. This seminar will present work on constructing a complete calculus for the refinement of mCharts.
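
For readers unfamiliar with the terminology, soundness and completeness of a refinement calculus can be stated roughly as below; this is a generic formulation in which the symbols denote an abstract refinement relation and derivability in the calculus, not the talk's specific mCharts definitions.

```latex
% Generic statement only: \sqsubseteq denotes "is refined by" and \vdash denotes
% derivability using the refinement rules. Not the mCharts-specific definitions.
\[
  \text{Soundness:}\quad \vdash A \sqsubseteq C \;\Rightarrow\; A \sqsubseteq C
  \qquad\qquad
  \text{Completeness:}\quad A \sqsubseteq C \;\Rightarrow\; \vdash A \sqsubseteq C
\]
```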

 

Groups Acting on Sets - A Powerful Approach to Content Based Retrieval

Mr Andreas Ribbrock
Department of Computer Science, University of Bonn
Tuesday, 5 November 2002
When using string-matching approaches to search a database of polyphonic scores, one encounters the problem of how to model polyphony with strings. As there is no canonical way of doing this, we propose a different approach to overcome this problem. Instead of using strings, we model a polyphonic score as a set of notes. To search for a query set of notes in a large database, we developed an indexing technology based on the theory of groups acting on sets and on inverted lists. We found that this approach might also be used for searching in large CD-audio or image databases. After giving an introduction to the basics of our indexing technology, a few software tools will be presented. First, we developed a tool for searching a database of more than 12,000 polyphonic scores. Second, a query-by-whistling application has been developed to find out which kinds of fault tolerance could be added to our indexing technology to cope with whistled input. The third tool is a prototype implementation of a monitoring application: incoming audio data is sampled and segments of a few seconds are used as query inputs. The software is able to tell the user in real time what song is currently being played and when it has been played before. I will close my talk by presenting some pictures from Bonn, the former capital of Germany.
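
The toy sketch below illustrates the note-set view of matching under transposition: a score is a set of (onset, pitch) pairs, the group of time and pitch translations acts on such sets, and a query matches wherever some translation maps it entirely into the score. The data and function are invented for illustration and are not the Bonn group's indexing implementation.

```python
from collections import defaultdict

# A score is modelled as a set of (onset_time, pitch) pairs; this one contains
# a two-note motif at the start and a transposed copy two beats later.
score = {(0, 60), (0, 64), (1, 67), (2, 69), (3, 72)}
query = {(0, 64), (1, 67)}   # the fragment we want to locate, possibly transposed

def exact_matches(score, query):
    """Return all (time_shift, pitch_shift) translations under which every
    query note lands on some score note.  This is the group of time/pitch
    translations acting on note sets: a match is a group element mapping the
    query set into the score set."""
    votes = defaultdict(int)
    for (qt, qp) in query:
        for (st, sp) in score:
            votes[(st - qt, sp - qp)] += 1
    return [shift for shift, count in votes.items() if count == len(query)]

# Finds the motif at its original position (0, 0) and transposed up a fourth,
# two beats later (2, 5).
print(exact_matches(score, query))
```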

 

Measurement and Analysis of One-Way Internet Packet Dynamics

Mr Matthew Luckie
Department of Computer Science, The University of Waikato
Tuesday, 29 October 2002
I present the IP Measurement Protocol (IPMP), designed by Tony McGregor and implemented by myself. The protocol is powerful in that it allows a large amount of information about the path to be collected in a single packet exchange. This seminar will discuss two unique applications of the protocol. The first is the use of the protocol with imprecise clocks to conduct one-way delay measurements. The second is the use of the protocol to conduct bandwidth estimation on networks such as CRCnet where routers have the protocol deployed.
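
As a rough sketch of the one-way delay idea (field names and values are invented, not the actual IPMP packet format): each host stamps the packet as it passes, and the receiver subtracts timestamps, with any unsynchronised clock offset either folded into the result or estimated separately.

```python
# Minimal illustration of one-way delay calculation from echoed timestamps.
# These names are illustrative only and do not reflect the real IPMP format.

def one_way_delay(send_ts, recv_ts, clock_offset_estimate=0.0):
    """One-way delay = receiver timestamp - sender timestamp - estimated offset
    between the two clocks.  With imprecise (unsynchronised) clocks the raw
    difference mixes true delay and offset, so the offset must be estimated,
    for example from the minimum observed delay in each direction."""
    return (recv_ts - send_ts) - clock_offset_estimate

# Sender stamps 10.000 s, receiver stamps 10.047 s on its own clock, and we
# believe the receiver's clock runs 30 ms ahead of the sender's.
print(one_way_delay(10.000, 10.047, clock_offset_estimate=0.030))  # ~0.017 s
```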

 

FreeFrom: An Interface Design Environment for Novice Programmers

Mrs Beryl Plimmer
Department of Computer Science, The University of Waikato
Tuesday, 22 October 2002
Most people have difficulty learning to program because it requires a complex interplay of knowledge and skills. To create a program one must understand the problem, be able to identify the underlying data and processes, and know a programming language. Experts are able to visualise a solution and quickly create a sketch that subdivides the solution space into logical (usually hierarchical) parts. Students often attack problems in a haphazard way and in the process get sidetracked and frustrated. Small groups informally prototyping the user interface by hand-drawing it and then using scenarios to 'play computer' is one way for students to get a better understanding of the problem before they start to code. The FreeForm environment that I have developed allows students to hand-sketch a user interface on a digital whiteboard and try it out in sketch mode. The environment provides the usual editing functions of a drawing package and can convert the sketch to a Visual Basic form. A recent evaluation of the environment indicates that students enjoyed using it and gained a better understanding of the problem and program requirements than if they had simply worked on an ordinary whiteboard.

 

mChart-Based Specification and Refinement

Dr Steve Reeves
Department of Computer Science, The University of Waikato
Tuesday, 15 October 2002
Two new notions of refinement for mCharts are introduced and compared with the existing notion due to Scholz. The two notions are interesting and important because one gives rise (via a logic) to a calculus for constructing refinements and the other gives rise (via model checking) to a way of checking that refinements hold. Thus we bring together the two competing worlds of model checking and proof.

 

Results on Formal Stepwise Design in Z

Dr Steve Reeves
Department of Computer Science, The University of Waikato
Tuesday, 15 October 2002
Stepwise design involves the process of deriving a concrete model of a software system from a given abstract one. This process is sometimes known as refinement. There are numerous refinement theories proposed in the literature, each of which stipulates the nature of the relationship between an abstract specification and its concrete counterpart. This seminar considers six refinement theories in Z that have been proposed by various people over the years. However, no systematic investigation of these theories, or results on the relationships between them, has been presented or published before. These six theories fall into two important categories, and we prove that the theories in each category are equivalent.
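
As background, the flavour of such refinement theories can be seen in the textbook forward-simulation conditions below, which relate an abstract operation AOp to a concrete operation COp through a retrieve relation R. This is one common formulation, not necessarily any of the six theories compared in the talk.

```latex
% Textbook forward (downward) simulation conditions for Z refinement;
% R relates abstract to concrete state, AOp/COp are the operations.
\begin{align*}
  \text{Initialisation:} \;& \forall\, CState' \bullet CInit \Rightarrow
      \exists\, AState' \bullet AInit \land R' \\
  \text{Applicability:}  \;& \forall\, AState;\, CState \bullet
      \mathrm{pre}\, AOp \land R \Rightarrow \mathrm{pre}\, COp \\
  \text{Correctness:}    \;& \forall\, AState;\, CState;\, CState' \bullet
      \mathrm{pre}\, AOp \land R \land COp \Rightarrow
      \exists\, AState' \bullet AOp \land R'
\end{align*}
```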

 

Trust Management : The SULTAN Perspective

Dr Tyrone Grandison
Department of Computing, Imperial College
Tuesday, 8 October 2002
The Internet is now being used for commercial, social and educational interactions which previously relied on direct face-to-face contact in order to establish trust relationships. Thus, there is a need to be able to establish and evaluate trust relationships relying only on electronic interactions over the Internet. A trust framework for Internet applications should incorporate concepts such as experience, reputation and trusting propensity in order to specify and evaluate trust. SULTAN (Simple Universal Logic-oriented Trust Analysis Notation) is an abstract, logic-oriented notation designed to facilitate the specification and analysis of trust relationships. This seminar will provide a glimpse of SULTAN Trust Management, which includes facilities not only for trust specification and trust analysis, but also for trust monitoring and risk management.

 

Multiparadigm Programming in J/MP

Dr Timothy A. Budd
Computer Science Department, Oregon State University
Friday, 20 September 2002
The advocates for logic programming, functional programming, and object-oriented programming have each in the past several years made convincing arguments as to the benefits of their style of software development. The basic idea of multiparadigm programming is to provide a framework in which these benefits can each be realized, and in which each of the different paradigms draws power from features provided by the others. In this talk I will introduce the basic ideas of multiparadigm programming, using the programming language J/MP that my students and I have developed. I will illustrate how programming features from each of the different paradigms I have named can be integrated together in programs designed to address a number of common programming problems.

 

Forming a corpus of voice queries for music information retrieval

Mr John McPherson
Department of Computer Science, The University of Waikato
Tuesday, 10 September 2002
The use of audio queries for searching multimedia content has increased rapidly with the rise of music information retrieval; there are now many Internet-accessible systems that take audio queries as input. However, testing the robustness of such a system can be a large part of the development process.

We propose the creation of a corpus of audio queries that is designed with the needs of music information retrieval researchers and practitioners in mind.

 

Sorting out Searching on Small Screen Devices

Dr Matt Jones
Department of Computer Science, The University of Waikato
Tuesday, 10 September 2002
Small handheld devices - mobile phones, PDAs etc - are increasingly being used to access the Web. Search engines are the most used Web services and are an important user support. Recently, Google(TM) (and other search engine providers) have started to offer their services on the small screen. This paper presents a detailed evaluation of how easy such services are to use in these new contexts. An experiment was carried out to compare users' abilities to complete realistic tourist-oriented search tasks using a WAP, a PDA-sized and a conventional desktop interface to the full Google(TM) index. With all three interfaces, when users succeed in completing a task, they do so quickly (within 2 to 3 minutes) and using few interactions with the search engine. When they fail, though, they fail badly. The paper examines the causes of failures in small screen searching and proposes guidelines for improving these interfaces.

 

Optimising Tabling Structures for Bottom-Up Logic Programming

Professor John Cleary
Department of Computer Science, The University of Waikato
Thursday, 5 September 2002
In this presentation we show how efficient data structures can be automatically introduced into programs that are written using flat, database-like relations. We describe a compilation process for Starlog, which is a pure logic programming language with temporal features that allow updates to be supported cleanly. Starlog programs are translated into an intermediate language (SDSL), which is optimised, and then analysed to determine which data structures are best for each relation. Then Java code is produced, using a library of possible data structures with known speed characteristics. The main benefit of this approach is that different data structures can be introduced without changing the source program. This makes programs more flexible and easily maintained. The associated paper has been accepted for LOPSTR 2002 (International Workshop on Logic Based Program Development and Transformation) and will be presented this month in Madrid.

 

Mercy Corps

Mr Jarred Potter
Mercy Corps
Tuesday, 3 September 2002
Mercy Corps is an international relief and development organization that exists to alleviate suffering, poverty, and oppression by helping people build secure, productive, and just communities. The agency now operates in more than 30 countries reaching 5 million people worldwide. Since 1979, Mercy Corps has provided more than $640 million in assistance in 74 nations. Mercy Corps is known nationally and internationally for its quick-response, high-impact programs. Over 90 percent of the agency's resources are allocated directly to programs that help those in need. Mercy Corps pursues its mission through:
  • emergency relief services that assist people afflicted by conflict or disaster.
  • sustainable community development that integrates agriculture, health, housing and infrastructure, economic development, education and environment, and local management.
  • civil society initiatives that promote citizen participation, accountability, conflict management, and the rule of law.
In an effort to increase productivity worldwide, Mercy Corps began the task of building a digital library. After choosing the Greenstone Digital Library Software, Mercy Corps initiated development on a document submission system for the field, including an interface for a librarian to assign metadata to documents. The long-term goal is to create a freely distributable system that will ease the creation of collections for other organizations. This talk will begin by presenting some information about Mercy Corps's activities, for general interest. Then I will discuss the need for a digital library, the information that it contains, and how it is used. Finally I will demonstrate our 'metadata creator' software, through which users in field offices submit documents to the library, and show how a central librarian vets the documents before they are placed into the library.

 

Cyber Education Across Several Countries

Professor Nobuo Saito
Faculty of Environmental Information, Keio University
Monday, 26 August 2002
Collaboration in human resource development is needed across the various countries and regions of the Asia-Pacific area. Keio University's Shonan Fujisawa campus is working to extend mutual efforts among overseas universities in various Asian countries. There are several projects within this activity. Keio University, Yonsei University (Korea) and Shanghai Jiaotong University (China) are now planning experimental distance learning for IT and Global Governance courses. Keio University and Chulalongkorn University (Thailand), with the support of Hitachi Ltd., carried out experimental distance learning for three courses in the IT area. Keio University, the Hanoi Institute of Technology and Vietnam National University, Hanoi (Vietnam) are planning a distance learning experiment in the near future for IT courses and other social science courses. We also offer a Web-based training course for the pre-education of students from Malaysia who will come to Japanese universities. All these activities are expected to merge into one joint project, so that there is one network for distance learning among these countries. Distance learning across different cultures gives us deep insight into the effectiveness of distance learning. Keio University itself is also much interested in the use of cyber education: it is planning to start a center for e-learning and to extend e-learning classes across many faculties and graduate schools.

 

Improving Mobile IP Handovers

Dr Richard Nelson
Centre for Telecommunications and Information Engineering, Monash University
Tuesday, 20 August 2002
Many fixed networks are currently moving rapidly to an all-IP infrastructure for data, voice and video/multimedia services. Mobile telephony networks have grown rapidly in popularity, yet current and next generation standards do not use an all-IP architecture. Similarly, WLAN access is now very popular and capable of supporting advanced IP-based services at broadband speeds, but mobility in such systems is restricted in coverage due to the use of Layer 2 techniques. All-IP mobile networks would have particular advantages in being able to migrate between network technologies and operators (e.g. W-CDMA - WLAN - fixed Ethernet) as environment and requirements dictate, providing users with performance/cost flexibility and connection reliability. One significant limitation of all-IP mobile networks is the handover performance of the standard Mobile IPv4 and Mobile IPv6 protocols. These have handover times of several seconds and so are not capable of supporting continuous data transfer or real-time applications such as telephony. Monash University's Centre for Telecommunications and Information Engineering (CTIE) is working on improved handovers for Mobile IPv6, implementing proposed improvements with the aim of supporting real-time applications. This seminar presents the obstacles to improved Mobile IP handovers and the solutions that are being implemented at CTIE to overcome them.

 

The Application of Unstructured Learning Techniques to Bioinformatics and Conceptual Biology

Dr Tony Smith
Department of Computer Science, The University of Waikato
Tuesday, 13 August 2002
Biotechnology tops the list of both publicly and privately funded research projects in many countries around the world. The arduous job of transcribing genes and proteins has been in full gear for a long time, creating an abundance of raw biochemical data. But the essential analysis of all that primary sequence information continues to lag ever further behind. Computer science (in particular its machine learning and data mining techniques) continues to offer significant promise as a means to help speed up genomic/proteomic analysis. Heuristic pattern-matching, dynamic programming, neural networks, and so forth have been the dominant methods employed, and have proven effective for detecting structural regularities and dependencies. But they are entirely unsuited to the paramount goal of biochemical research: the prediction of genetic roles and protein functions. This talk will describe how unstructured learning techniques can solve a wide range of bioinformatic problems, including problems both solvable and unsolvable by structured approaches. Real-time demonstrations will be used to show how a system designed to categorize unstructured textual documents can be made to predict such things as glycosylation sites and signal cleavage points in proteins, and correlate functional similarities for disparate proteins after the fashion of conceptual biology.
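
A minimal sketch of the underlying idea, treating the sequence context around a candidate residue as 'text': the window is broken into overlapping k-mers ('words') and scored against positive and negative examples. The window size, k, scoring scheme and toy data are invented for illustration and are not the system demonstrated in the talk.

```python
import math
from collections import Counter

def kmers(seq, k=3):
    """Overlapping k-mers, treated as the 'words' of a sequence window."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def train(windows):
    """Count k-mer occurrences over a set of example windows."""
    counts = Counter()
    for w in windows:
        counts.update(kmers(w))
    return counts

def score(window, pos_counts, neg_counts):
    """Log-likelihood-ratio score: positive means the window looks more like
    the positive examples (e.g. known sites) than the negative ones."""
    pos_total = sum(pos_counts.values())
    neg_total = sum(neg_counts.values())
    s = 0.0
    for km in kmers(window):
        p_pos = (pos_counts[km] + 1) / (pos_total + 1)   # add-one smoothing
        p_neg = (neg_counts[km] + 1) / (neg_total + 1)
        s += math.log(p_pos / p_neg)
    return s

# Toy, made-up 15-residue windows around candidate sites.
positives = ["MKNSSTLVANQSAGT", "AQPNLSNGSTTWLKA"]
negatives = ["GVLLAAFILPQRWED", "TTTPKKRRMMEEGGH"]
pos_counts, neg_counts = train(positives), train(negatives)
print(score("PKNASSLVNQTAGGT", pos_counts, neg_counts))
```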

 

Inductive Databases for Bioinformatics and Predictive Toxicology

Dr Stefan Kramer
Institute for Computer Science, University of Freiburg
Friday, 19 July 2002
In the talk, I will highlight the opportunities for special-purpose inductive databases in bioinformatics and predictive toxicology. Inductive databases are databases that can be queried not only for the data, but also for the patterns and regularities that hold in it. A recent development is the Molecular Feature Miner (MolFea), an inductive database that can be used to find linear molecular fragments of interest, given some dataset of chemical compounds. MolFea has been successfully applied to a number of interesting problems in predictive toxicology and to a large dataset of HIV data. Finally, extensions of MolFea and another instantiation of the inductive database framework for protein data will be presented.

 

AI on Mars: Autonomy for Planetary Rovers

Dr Richard Dearden
Research Institute for Advanced Computer Science, NASA Ames Research Center
Tuesday, 16 July 2002
NASA's planned 2009 rover mission to Mars has more ambitious goals than anything attempted previously. These goals can only be achieved by giving the rover more autonomy than ever before. In this talk I will describe the mission goals and the various AI technologies we are developing in order to meet them. I will particularly concentrate on two technologies, on-board diagnosis and contingency planning. On-board diagnosis is the problem of determining and tracking the current state of a device. Planetary rovers are a particularly difficult domain for diagnosis because their behaviour is strongly influenced by the terrain they are driving across, so normal behaviour in one situation can appear very similar to a fault in another. I will describe work on hybrid diagnosis using particle filters that tackles this problem. The second topic I will discuss is contingency planning. Here the problem is to develop flexible plans with branches, so that the rover can continue to perform science even when its primary goal cannot be accomplished. To do this, we must reason about uncertainty in the effects of actions and the amounts of resources such as power that they consume. I will present work aimed at creating a planner that produces branching plans that handle multiple resources, concurrent actions, and actions with uncertain durations and effects.
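
The following is a minimal sketch of particle filtering for state tracking, in one dimension with made-up dynamics and noise models rather than the actual rover diagnosis system: propagate particles through a process model, weight them by how well they explain the observation, and resample.

```python
import random, math

def particle_filter_step(particles, control, observation,
                         process_noise=0.1, obs_noise=0.5):
    """One predict-weight-resample cycle of a basic particle filter.
    Each particle is a hypothesised hidden state (a single float here)."""
    # Predict: push each particle through a simple process model.
    predicted = [p + control + random.gauss(0, process_noise) for p in particles]

    # Weight: likelihood of the observation given each particle (Gaussian noise).
    weights = [math.exp(-((observation - p) ** 2) / (2 * obs_noise ** 2))
               for p in predicted]
    total = sum(weights) or 1e-12
    weights = [w / total for w in weights]

    # Resample: draw a new particle set in proportion to the weights.
    return random.choices(predicted, weights=weights, k=len(predicted))

# Toy run: the true state drifts by +1 per step; we observe it with noise.
particles = [random.uniform(-5, 5) for _ in range(500)]
state = 0.0
for _ in range(10):
    state += 1.0
    obs = state + random.gauss(0, 0.5)
    particles = particle_filter_step(particles, control=1.0, observation=obs)
print(sum(particles) / len(particles), "vs true state", state)
```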

 

Code Red: Spread and Victims of an Internet Worm

Ms Colleen Shannon
Cooperative Association for Internet Data Analysis
Tuesday, 9 July 2002
On July 19, 2001 more than 359,000 computers were infected with the Code-Red (CRv2) worm in less than 14 hours. The cost of this epidemic, including subsequent strains of Code-Red, is estimated to be in excess of $2.6 billion USD. Despite the global damage caused by this attack, there have been few serious attempts to characterize the spread of the worm. To this end, we collected and analyzed data from a /8 network over a period of 45 days beginning July 2nd, 2001 to determine the characteristics of the spread of Code-Red. We begin our analysis with a look at the initial spread of Code-Red. We examine in detail the infection rate of the worm and its fit to an analytic model of worm spread. To understand the rate at which infected machines were repaired, we measured the patch rate of compromised machines during the subsequent month. We then take an in-depth look at the victims of the Code-Red worm by analyzing the properties of infected hosts, including geographic locations and diurnal patterns of infection. We also quantified the effects of DHCP on measurements of infected hosts and determined that IP address counts are not an accurate measure of the spread of a worm on timescales longer than 24 hours. Although most media coverage focused on large corporations, the Code-Red worm preyed upon home and small business users.
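
One standard analytic model of random-scanning worm spread, of the kind usually fitted to Code-Red, is the logistic 'random constant spread' model sketched below, where a(t) is the fraction of vulnerable hosts infected and K the initial compromise rate per infected host.

```latex
% Logistic ("random constant spread") model of a random-scanning worm.
\[
  \frac{da}{dt} = K\,a\,(1-a)
  \qquad\Longrightarrow\qquad
  a(t) = \frac{e^{K(t-T)}}{1 + e^{K(t-T)}}
\]
% T fixes the time at which half the vulnerable population is infected;
% growth is exponential early on and saturates as few hosts remain uninfected.
```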

 

Fundamental Limits on Blocking Self Propagating Code

Mr David Moore
Cooperative Association for Internet Data Analysis
Tuesday, 9 July 2002
We have some preliminary results on fundamental limits on the ability to protect against worm spread or other forms of self-propagating code on the Internet. In particular, we have found that even with perfect ability to detect and block self-propagating code, the reaction time necessary may be shorter than feasible in the current Internet. Our results also show there may be some ability for portions of the network to provide partial protection for themselves.

 

Modeling for Optimal Probability Prediction

Dr Yong Wang
Department of Computer Science, The University of Waikato
Tuesday, 2 July 2002
We present a general modeling method for optimal probability prediction over future observations, in which model dimensionality is determined as a natural by-product. This new method yields several estimators, and we establish theoretically that they are optimal (either overall or under stated restrictions) when the number of free parameters is infinite. As a case study, we investigate the problem of fitting logistic models in finite-sample situations. Simulation results on both artificial and practical datasets are supportive.

 

Text Mining with Information Extraction

Professor Raymond J. Mooney
Department of Computer Sciences, University of Texas
Friday, 28 June 2002
Information extraction (IE) is a form of shallow text understanding that locates specific pieces of data in natural language documents. An IE system is therefore capable of transforming a corpus of unstructured texts or web pages into a structured database. Our previous work has focused on using machine learning methods to automatically construct IE systems from training sets of manually annotated documents. Our current research focuses on a form of text mining that extracts a database from a document corpus using a learned IE system and then mines this database for interesting patterns using rule induction. The noise and variation in automatically extracted text requires rule mining methods that allow 'soft' matching to the data based on textual similarity. We have developed two methods for inducing 'soft matching' rules from textual data, one based on integrating rule induction and nearest-neighbor learning and another based on modifying association rule mining. Results on several extracted datasets will be presented.
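
As a rough illustration of 'soft' matching (the rule, threshold and similarity measure are invented for illustration, not the actual systems described): a rule antecedent is allowed to fire when an extracted field is merely similar to the rule's literal value, rather than requiring exact equality.

```python
import difflib

def soft_match(rule_value, extracted_value, threshold=0.6):
    """Fire a rule antecedent when the extracted text is 'close enough' to the
    rule's literal value.  Here similarity is a simple character-level ratio
    from difflib; any textual similarity measure (edit distance, token
    overlap, ...) could be substituted."""
    similarity = difflib.SequenceMatcher(
        None, rule_value.lower(), extracted_value.lower()).ratio()
    return similarity >= threshold

# Extracted job postings vary in wording; a soft rule still matches them.
# With this toy threshold, "WinNT" and "Windows-NT 4.0" match; "Linux" does not.
rule_language = "Windows NT"
for extracted in ["WinNT", "Windows-NT 4.0", "Linux"]:
    print(extracted, "->", soft_match(rule_language, extracted))
```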

 

SCMS Unix Infrastructure

TSG (Linux)
Department of Computer Science, The University of Waikato
Tuesday, 25 June 2002
A brief overview of the Unix systems and servers maintained by the School. We will highlight the changes and diversification of services over the last 2-3 years as the Unix infrastructure has migrated from 3 Solaris boxes to approximately 10 Linux machines. Details of the new facilities available to staff and graduate students will be given, as well as an overview of work in progress. Systems security will also be discussed.

 

A Sustainable Infrastructure in Support of Digital Libraries for Remote Communities

Mr Cameron Esslemont
Global Library Services Network
Tuesday, 21 May 2002
Global Library Services Network (GLSN) is a 'managed infrastructure' provider supporting the deployment of digital libraries to remote communities. The model is based on a central repository of digital assets, both 'free' and 'valued'. The assets are catalogued to Dublin Core metadata standards and support the integration of Digital Object Identifiers (DOIs). Libraries, which are subject specific, are then built dynamically for communities and can be either 'open' (available to all at no charge) or 'subscription' (available at a nominal charge to members of defined communities). The libraries support a mix of asset types: HTML, PDF, and both streaming and local audio and video. The digital engine that supports the libraries is Greenstone, slightly modified to cater for our requirements. It is presented in an administrative shell that allows each user to have a personal view and control of their libraries. A library member can order valued material through the catalogue record for each asset. This is compiled and delivered in the form of an E-Book with appropriate digital rights. The libraries are available on the Internet or on removable hard disks (85 GB). The upgrade mechanisms include a range of possibilities from Internet download, disk swap through a digital farm, or IP-authenticated satellite broadcast. The talk will cover the deployment architecture and the extensions required to deliver multilingual, multi-asset search (PHRONESIS) and the ongoing support for dynamic indexing.

 

Waikato Visits China

Geoff Holmes and Mark Utting
Department of Computer Science, The University of Waikato
Tuesday, 14 May 2002
Last week Mark Utting, Geoff Holmes and Ian Graham visited Beijing and Shanghai. This talk will detail their experiences, both social and academic. The two institutions visited were Beijing Polytechnic University (BPU), which has a substantial software engineering programme, and Shanghai International Studies University (SISU), which wants to start up a computer science programme.

 

Starlog Group Research Talk: Optimization and Compilation of Data Structures

Professor John Cleary
Department of Computer Science, The University of Waikato
Tuesday, 30 April 2002
A major topic in Computer Science is the automatic compilation and optimization of control. However, little or no effort has been expended on the complementary problem of selecting and optimizing data structures for use in a program. Thus we notice that in all modern programming languages (including the major families of imperative, functional and logic languages) the layout of data in memory is explicitly determined by what the programmer writes. When designing code the programmer is forced into a premature commitment to the way the data will be represented and laid out. The problem that is considered here is selecting an optimized data structure given the way that data is used in a program. We have been using a logic programming language (Starlog) as a data structure neutral notation for data access and manipulation. We will show some of the techniques we use for selecting efficient data structures and eventually generating Java code from Starlog programs.
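
A toy sketch of the kind of selection step described above, with an invented cost model and operation counts (this is not the Starlog compiler's actual analysis): given how a relation is used, pick the candidate structure with the lowest estimated total cost.

```python
# Illustrative cost-model-driven data structure selection, not the Starlog compiler.
import math

def estimated_cost(structure, usage, n=10_000):
    """Total estimated cost of serving the given operation counts with one
    candidate structure, for a relation of roughly n facts.  'usage' counts
    keyed lookups, ordered scans and inserts.  Costs are rough per-operation
    estimates invented for this sketch."""
    per_op = {
        "hash table":   {"lookup": 1,            "scan": n * 1.2, "insert": 1},
        "sorted array": {"lookup": math.log2(n), "scan": n,       "insert": n / 2},
        "linked list":  {"lookup": n / 2,        "scan": n,       "insert": 1},
    }[structure]
    return sum(per_op[op] * count for op, count in usage.items())

# Suppose analysis of the program shows this relation is mostly looked up by key.
usage = {"lookup": 5_000, "scan": 3, "insert": 500}
best = min(["hash table", "sorted array", "linked list"],
           key=lambda s: estimated_cost(s, usage))
print(best)   # the cheapest structure for this usage pattern ("hash table" here)
```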

 

Symbolic model simplification

Dr David Streader
University of Sydney, Australia
Tuesday, 19 March 2002
We show how to simplify a symbolic transition system to remove 'unobservable' internal or silent actions while preserving the traces of the system. A corresponding simplification problem has been studied before for transition systems with handshake communication, an interleaving interpretation of parallel composition and bisimulation equivalence, but the framework we use, of non-blocked communication and trace equivalence, is important because it has been widely accepted for representing distributed applications; also the simplifications possible are much more powerful in this setting.
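
A toy sketch of the trace-preserving idea for plain, non-symbolic transition systems follows (the talk's algorithms work on symbolic transitions and are far more powerful): close over internal 'tau' moves so that every remaining transition carries an observable action, leaving the trace set unchanged.

```python
def remove_internal(transitions, internal="tau"):
    """Given transitions as (source, action, target) triples, return a set in
    which no transition is labelled with the internal action, while the set
    of observable traces from each state is preserved."""
    def tau_closure(s):
        # States reachable from s via zero or more internal moves.
        seen, frontier = {s}, [s]
        while frontier:
            cur = frontier.pop()
            for (src, act, tgt) in transitions:
                if src == cur and act == internal and tgt not in seen:
                    seen.add(tgt)
                    frontier.append(tgt)
        return seen

    states = {s for (s, _, _) in transitions} | {t for (_, _, t) in transitions}
    result = set()
    for s in states:
        for mid in tau_closure(s):
            for (src, act, tgt) in transitions:
                if src == mid and act != internal:
                    result.add((s, act, tgt))
    return result

example = {("p0", "tau", "p1"), ("p1", "send", "p2"), ("p0", "recv", "p3")}
# Contains ('p0','send','p2'), ('p1','send','p2') and ('p0','recv','p3').
print(remove_internal(example))
```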

We have embodied the new simplification algorithms in a tool, and we demonstrate our achievements by using the tool on a well-known leader election algorithm, where the simplification reduces the algorithm (expressed as a composition of processes) until it consists of a single symbolic transition, whose correctness is evident by inspection.

Finally, we discuss our approach with handshake communication, a non-interleaving interpretation of parallel composition, and failure equivalence.

 

The Use of Auditory Feedback in Call Centre CHHI

Ms Anette Steel
Department of Computer Science, The University of Waikato
Tuesday, 12 March 2002
Initial investigations have been carried out to evaluate issues of the computer-human-human interaction (CHHI) commonly found in call centre scenarios. These investigations suggest some benefits in the use of auditory icons and earcons.

This presentation is similar to one that I have given before (on 14 November 2001). However, this one is short (10 minutes) and is a rehearsal for CHI. I would really appreciate some feedback on the presentation, as this conference is a big one.

 

Information management and software engineering research at Lund University

Mr Thomas Olsson
Department of Communication Systems, Lund University
Tuesday, 5 March 2002
The software developer is faced with more and more information as new systems become larger and larger. Along with this come growing demands on time to market and on quality. As the amount and diversity of information about software systems grows, so does developers' need for support for consistency and traceability across different levels of abstraction. The approach taken here is top-down, with the focus on utilizing the overlap of information to relate information across artefacts. The research has an empirical basis, with a focus on high-level documents.

The software engineering group in Lund, Sweden, has a strong empirical basis to its research. The focus is on applied research carried out in close collaboration with industry. Current areas of research include Requirements Engineering, Verification & Validation, Software Quality Process and Software Architecture.

The research agenda is presented along with an introduction to the software engineering research group in Lund, Sweden.

 

Applications of Character Shape Coding

Dr Larry Spitz
Document Recognition Technologies, Inc., Palo Alto, CA
Tuesday, 19 February 2002
There is a considerable amount of technology available for processing documents in character-coded form, and considerably less for processing document images. The usual transformation applied to a document image is Optical Character Recognition (OCR). But there are instances where knowledge of the document is required before a good job of OCR can be performed, and others where the computational overhead of OCR may not be justified.

We have developed a simple, robust and computationally inexpensive method of characterizing the shape of Roman characters. While not nearly as information rich as OCR output, a number of applications are adequately served by use of the character shape codes and their associated word shape tokens. I will describe three classes of applications: language identification, information retrieval and document style characterization.
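
A rough sketch of the flavour of character shape coding follows; the code alphabet and mapping are invented for illustration and differ from the actual system. Each Roman character is replaced by a small class code according to whether it sits in the x-height zone, has an ascender, or has a descender, and the per-word concatenations form word shape tokens.

```python
# Illustrative shape-coding map, not Spitz's actual alphabet of shape codes.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

def shape_code(ch):
    if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
        return "A"    # occupies the ascender zone (roughly: capitals, digits, b, d, ...)
    if ch in DESCENDERS:
        return "g"    # dips below the baseline
    if ch.isalpha():
        return "x"    # plain x-height character
    return ch         # punctuation etc. passed through unchanged

def word_shape_token(word):
    """Concatenate per-character shape codes to form a word shape token."""
    return "".join(shape_code(c) for c in word)

print(word_shape_token("language"))   # -> "Axxgxxgx" with this toy map
```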

Using word shape tokens, we have developed an automated means of detecting which of 24 languages is represented in a document image. This is particularly useful as a pre-process for OCR.

We have found that the concatenation of character shape tokens provides a novel index for document images, and that we can search databases of document images for the presence of keywords rapidly and robustly.

Additionally, we have done some work on part-of-speech tagging of documents encoded using this technique. Since the mapping of source characters to shape codes is (almost) one-to-one, traditional measures of document content such as length, number of words, average word length, etc. are preserved.

 

Web and Internet Measurement Research

Dr Balachander Krishnamurthy
AT&T Labs—Research
Tuesday, 22 January 2002
I have been measuring various aspects of the Web and the Internet as part of improving the user's Web experience while reducing load on the network. A global infrastructure has been built around this effort, leading to the gathering of data at various levels of the protocol stack. I will present a quick overview of the various research projects that I have been involved in over the last four years. The remainder of the talk will focus on a recent examination of netflow data to characterise DNS traffic, via passive and active measurements as well as a graph-based analysis.

 

Interactive Document Summarisation Using Automatically Extracted Keyphrases

Dr Steve Jones
Department of Computer Science, The University of Waikato
Tuesday, 15 January 2002
This talk describes the Interactive Document Summariser (IDS). IDS provides dynamic control over document summary characteristics, such as length and topic focus, so that changes made by the user are instantly reflected in an on-screen summary. 'Summary-in-context' views allow users to move flexibly between summaries and their source documents. IDS adopts the technique of sentence extraction, exploiting keyphrases that are automatically extracted from document text as the primary attribute of a sentence extraction algorithm. We report an evaluation of IDS summaries, in which representative end-users of on-line documents identified relevant summary sentences in source documents. IDS summaries were then compared to the recommendations of the users and we report the efficacy of the summaries based on standard precision and recall measures. In addition, using established evaluation metrics we found that IDS summaries were better than baseline summaries based on within-document sentence ordering.
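
A minimal sketch of keyphrase-driven sentence extraction follows (the scoring scheme and data are invented for illustration; IDS itself is interactive and more sophisticated): score each sentence by the keyphrases it contains and return the top-scoring sentences in document order, with the cut-off playing the role of the length control.

```python
import re

def summarise(text, keyphrases, max_sentences=2):
    """Score each sentence by how many of the extracted keyphrases it contains,
    then return the top-scoring sentences in their original document order.
    Varying max_sentences (or the keyphrase set) plays the role of the
    interactive length / topic-focus controls."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    scored = []
    for idx, sentence in enumerate(sentences):
        hits = sum(1 for kp in keyphrases if kp.lower() in sentence.lower())
        scored.append((hits, idx, sentence))
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:max_sentences]
    return " ".join(s for _, _, s in sorted(top, key=lambda t: t[1]))

doc = ("Digital libraries collect documents. Keyphrases summarise a document's "
       "topics. Extracted keyphrases can drive sentence extraction. The weather "
       "was fine.")
print(summarise(doc, ["keyphrase", "sentence extraction"], max_sentences=2))
```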

 

