Affiliation : School of Computer Science and Software Engineering, Monash University
Computer Science Seminar Room, G1.15
Suffix trees are very versatile data structures that are used in many areas of computing including string-matching, DNA-matching, file-compression, etc. One of the main arguments against suffix trees is that they are greedy for space. Numerous attempts have been made in the past to improve the space-efficiency of suffix trees but most of these attempts resulted in representations tailored to specific problems.
The suffix vector representation is a general representation, which can be used in any application area where a suffix tree is applicable. The suffix vector is more space-efficient than any other suffix tree representations that have the same versatility.
Our data structure eliminates the redundancies present in other suffix tree representations, which not only saves space but also results in faster runs of certain algorithm because they do not have to analyse redundant parts of the tree.
Suffix trees and suffix vectors have efficiently been used in our document comparison prototype system: MatchDetectReveal (MDR). A short demonstration of this system will also be presented.