<XML><RECORDS><RECORD><REFERENCE_TYPE>3</REFERENCE_TYPE><REFNUM>7081</REFNUM><AUTHORS><AUTHOR>Japp,R.P.</AUTHOR></AUTHORS><YEAR>2003</YEAR><TITLE>Persistent Indexing Technology for Large Sequences</TITLE><PLACE_PUBLISHED>Technical Report of the 20th British National Conference on Databases, BNCOD 20, July 15-17 2003, Poster Papers.</PLACE_PUBLISHED><PUBLISHER>Coventry University</PUBLISHER><PAGES>8-11</PAGES><ISBN>1-903818-31-1</ISBN><LABEL>Japp:2003:7081</LABEL><KEYWORDS><KEYWORD>Persistent Indexes</KEYWORD></KEYWORDS><ABSTRACT>There are two aspects to the work being presented here. The first is a novel persistent index structure for genomic data, a prototype of which has been completed. The second, using this index as an example, is a generic index development framework, which is under construction. We propose a variation of the suffix tree, the Top Compressed Suffix Tree, which has been designed to allow the on-disk construction of indexes over multi-gigabyte sequences. This form of the suffix tree extends the work of Hunt et al. by improving the performance of the partitioned construction algorithm when the size of the sequence being indexed is comparable to that of the available main memory, and by providing a compact representation of the index on secondary memory. This work forms part of the GIDOF project, a project to provide a Generic Index Development and Operation Framework. GIDOF addresses the management of performance-critical parameters, automatic parameter exploration and tuning, and the provision of generic persistence components.</ABSTRACT></RECORD></RECORDS></XML>