<XML><RECORDS><RECORD><REFERENCE_TYPE>3</REFERENCE_TYPE><REFNUM>7184</REFNUM><AUTHORS><AUTHOR>Hunt,E.</AUTHOR><AUTHOR>Pafilis,E.</AUTHOR><AUTHOR>Tulloch,I.</AUTHOR><AUTHOR>Wilson,J.</AUTHOR></AUTHORS><YEAR>2004</YEAR><TITLE>Index-driven XML data integration to support functional genomics</TITLE><PLACE_PUBLISHED>Data Integration in the Life Sciences, DILS04</PLACE_PUBLISHED><PUBLISHER>Springer Verlag</PUBLISHER><PAGES>95-109</PAGES><ISBN>3-540-21300-7</ISBN><LABEL>Hunt:2004:7184</LABEL><KEYWORDS><KEYWORD>bioinformatics</KEYWORD></KEYWORDS><ABSTRACT>We identify a new type of data integration problem which arises in functional genomics research, in the context of large-scale experiments involving arrays, 2-dimensional protein gels and mass spectrometry. We explore the current practice of data analysis, which involves repeated web queries iterating over long lists of gene or protein names. We postulate a new approach to solve this problem, applicable to data sets stored in XML format. We propose to discover data redundancies using an XML index we construct, and to remove them from the results returned by the query. We combine XML indexing with queries carried out on top of relational tables. We believe our approach could support semi-automated data integration, as required in the interpretation of large-scale biological experiments.</ABSTRACT><NOTES>LNCS vol 2994</NOTES></RECORD></RECORDS></XML>