<XML><RECORDS><RECORD><REFERENCE_TYPE>3</REFERENCE_TYPE><REFNUM>9168</REFNUM><AUTHORS><AUTHOR>Polajnar,T.</AUTHOR><AUTHOR>Rogers,S.</AUTHOR><AUTHOR>Girolami,M.</AUTHOR></AUTHORS><YEAR>2009</YEAR><TITLE>Classification of Protein Interaction Sentences via Gaussian Processes</TITLE><PLACE_PUBLISHED>Lecture Notes in Bioinformatics, Proceedings of 4th IAPR International Conference, Pattern Recognition in Bioinformatics 2009</PLACE_PUBLISHED><PUBLISHER>Springer Verlag</PUBLISHER><PAGES>282–292</PAGES><ISBN>978-3-642-04030-6</ISBN><LABEL>Polajnar:2009:9168</LABEL><KEYWORDS><KEYWORD>text; classification; gaussian process; kernel; text mining</KEYWORD></KEYWORDS><ABSTRACT>The increase in the availability of protein interaction studies in textual format coupled with the demand for easier access to the key results has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier which has rarely been applied to text, the Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naive Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework makes GPs an appealing alternative worthy of further adoption.</ABSTRACT></RECORD></RECORDS></XML>