<XML><RECORDS><RECORD><REFERENCE_TYPE>10</REFERENCE_TYPE><REFNUM>7444</REFNUM><AUTHORS><AUTHOR>Irving,R.W.</AUTHOR></AUTHORS><YEAR>2004</YEAR><TITLE>Plagiarism and collusion detection using the Smith-Waterman algorithm</TITLE><PLACE_PUBLISHED>DCS Technical Report</PLACE_PUBLISHED><PUBLISHER>Dept of Computing Science, University of Glasgow</PUBLISHER><PAGES>1-24</PAGES><ISBN>TR-2004-164</ISBN><LABEL>Irving:2004:7444</LABEL><ABSTRACT>We investigate the use of variants of the Smith-Waterman algorithm to locate similarities in texts and in program source code, with a view to its application in the detection of plagiarism and collusion. The Smith-Waterman algorithm is a classical tool in the identification and quantification of local similarities in biological sequences, but we demonstrate that somewhat different issues arise in this different context, and that these factors can be exploited to yield significant speed-up in practice. We include empirical evidence to indicate the utility of the approach and to illustrate the efficiency gains. </ABSTRACT></RECORD></RECORDS></XML>