Paper ID: 7444
DCS Tech Report Number: TR-2004-164
Plagiarism and collusion detection using the Smith-Waterman algorithm
Tech Report (internal)
DCS Technical Report
Page Numbers: 1-24
Publisher: Dept of Computing Science, University of Glasgow
We investigate the use of variants of the Smith-Waterman algorithm to locate similarities in texts and in program source code, with a view to applying them to the detection of plagiarism and collusion. The Smith-Waterman algorithm is a classical tool for identifying and quantifying local similarities in biological sequences, but we demonstrate that somewhat different issues arise in this new context, and that these differences can be exploited to yield a significant speed-up in practice. We include empirical evidence to indicate the utility of the approach and to illustrate the efficiency gains.
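
For readers unfamiliar with the underlying technique, the following is a minimal sketch of the classical Smith-Waterman local alignment, not the variants or speed-ups developed in the report. The function name `smith_waterman` and the scoring parameters (`match`, `mismatch`, `gap`, with a simple linear gap penalty) are assumptions chosen for illustration only.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, segment_of_a, segment_of_b) for one best local alignment.

    Classical textbook formulation with a linear gap penalty; the scoring
    values are illustrative assumptions, not those used in the report.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of any local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a fresh local alignment
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end_i], b[j:end_j]


if __name__ == "__main__":
    score, seg_a, seg_b = smith_waterman("abcdef", "xxcdexx")
    print(score, seg_a, seg_b)
```

The sketch fills the full table, so it takes time and space proportional to the product of the two input lengths. It compares characters, but the same recurrence applies to any sequence of comparable items, such as words or source-code tokens.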