ON THE AUTOMATED DETECTION OF MARGINAL PLAGIARISM IN WEB-BASED STUDENT ASSIGNMENTS
The use of new technologies in higher education brings many benefits. Among them, web-based assignments give students considerable flexibility in organizing their work, since they can access the host application from any location at any time. An ideal learning set-up is obtained when web-based assignments and virtual labs are combined in a single platform [1]. This type of e-learning system offers other notable advantages: the web application logs make it possible to track student activity in great detail and, since the evaluation can be performed on-line, no paper waste is generated.
In this presentation, we focus on a particular feature of web-based student assignments: the automated detection of partial plagiarism among students. For very specific tasks, students may share their work material with one another or reuse results from students who took the course in previous years. Since all the information on the work submitted by the students is kept in electronic format, we have an ideal testbed for trying out information processing techniques in the detection of this mildly unethical behaviour. We have found that this undesirable conduct is, in fact, largely limited to very specific parts of the subject under test. Most importantly, the proposed procedure identifies those parts of the syllabus that students find most difficult, which in turn makes it possible to take the necessary actions to improve learning performance.
We present a detailed account of an experience with a web-based laboratory [1] in which a marginal plagiarism detection scheme is implemented using both the Levenshtein Distance [2] and the Kolmogorov-complexity-based Information Distance [3]. Correct operation of the system requires careful setting of the distance normalizations and of the thresholds for automated functioning. The evaluation system makes it possible to identify the specific offending cases accurately, to trace all the relations among the cases and, most importantly, to spot those aspects of the subject that may be particularly difficult for the students.
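As an illustration only, the two distances named above could be sketched in Python as follows. This is not the system described in this abstract but a minimal sketch under stated assumptions: the Levenshtein distance is normalized by the length of the longer answer, the (uncomputable) Kolmogorov-complexity-based information distance is replaced by the normalized compression distance with zlib as the compressor, a pair is flagged when either distance falls below its threshold, and the names `flag_pairs`, `lev_threshold` and `ncd_threshold` and their default values are hypothetical.

```python
import zlib
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions), row by row."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Assumed normalization: divide by the longer length (range 0 to 1)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a complexity estimate."""
    return len(zlib.compress(data, 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance, a computable stand-in for the
    information distance (assumption: zlib approximates the ideal compressor)."""
    x, y = a.encode("utf-8"), b.encode("utf-8")
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def flag_pairs(answers, lev_threshold=0.2, ncd_threshold=0.5):
    """Return (id_a, id_b, d_lev, d_ncd) for every pair of answers whose
    distance falls below either threshold. Thresholds are hypothetical."""
    flagged = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(answers.items()), 2):
        d_lev = normalized_levenshtein(text_a, text_b)
        d_ncd = ncd(text_a, text_b)
        if d_lev < lev_threshold or d_ncd < ncd_threshold:
            flagged.append((id_a, id_b, d_lev, d_ncd))
    return flagged
```

Calling `flag_pairs` with a dictionary that maps student identifiers to the text of one answer returns the suspicious pairs for that answer. Running it once per question and counting the flagged pairs gives a rough view of where shared material concentrates. For very short answers the fixed overhead of the compressor dominates the compressed sizes, which is one reason the normalization and the thresholds need careful tuning.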
[1] P. Chamorro-Posada, “Web-based computer lab for teaching dispersion managed solitons,” Research, Reflections and Innovations in Integrating ICT in Education, pp. 1498–1502 (2009).
[2] V. Levenshtein, “Binary codes capable of correcting spurious insertions and deletions of ones,” Probl. Inf. Transmission 1, 8–17 (1965).
[3] M. Li and P. Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications, 3rd ed., Springer-Verlag (2008).