DETECTING PLAGIARISM IN SHORT SOLUTIONS OF PROGRAMMING EXERCISES
University of Szczecin (POLAND)
About this paper:
Appears in: ICERI2018 Proceedings
Publication year: 2018
Pages: 450-455
ISBN: 978-84-09-05948-5
ISSN: 2340-1095
doi: 10.21125/iceri.2018.1093
Conference name: 11th annual International Conference of Education, Research and Innovation
Dates: 12-14 November, 2018
Location: Seville, Spain
Abstract:
Exercises requiring students to write code on their own form an intrinsic part of teaching computer programming. Failing to pass an exercise should motivate students to understand the errors they made and to acquire the skills and knowledge they were found to lack. In the context of formal education, however, some students prefer to pass an exercise by submitting someone else’s solution and passing it off as their own, rather than investing their time in learning what is needed and actually solving it (see [1, p. 4]). Doing so is especially easy in the era of on-line programming learning environments featuring automated assessment, where submitted solutions are not regularly inspected by an instructor. Although morally dubious, such behavior has long been [2] and remains [3] widespread.

A number of code similarity detection systems are available that help identify plagiarized solutions (see Table 1 in [4] for a comparison). Unfortunately, such systems are not the right tools for identifying plagiarism in introductory programming courses. One reason is that they are not effective when the amount of individually contributed code is small [5, p. 20], and introductory programming courses consist of exercises that are not only simple (hence their solutions are short) but also often provide students with initial code that needs only slight modification to become a solution. Another reason is that such exercises often impose rigid requirements on acceptable solutions (such as the set of programming instructions to use) and provide students with detailed hints, leading many students to submit identical solutions even though no plagiarism was involved, as the example below illustrates.
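To see why this defeats text-similarity tools, consider a small illustration of our own (the exercise and solutions are hypothetical, not taken from the paper): two students independently solve the same rigidly specified exercise and arrive at character-identical code, which a text-similarity measure, here Python's difflib, flags as a perfect match.

    import difflib

    # Two hypothetical, independently written solutions to the same rigidly
    # specified exercise ("sum the integers from 1 to n using a for loop").
    solution_a = "total = 0\nfor i in range(1, n + 1):\n    total += i\n"
    solution_b = "total = 0\nfor i in range(1, n + 1):\n    total += i\n"

    # The similarity measure reports a perfect match (1.00), even though
    # no plagiarism occurred.
    ratio = difflib.SequenceMatcher(None, solution_a, solution_b).ratio()
    print(f"similarity: {ratio:.2f}")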

In such circumstances, possible plagiarism can instead be indicated by metrics describing the process of editing the solution code [6, p. 49]. In this paper, we follow this approach, proposing a scheme for detecting possible plagiarism based on just three such metrics: time of submission, total solution editing time, and total number of edits (a sketch of how such metrics might be derived and used follows below). We test the usefulness of the proposed scheme on a database of 9782 programming exercise solutions submitted by 155 students.
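The following sketch shows how such a scheme might look in Python. The edit-log representation, the metric derivations, and the flagging thresholds are illustrative assumptions on our part; the abstract itself names only the three metrics.

    from datetime import datetime

    def solution_metrics(edit_timestamps, submitted_at):
        # Derive the three metrics from a hypothetical per-solution edit log.
        times = sorted(edit_timestamps)
        editing_time = (times[-1] - times[0]).total_seconds() if len(times) > 1 else 0.0
        return {
            "time_of_submission": submitted_at,    # metric 1
            "total_editing_time_s": editing_time,  # metric 2
            "total_number_of_edits": len(times),   # metric 3
        }

    def looks_suspicious(metrics, min_edits=3, min_editing_time_s=30.0):
        # Heuristic with hypothetical thresholds: a solution that appears
        # almost at once, with very few edits, may have been pasted in.
        return (metrics["total_number_of_edits"] < min_edits
                or metrics["total_editing_time_s"] < min_editing_time_s)

    # Example: a solution "edited" only twice within five seconds is flagged.
    log = [datetime(2018, 10, 1, 12, 0, 0), datetime(2018, 10, 1, 12, 0, 5)]
    m = solution_metrics(log, submitted_at=datetime(2018, 10, 1, 12, 0, 6))
    print(looks_suspicious(m))  # True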

References:
[1] S. Abraham and G. Milligan, “Software plagiarism in undergraduate programming classes,” in Proceedings of the Information Systems Education Conference 2008, paper 3344. Phoenix, AZ: EDSIG, 2008.
[2] J. Sheard, M. Dick, S. Markham, I. Macdonald, and M. Walsh, “Cheating and plagiarism: perceptions and practices of first year IT students,” ACM SIGCSE Bulletin, vol. 34, no. 3, pp. 183–187, Sep. 2002.
[3] J. Pierce and C. Zilles, “Investigating Student Plagiarism Patterns and Correlations to Grades,” in Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education, pp. 471–476. New York, NY: ACM, 2017.
[4] M. J. Misic, J. Z. Protic, and M. V. Tomasevic, “Improving source code plagiarism detection: Lessons learned,” in 2017 25th Telecommunication Forum (TELFOR), pp. 1–8. Belgrade, Serbia: IEEE, 2017.
[5] M. Joy, N. Griffiths, and R. Boyatt, “The BOSS online submission and assessment system,” Journal on Educational Resources in Computing, vol. 5, no. 3, article 2, Sep. 2005.
[6] F. Hattingh, A. A. K. Buitendag, and J. S. Van Der Walt, “Presenting an alternative source code plagiarism detection framework for improving the teaching and learning of programming,” Journal of Information Technology Education, vol. 12, pp. 45–58, 2013.
Keywords:
Introductory programming education, programming exercise assessment, automatic plagiarism detection, plagiarism metrics.