A TAXONOMY OF PLAGIARISM IN COMPUTER SCIENCE
1 University of Warwick (UNITED KINGDOM)
2 PA College (CYPRUS)
About this paper:
Appears in: EDULEARN09 Proceedings
Publication year: 2009
Pages: 3372-3379
ISBN: 978-84-612-9801-3
ISSN: 2340-1117
Conference name: 1st International Conference on Education and New Learning Technologies
Dates: 6-8 July, 2009
Location: Barcelona, Spain
Abstract:
Many on-line resources exist for testing students’ knowledge of plagiarism; however, few of these cover both text and source code plagiarism comprehensively enough to encompass all types of plagiaristic activity relevant to computing students and academics. To provide suitable resources, it is useful to identify and categorize aspects of text and code plagiarism so that, for example, quizzes can be generated which ensure coverage of each important topic. This paper reports the results of a taxonomic analysis of data collected from sources relating to plagiarism, including existing on-line quizzes and previous research, in order to inform the construction of a quiz generation system which covers all areas of plagiarism relating to a computing course.

The principal aim of this research was to identify a taxonomy of issues relating to student (and academic) plagiarism, so that a resource could be built which can accurately assess a student's understanding of what plagiarism means and how it can be avoided. Such a resource would target computing students, and cover source code topics in addition to the generic plagiarism issues of importance to students in other disciplines. The taxonomy reported here allows us to construct representative question sets for use in such a resource, and to present formative material to students which addresses their individual misunderstandings.

Our methodology for constructing the taxonomy initially involved collecting data from two types of source. The first consisted of on-line interactive resources, such as student-focused plagiarism questionnaires, which were created for testing students’ knowledge of plagiarism and for providing informative feedback to students based on their responses. We identified 23 such resources which were publicly accessible, and which together contained 268 questions. The second type of source consisted of published work, and included books on plagiarism and on “academic misconduct”, and conference and journal publications, some of which focus on source code plagiarism.

The quiz data were analyzed using facet analysis, in order to identify discrete categories into which the questions might be classified. This provided a comprehensive overview of the types of question that have been used for testing students’ understanding of plagiarism in a generic context. The other data were then used to refine the classification by incorporating the major issues which are currently important to computing students and academics.

The resulting taxonomy consists of 6 categories (Plagiarism and copying, Referencing, Cheating and inappropriate collaboration, Ethics and consequences, Source code plagiarism, and Source code documentation) subdivided into 23 subcategories.
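
As an illustration only, the following minimal Python sketch shows one way a quiz generator might use the six categories to guarantee that every category is represented in a generated quiz. It is not the tool described in this paper: the function name `build_quiz`, the question bank layout (a list of category and question text pairs), and the sampling strategy are all assumptions made for the example, and the 23 subcategories are omitted.

```python
import random
from collections import defaultdict

# The six top-level categories named in the abstract.
CATEGORIES = [
    "Plagiarism and copying",
    "Referencing",
    "Cheating and inappropriate collaboration",
    "Ethics and consequences",
    "Source code plagiarism",
    "Source code documentation",
]


def build_quiz(bank, per_category=1, seed=None):
    """Pick `per_category` questions from every category in CATEGORIES.

    `bank` is a list of (category, question_text) pairs. This layout is a
    hypothetical one chosen for the sketch, not the paper's data model.
    """
    rng = random.Random(seed)

    # Group the question texts by their category label.
    by_category = defaultdict(list)
    for category, text in bank:
        by_category[category].append(text)

    quiz = []
    for category in CATEGORIES:
        pool = by_category[category]
        if len(pool) < per_category:
            raise ValueError(f"not enough questions for {category!r}")
        # Sample without replacement so no question repeats within a quiz.
        for text in rng.sample(pool, per_category):
            quiz.append((category, text))
    return quiz
```

Called with a bank holding at least one question per category, `build_quiz(bank)` returns six questions, one from each category; a category with too few questions raises `ValueError` rather than silently leaving a gap in coverage.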

At the time of writing, an online tool has been implemented which contains both tutorial material and over 200 questions arranged according to the above categorization. Although the tool primarily generates quizzes relating to source code plagiarism, it can be adapted to generate quizzes relating to topics associated with other university courses.
Keywords:
computer science education, plagiarism, taxonomy, source code.