Technical University of Lublin (POLAND)
About this paper:
Appears in: ICERI2015 Proceedings
Publication year: 2015
Pages: 2759-2768
ISBN: 978-84-608-2657-6
ISSN: 2340-1095
Conference name: 8th International Conference of Education, Research and Innovation
Dates: 18-20 November, 2015
Location: Seville, Spain
Plagiarism is one of the most complex issues teachers and researchers face in connection with any kind of publishing activity. Electronic media, the widespread availability of the Internet, and the ease of use of computer hardware and applications are just some of the reasons why plagiarism is becoming more and more significant. In response, a whole range of anti-plagiarism applications and systems has been developed in recent years. Tools of this kind help teachers check students' work for incorrect citations or possible plagiarism through a very accurate comparison of the content of the work with the contents of a database. For a system of this kind, the key issues are the effectiveness (complexity) of the analysis algorithm (analysis of text and other elements of a publication), the performance of the hardware platform on which this algorithm is implemented and, finally, the completeness and adequacy of the databases used.

The premise of our article is the last of these issues. Every school, university or research center forms a local community and, with it, local resources of knowledge and data (local databases). Access to them may be limited, and their contents may be only partially indexed or not indexed at all. For this reason, the article discusses the feasibility and expediency of building a LAS (Local Anti-plagiarism System). The analysis covers the following topics: the functionality that distinguishes the LAS from other widely available anti-plagiarism solutions, the specificity of the reference database (a collection of works and materials for use by the LAS), and the role of social networking tools in the security and reliability of the LAS. All presented findings serve as a starting point for analyzing the structure of a LAS based on Hadoop and MapReduce. Particular emphasis is placed on the proper implementation of the distributed file system, which is responsible for the process of so-called moving processing to the data (managing the data over which MapReduce operates). In addition, the article discusses the requirements for a hardware platform tailored to the specifics of the described LAS solution. All of the conclusions and proposals contained in the article are based on the analysis of a test LAS, which uses solely software components available as open source. With this approach, individuals or universities interested in expanding the functionality of the systems used for plagiarism prevention will be able to choose and develop a reasonable platform for their own future tests.
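To make the MapReduce idea concrete, the following is a minimal single-process sketch in plain Python, not the implementation of the test LAS described in the article. It assumes that documents are compared by overlapping word k-grams ("shingles"), which is a common technique but is not stated in the abstract; the names `shingles`, `map_phase`, `shuffle`, `reduce_phase` and `shared_shingles` are hypothetical. On a real Hadoop cluster the map and reduce steps would run on the nodes that hold the data blocks, and the framework would perform the shuffle.

```python
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (assumed comparison unit)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def map_phase(doc_id, text, k=3):
    """Map step: emit one (shingle, doc_id) pair per distinct shingle."""
    for s in shingles(text, k):
        yield s, doc_id


def shuffle(pairs):
    """Group document ids by shingle (done by the framework on a real cluster)."""
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return groups


def reduce_phase(groups):
    """Reduce step: count, for every document pair, the shingles they share."""
    overlap = defaultdict(int)
    for doc_ids in groups.values():
        for a, b in combinations(sorted(doc_ids), 2):
            overlap[(a, b)] += 1
    return dict(overlap)


def shared_shingles(corpus, k=3):
    """Run map, shuffle and reduce over a {doc_id: text} corpus."""
    pairs = (
        pair
        for doc_id, text in corpus.items()
        for pair in map_phase(doc_id, text, k)
    )
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    corpus = {
        "a": "the quick brown fox jumps",
        "b": "a quick brown fox jumps high",
        "c": "totally different text here now",
    }
    print(shared_shingles(corpus))
```

Document pairs with a high shared-shingle count would be candidates for closer inspection; any threshold for flagging a pair is a separate design decision that this sketch does not address.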
Keywords: Anti-plagiarism, Hadoop cluster, MapReduce, distributed file system, big data processing.