A DEEP LEARNING SIMILARITY-CHECKING METHOD THAT CAN IDENTIFY PATTERNS OF RESEMBLANCE IN DUPLICATED QUESTIONS CAN BE USED TO COMBAT THE PROBLEM OF PLAGIARISM

I. Bandara; F. Ioras

doi:10.21125/iceri.2023.1464

I. Bandara¹

F. Ioras²

¹ Open University (UNITED KINGDOM)
² Buckinghamshire New University (UNITED KINGDOM)

About this paper:

Appears in: ICERI2023 Proceedings
Publication year: 2023
Pages: 5876-5884
ISBN: 978-84-09-55942-8
ISSN: 2340-1095
doi: 10.21125/iceri.2023.1464

Conference name: 16th annual International Conference of Education, Research and Innovation
Dates: 13-15 November, 2023
Location: Seville, Spain

Abstract:

Duplicate questions are increasingly common. They degrade the quality of content and affect a variety of applications, including tests and assignments in higher education programmes. Addressing question duplication is therefore crucial. In this research, we present a novel deep learning similarity-checking method for efficiently detecting question duplication. Using deep neural networks, the method captures both the semantic content and the syntactic structure of questions, which makes it possible to compare and categorise question pairs precisely.

The similarity measurement module uses the representations produced by the proposed method to calculate a similarity score for each pair of questions. To account for multiple aspects of similarity, it integrates several similarity measures, including cosine similarity and word embedding distances. The method also uses an attention mechanism to highlight the key elements of the questions that contribute most to their shared sentence structure.
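As an illustration of the ideas in this paragraph, the following minimal Python sketch combines cosine similarity with a distance-based score and pools token vectors with a simple dot-product attention. It is not the authors' implementation: the function names, the equal weighting of the two measures, the choice of Euclidean distance, and the query-vector form of attention are all assumptions made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def distance_similarity(u, v):
    """Map Euclidean distance into (0, 1]; identical vectors score 1.0.

    Assumption: the abstract does not say which embedding distance is used.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + dist)


def combined_similarity(u, v, weight=0.5):
    """Weighted mix of the two measures (the weight is hypothetical)."""
    return (weight * cosine_similarity(u, v)
            + (1.0 - weight) * distance_similarity(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, query):
    """Pool token vectors into one vector, weighting tokens by relevance.

    Each token is scored by its dot product with `query`; the softmax of
    those scores gives the attention weights used for the weighted sum.
    """
    scores = [sum(t * q for t, q in zip(tok, query)) for tok in token_vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * tok[i] for w, tok in zip(weights, token_vectors))
        for i in range(len(query))
    ]
    return pooled, weights
```

In a trained model the query vector and the mixing weight would be learned rather than fixed by hand; they are fixed here only to keep the sketch self-contained.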

A sizeable dataset of question pairs with tagged duplicates is used to evaluate and validate the proposed approach. The dataset includes both identical and paraphrased question pairs and spans a wide range of subjects. Extensive experiments test the effectiveness and robustness of the approach, and the results demonstrate state-of-the-art performance on benchmark datasets.

To address question duplication, this research introduces a precise and effective deep learning similarity-checking method. The method consists of two main parts: a multi-layered filter system that extracts high-level features from input questions, and a similarity measurement module. By capturing both syntactic structure and semantic content, the approach has the potential to improve the quality and usability of question-based platforms. Future research can examine the method's applicability in multilingual and multimodal environments, as well as its integration into practical systems.
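The two-part design described above can be sketched roughly as follows. The abstract does not specify the filter system, so this example assumes it resembles stacked one-dimensional convolutional filters with ReLU activations followed by max-over-time pooling. The names `conv1d_relu`, `extract_features`, and `is_duplicate`, the toy filter weights, and the 0.8 threshold are all hypothetical.

```python
import math


def conv1d_relu(sequence, filters):
    """Slide each filter over a sequence of vectors and apply ReLU.

    sequence: list of input vectors, one per token position.
    filters:  list of filters; each filter is a list of `width` weight
              vectors with the same dimension as the input vectors.
    Returns one feature vector per window, one feature per filter.
    """
    width = len(filters[0])
    outputs = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        features = []
        for filt in filters:
            activation = sum(
                w * x
                for weights, vector in zip(filt, window)
                for w, x in zip(weights, vector)
            )
            features.append(max(0.0, activation))
        outputs.append(features)
    return outputs


def max_over_time(sequence):
    """Keep the strongest response of each feature across all positions."""
    return [max(column) for column in zip(*sequence)]


def extract_features(sequence, layers):
    """Part 1: pass token vectors through the stacked filter layers."""
    for filters in layers:
        sequence = conv1d_relu(sequence, filters)
    return max_over_time(sequence)


def cosine(u, v):
    """Cosine similarity, returning 0.0 for empty or all-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_duplicate(tokens_a, tokens_b, layers, threshold=0.8):
    """Part 2: compare the extracted features of two questions."""
    score = cosine(extract_features(tokens_a, layers),
                   extract_features(tokens_b, layers))
    return score >= threshold, score


# Toy weights: two layers, each with two filters of width 2 over 2-d inputs.
LAYERS = [
    [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]],
    [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [-1.0, 0.0]]],
]

# Hypothetical token embeddings for one question.
question = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 1.0]]
flag, score = is_duplicate(question, question, LAYERS)
```

In practice the filter weights would be learned from labelled question pairs and the threshold tuned on held-out data; both are fixed by hand here for illustration.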

Keywords:

Syntactic structure, Question-based platforms, Multimodal contexts, Duplicated questions, Deep learning similarity checking, Cosine similarity.
