COMPARATIVE ANALYSIS OF PREDICTIVE MODELS ON ONLINE EDUCATION IN CONTEXT OF COVID-19 – A CASE STUDY
1 Stefan cel Mare National College of Suceava (ROMANIA)
2 Stefan cel Mare University of Suceava (ROMANIA)
About this paper:
Conference name: 15th International Technology, Education and Development Conference
Dates: 8-9 March, 2021
Location: Online Conference
Abstract:
In the context of the COVID-19 pandemic, Romania, like other countries around the world, moved from face-to-face education to online education. This shift generated many problems: physical (related to infrastructure), educational and emotional. Consequently, tension arose among the main categories involved in this process: students, teachers and parents.
To study their opinions about this new form of education, which is atypical for the pre-university system, we designed three questionnaires, one for each category. Together they comprised 66 questions about the advantages and disadvantages of online education, the interest of students and teachers in lessons, the platforms and applications used, the material problems identified, and the environment the respondents come from. In total, 1,085 students, 784 parents and 956 teachers from all over the country answered our questions.
Our current fields of doctoral research are data mining and applications of big data in the Romanian pre-university educational system. At this time, we are not aware of any significant research on these topics. As a first step, we set out to study applications of machine learning for discovering knowledge from data.
The main objective of this paper is to analyze the performance of algorithms that classify the collected answers according to the category to which the respondents belong.
Our analysis started from the observation that there are certain patterns in the opinions of students, parents and teachers. We wondered how well classification algorithms would be able to identify these patterns. Therefore, in this paper we carried out a comparative study of the performance of the following classification algorithms: SVM, AdaBoost, Neural Network and Naive Bayes. To compare performance we used a cross-validation technique.
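As an illustration only, the idea of k-fold cross-validation can be sketched in plain Python. This is not the Orange workflow used in the study: the fold count, the toy labels and the majority-class model (a stand-in for SVM, AdaBoost, Neural Network or Naive Bayes) are all assumptions made for the example, and the folds are contiguous rather than shuffled to keep the sketch short.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size

def majority_model(train_features, train_labels):
    """Hypothetical stand-in model: always predicts the most frequent label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common

def cross_val_accuracy(features, labels, make_model, k=5):
    """Mean classification accuracy over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = make_model([features[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        hits = sum(model(features[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

# Toy data invented for illustration; not the survey responses.
labels = ["student", "parent", "teacher", "student", "teacher"] * 4
features = [[0]] * len(labels)
mean_acc = cross_val_accuracy(features, labels, majority_model, k=5)
print(f"mean accuracy over 5 folds: {mean_acc:.3f}")
```

Any model factory with the same two-argument signature could be passed as `make_model`, which is what makes a like-for-like comparison of several classifiers possible.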
The data were pre-processed by filtering and by removing some features and incomplete responses. The result was a single file in Excel format, which was loaded into the workflow. For this purpose we used Orange, an open source tool for machine learning and data visualization, version 3.27.0. To estimate accuracy by cross-validation we used the Test and Score widget. To visualize the results we used the confusion matrix.
The data set was split into training data and test data in a proportion of 66%.
To display the model predictions for the data used, we included the Predictions widget in the workflow and connected it to the Data Table in order to view the results in a spreadsheet.
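A minimal sketch of such a split is given below, assuming that the 66% refers to the share of rows used for training (the abstract does not state this explicitly). The function name, the seed and the 100 placeholder rows are hypothetical.

```python
import random

def train_test_split(rows, train_fraction=0.66, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder row indices standing in for the survey responses.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))
```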
For each algorithm used, we calculated the following performance statistics: AUC, CA, F1, Precision and Recall.
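For readers unfamiliar with these statistics, the sketch below shows one common way to compute them for a single target class: CA is the share of correct predictions, Precision and Recall are computed from true and false positives and negatives, F1 is their harmonic mean, and AUC is the probability that a randomly chosen target-class example receives a higher score than a randomly chosen example from another class. The labels and scores are toy values invented for illustration, and the function names are hypothetical.

```python
def class_metrics(y_true, y_pred, target):
    """CA over all classes; Precision, Recall and F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    ca = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"CA": ca, "Precision": precision, "Recall": recall, "F1": f1}

def auc(y_true, scores, target):
    """One-vs-rest AUC via pairwise comparison of scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == target]
    neg = [s for t, s in zip(y_true, scores) if t != target]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values invented for illustration; not results from the study.
y_true = ["teacher", "teacher", "student", "parent", "student", "teacher"]
y_pred = ["teacher", "student", "student", "parent", "teacher", "teacher"]
teacher_scores = [0.9, 0.4, 0.2, 0.1, 0.6, 0.8]  # model score for "teacher"

metrics = class_metrics(y_true, y_pred, target="teacher")
metrics["AUC"] = auc(y_true, teacher_scores, target="teacher")
print(metrics)
```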
The best results were obtained with the Neural Network and Naive Bayes algorithms, which classify teachers with an accuracy between 0.993 and 0.997. The weakest scores were obtained with the SVM algorithm for the classification of students and parents, with values from 0.527 to 0.866.
The confusion matrix for the Neural Network algorithm reveals the correct classification of 68.2% of parents, 89.6% of students and 99% of teachers.
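Per-class percentages of this kind are typically read off the confusion matrix by dividing each diagonal entry by its row total. The sketch below illustrates that calculation on toy labels invented for the example; it does not reproduce the figures reported here, and the function names are hypothetical.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in classes]
            for actual in classes]

def correct_rate_per_class(matrix):
    """Diagonal entry divided by the row total, for each actual class."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Toy labels invented for illustration; not the survey data.
classes = ["parent", "student", "teacher"]
y_true = ["parent"] * 4 + ["student"] * 4 + ["teacher"] * 2
y_pred = ["parent", "parent", "student", "student",
          "student", "student", "student", "parent",
          "teacher", "teacher"]

matrix = confusion_matrix(y_true, y_pred, classes)
rates = correct_rate_per_class(matrix)
for name, rate in zip(classes, rates):
    print(f"{name}: {rate:.1%} correctly classified")
```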
We also noticed that including in the input data set the feature describing the environment the respondents come from (rural or urban) improved the results of all the classification algorithms used.
Keywords:
Data mining, predictive models, classification algorithms.