AN EARLY PREDICTION OF ACADEMIC SUCCESS USING MULTI-SOURCE DATA MINING AND MACHINE LEARNING ALGORITHMS FOR STUDENTS TAUGHT IN ONLINE AND HYBRID MODES AFTER COVID-19
Queen's University Belfast (UNITED KINGDOM)
About this paper:
Appears in: INTED2023 Proceedings
Publication year: 2023
Page: 7374 (abstract only)
ISBN: 978-84-09-49026-4
ISSN: 2340-1079
doi: 10.21125/inted.2023.2010
Conference name: 17th International Technology, Education and Development Conference
Dates: 6-8 March, 2023
Location: Valencia, Spain
Abstract:
This paper outlines a multi-source, data-driven solution to the problem of early prediction of academic success for students on a Software Development programme which is delivered in both online and hybrid teaching modes.

Academic failure and subsequent dropout among students enrolled on computer science and software development (programming-based) courses are significant problems, with some UK institutions reporting first-year dropout rates of 11%. A key strategy to combat academic failure and dropout is to provide timely and meaningful interventions, targeted directly at the students who need them most. Central to this is a requirement to quickly and accurately identify the students who require such interventions.

Many existing approaches either identify students too late to deploy any meaningful academic intervention or, if they do make an early prediction, rely on pre-matriculation data alone, which often does not provide a clear enough indicator of academic performance. Accordingly, such systems have a low chance of successfully predicting which students will struggle academically.

Our approach mines multiple sources of data simultaneously. We make use of a diverse range of inputs, including: pre-matriculation socio-demographic data variables (gender, age, ethnicity, education and disability); pre-course aptitude test scores; results from weekly summative assessments; interim results from formative assessments; attendance data from online lectures; and Learning Management System (LMS) Activity Data.

Another problem with existing approaches is that predictions can quickly become stale. The learning path of a student is rarely linear; rather, their academic ability will develop at key points during the lifetime of the course they are undertaking. Our approach avoids 'staleness' by using machine learning algorithms to frequently recalculate the prediction of likely academic success for each student. For example, at the outset of a course our system relies entirely on pre-matriculation data sources (including socio-demographic variables and aptitude test scores). However, by the end of the first week (and each week thereafter) the model is recalculated using freshly available online lecture attendance data, assessment results (both summative and formative) and LMS Activity Data.
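
The weekly refresh described above depends on assembling, for each student, only the data that exists at a given point in the course. The following is a minimal Python sketch of that step, not the authors' implementation: the `StudentRecord` layout, the field names and the choice of per-week means as features are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StudentRecord:
    # Hypothetical record layout; these field names are assumptions,
    # not the schema used in the paper.
    pre_matriculation: Dict[str, float]  # encoded socio-demographics, aptitude score
    weekly_attendance: List[float] = field(default_factory=list)  # fraction of online lectures attended
    weekly_assessment: List[float] = field(default_factory=list)  # assessment score per week, 0..1
    weekly_lms_events: List[float] = field(default_factory=list)  # LMS activity count per week


def _mean(values: List[float]) -> float:
    """Arithmetic mean, or 0.0 when no values have been recorded yet."""
    return sum(values) / len(values) if values else 0.0


def features_at_week(record: StudentRecord, week: int) -> Dict[str, float]:
    """Return the features available by the end of `week`.

    week == 0 is the course outset, where only pre-matriculation data exists.
    For week >= 1 the in-course sources are summarised over weeks 1..week.
    """
    if week < 0:
        raise ValueError("week must be non-negative")
    features = dict(record.pre_matriculation)  # copy so the record is not mutated
    if week > 0:
        features["mean_attendance"] = _mean(record.weekly_attendance[:week])
        features["mean_assessment"] = _mean(record.weekly_assessment[:week])
        features["mean_lms_events"] = _mean(record.weekly_lms_events[:week])
    return features


if __name__ == "__main__":
    student = StudentRecord(
        pre_matriculation={"aptitude": 0.5, "age": 19.0},
        weekly_attendance=[1.0, 0.5],
        weekly_assessment=[0.75, 0.25],
        weekly_lms_events=[30.0, 10.0],
    )
    for week in range(3):
        print(week, features_at_week(student, week))
```

Under these assumptions, a model would be refitted each week on `features_at_week(record, week)` for every student, so that a prediction never uses data from later weeks than the one it is made in.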

Our reliance on a suite of data, along with our ability to dynamically recalculate our prediction model, has two key benefits over other systems:
1. Students who were not flagged as 'at risk' of poor academic performance based on pre-matriculation data alone can be identified quickly once the course is running, as the model recalculates on a weekly cycle. Any 'at risk' students can then be offered an intervention in short order.
2. Students who were flagged as 'at risk' at the outset of the course (or early on) can be monitored on a weekly basis. Their most recent prediction of academic success can be compared to previous predictions. If no improvement is observed, then further interventions or pastoral care can be provided.
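
The two benefits listed above amount to a weekly triage over each student's history of predictions. The sketch below illustrates one way this could be expressed in Python. It is not taken from the paper: the 0.5 threshold, the status labels and the function names are hypothetical, and each prediction is assumed to be a probability of academic success.

```python
from typing import Dict, List

# Assumed cut-off on the predicted probability of success; the abstract
# does not state how 'at risk' is defined.
AT_RISK_THRESHOLD = 0.5


def review_student(history: List[float], threshold: float = AT_RISK_THRESHOLD) -> str:
    """Classify one student from weekly predictions, ordered oldest to newest."""
    if not history:
        raise ValueError("history must contain at least one prediction")
    latest = history[-1]
    if latest >= threshold:
        return "on_track"
    # First prediction, or the student was on track last week:
    # benefit 1, a newly identified student to be offered an intervention.
    if len(history) == 1 or history[-2] >= threshold:
        return "newly_at_risk"
    # Already at risk last week: benefit 2, compare with the previous prediction.
    if latest <= history[-2]:
        return "no_improvement"
    return "improving"


def weekly_review(predictions: Dict[str, List[float]]) -> Dict[str, List[str]]:
    """Group student IDs by review status for this week's cycle."""
    groups: Dict[str, List[str]] = {
        "on_track": [],
        "newly_at_risk": [],
        "no_improvement": [],
        "improving": [],
    }
    for student_id, history in predictions.items():
        groups[review_student(history)].append(student_id)
    return groups


if __name__ == "__main__":
    cohort = {
        "s1": [0.7, 0.8],   # on track throughout
        "s2": [0.7, 0.4],   # dropped below the threshold this week
        "s3": [0.3, 0.3],   # flagged earlier, no change
        "s4": [0.3, 0.4],   # flagged earlier, prediction rising
    }
    print(weekly_review(cohort))
```

In this sketch, students in the `no_improvement` group would be the candidates for further interventions or pastoral care, while `newly_at_risk` students would receive a first intervention.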

This paper describes the process of training a supervised learning model that can predict academic success and then provides a performance evaluation of a number of predictive machine learning methods. Finally, it provides a discussion and recommendations on feature selection, training sets and learning algorithms suitable for solving the problem of predicting academic success.
Keywords:
Academic performance, academic success, machine learning, intervention, learning approach, hybrid teaching.