ANOMALY DETECTION IN LEARNING ANALYTICS: FACILITATING DOMAIN EXPERTS' SENSEMAKING OF HETEROGENEOUS STUDENT DATA
1 East Tennessee State University (UNITED STATES)
2 University of North Carolina, Charlotte (UNITED STATES)
About this paper:
Appears in: EDULEARN23 Proceedings
Publication year: 2023
Pages: 8094-8103
ISBN: 978-84-09-52151-7
ISSN: 2340-1117
doi: 10.21125/edulearn.2023.2095
Conference name: 15th International Conference on Education and New Learning Technologies
Dates: 3-5 July, 2023
Location: Palma, Spain
Abstract:
Learning Analytics (LA) is an emerging field that provides powerful tools for improving student learning and academic success. With the increasing amount of data generated by educational institutions, LA enables the use of advanced analytics to inform and support decision-making processes that can positively impact students' learning outcomes. This paper discusses the importance of using LA to support student learning and to detect those who might be at risk of not graduating on time. By analyzing student data, educational institutions can gain insights into the factors that affect student performance and use this information to identify students who may require additional support. Moreover, LA can help institutions to implement proactive interventions that can improve student engagement, motivation, and academic achievement, thereby increasing the likelihood of on-time graduation.

This paper proposes two unsupervised machine learning models for finding unexpected patterns in student data. These patterns are presented to domain experts, including academic advisors and administrators, in a way that is both accessible and useful. The models aim to detect anomalies in student data and cluster them based on their behavior. The two models, Point Anomaly Detection (PAD) and Collective Anomaly Detection (CAD), are developed to identify individual and collective anomalies, respectively. PAD aims to detect whether an individual student's data instance can be considered anomalous when compared to the rest of the data, for example, when a student's GPA decreases significantly from one semester to the next relative to other students. CAD aims to detect whether a collection of a student's data instances (rather than individual values) can be considered anomalous when compared to other students, for instance, when a student follows an atypical pattern in the number of credits passed each semester.
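The distinction between point and collective anomalies can be illustrated with a minimal sketch. The abstract does not specify the algorithms behind PAD and CAD, so everything below is an assumption made for illustration: the function names, the median/MAD-based scoring, the 3.5 threshold, and the toy data are hypothetical and are not the authors' method or dataset.

```python
from statistics import median


def robust_z_scores(values):
    """Modified z-scores based on the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be scored as unusual.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def point_anomalies(gpa_change, threshold=3.5):
    """PAD-style check (assumed): flag students whose single GPA change
    is extreme relative to the rest of the cohort."""
    ids = list(gpa_change)
    scores = robust_z_scores([gpa_change[i] for i in ids])
    return [i for i, s in zip(ids, scores) if abs(s) > threshold]


def collective_anomalies(credits, threshold=3.5):
    """CAD-style check (assumed): flag students whose whole sequence of
    credits passed per semester is far from the cohort's typical sequence."""
    ids = list(credits)
    n_terms = len(next(iter(credits.values())))
    # Typical trajectory: per-semester median across all students.
    typical = [median(credits[i][t] for i in ids) for t in range(n_terms)]
    # Euclidean distance of each student's full sequence from that trajectory.
    dist = [
        sum((c - m) ** 2 for c, m in zip(credits[i], typical)) ** 0.5
        for i in ids
    ]
    scores = robust_z_scores(dist)
    return [i for i, s in zip(ids, scores) if s > threshold]


# Toy data, invented for this sketch (not from the paper's dataset).
gpa_change = {"s1": -0.1, "s2": 0.0, "s3": 0.1, "s4": -0.2, "s5": 0.2, "s6": -1.8}
credits = {
    "s1": [15, 15, 15, 15],
    "s2": [15, 12, 15, 15],
    "s3": [12, 15, 15, 12],
    "s4": [15, 15, 12, 15],
    "s5": [12, 12, 15, 15],
    "s6": [15, 3, 0, 18],
}

print(point_anomalies(gpa_change))
print(collective_anomalies(credits))
```

In this sketch the point check scores one value per student, while the collective check scores each student's entire semester sequence as a single unit, which mirrors the individual versus collective distinction described above.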

The proposed models have the potential to provide valuable insights into student learning and identify potential risk factors that can affect students' academic performance. The models are designed to be flexible and adaptable to different types of data and learning contexts. They can be used by advisors and administrators to identify students who may require additional support and resources, and to improve the overall quality of education.

The proposed models are tested on a dataset consisting of 6203 students who enrolled over ten and a half years at a higher education institution, and the results show that the models can effectively detect anomalies in student data. This study highlights the potential of the proposed anomaly detection models and their impact on domain experts' sensemaking of student data. Through four ethnographic studies, including focus group discussions, in-depth interviews, and diary studies using in-situ and snippet techniques, we investigated the models' ability to improve and facilitate advisors' understanding of students' success or risk. By presenting abnormal patterns and behaviors, the proposed models offer a comprehensive and nuanced view of student data, enabling domain experts to identify potential issues and take timely action. Overall, this research contributes to the development of more effective learning analytics tools that can support student success and improve retention rates.
Keywords:
Unsupervised Machine Learning, Anomaly Detection, Learning Analytics, Point Anomaly Detection, Collective Anomaly Detection, Sensemaking.