MINING ENROLMENT DATA FOR EARLY IDENTIFICATION OF AT-RISK STUDENTS
This study uses predictive modeling, a data mining technique, to help identify at-risk students in their first semester at SIM University (UniSIM), which caters mainly to part-time adult learners. At-risk students are defined as those with a cumulative grade-point average (CGPA) of 2.3 or below, out of a maximum attainable value of 5.0. The objective of identifying these students early is to enable intervention as soon as they are enrolled, possibly before they even begin their first semester of study.
Eleven variables from the enrolment database were considered as candidate predictors for the model. They can be divided broadly into demographic variables (e.g. age, gender, years of working experience), pre-UniSIM variables (e.g. polytechnic graduated from, polytechnic CGPA, ‘O’ level Math and English grades) and UniSIM variables (e.g. School/Discipline enrolled in at UniSIM). Various algorithms, including neural networks and decision trees, were used to classify the at-risk students. The performance of the resulting models was compared using sensitivity, specificity and accuracy, and the chosen model is a decision tree, which was also able to inform policy. The chosen decision tree model identified the following factors: the polytechnic that the student graduated from, polytechnic CGPA, ‘O’ level Math and English grades, the School that the student is enrolled in and the number of years since the student graduated from polytechnic. The implications of these results for early identification and intervention are discussed.
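As an illustration of the evaluation criteria named above, the following minimal Python sketch labels students as at-risk using the CGPA threshold of 2.3 and computes sensitivity, specificity and accuracy for a set of predictions. It is not the study's implementation: the function names, the sample CGPAs and the sample predictions are hypothetical and chosen only to show the arithmetic.

```python
AT_RISK_CGPA = 2.3  # threshold stated in the abstract (maximum CGPA is 5.0)


def is_at_risk(cgpa):
    """Return True if a first-semester CGPA meets the at-risk definition."""
    return cgpa <= AT_RISK_CGPA


def evaluate(actual, predicted):
    """Compute sensitivity, specificity and accuracy.

    `actual` and `predicted` are equal-length lists of booleans,
    where True means "at-risk". Assumes both classes occur in `actual`.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # at-risk, flagged
    fn = sum(1 for a, p in pairs if a and not p)      # at-risk, missed
    fp = sum(1 for a, p in pairs if not a and p)      # not at-risk, flagged
    tn = sum(1 for a, p in pairs if not a and not p)  # not at-risk, not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in `actual`")
    return {
        "sensitivity": tp / (tp + fn),  # share of at-risk students caught
        "specificity": tn / (tn + fp),  # share of other students left unflagged
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical data, for illustration only (not from the study).
sample_cgpas = [1.8, 2.3, 2.4, 3.5, 4.1, 2.0]
sample_predictions = [True, False, False, False, False, True]

actual_labels = [is_at_risk(c) for c in sample_cgpas]
metrics = evaluate(actual_labels, sample_predictions)
print(metrics)
```

For early intervention, sensitivity is arguably the metric to watch most closely, since a missed at-risk student receives no support, whereas a false alarm costs only an unnecessary check-in.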