IDENTIFICATION OF VARIABLES ASSOCIATED WITH ACADEMIC SUCCESS OF HIGHER EDUCATION STUDENTS APPLYING DATA MINING
1 Universidad Autonoma de Sinaloa (MEXICO)
2 Universidad de Granma (CUBA)
About this paper:
Appears in: EDULEARN11 Proceedings
Publication year: 2011
Conference name: 3rd International Conference on Education and New Learning Technologies
Dates: 4-6 July, 2011
Location: Barcelona, Spain
Abstract: In this research we propose using data collected from academic path studies, analyzed mathematically with software support, to find useful information that enables a more effective selection of candidates for college admission, thereby reducing desertion and increasing the terminal efficiency rate. Nationwide in Mexico, these rates are very low. According to the Mexican Ministry of Public Education (SEP, Secretaría de Educación Pública), in the 2008-2009 academic cycle only 52.7% of higher education students finished their studies and 32.3% obtained their degree. This means that over 47% of students dropped out at some point in their studies.
One purpose of this study is to improve the selection of students entering the different programs offered by higher education in Mexico. The main objective of our investigation is to find the factors that influence the academic success of students and graduates throughout their academic life in college.
In this research we apply a data mining technique, decision trees, specifically the J48 algorithm as implemented in the WEKA (Waikato Environment for Knowledge Analysis) software, to examine the academic path records of a group of higher education graduates. We start with a sample of computer science graduates from the Computer Science Faculty of Mazatlan (University of Sinaloa) and relate their overall performance to qualitative and quantitative factors such as age, gender, high school average, parents’ education, family income, marital status and former high school, among others, discarding those that, according to the mathematical analysis, do not affect student performance. The resulting decision tree model, obtained with WEKA, correctly classified 74.57% of the instances in the training set and 72.22% of those in the test set. The most important variables for explaining academic success include the student's gender, high school average and high school location.
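J48 is WEKA's implementation of the C4.5 decision tree learner, which chooses the attribute to split on by gain ratio. As a rough illustration of that criterion only, the following Python sketch ranks three categorical attributes on a tiny invented dataset. The records, the attribute names (`gender`, `hs_average`, `hs_location`) and the `success` label are hypothetical and do not come from the study's data, and the sketch omits the pruning and numeric-attribute handling that J48 performs.

```python
from collections import Counter
from math import log2

# Hypothetical toy records; NOT the study's data.
RECORDS = [
    {"gender": "F", "hs_average": "high", "hs_location": "urban", "success": "yes"},
    {"gender": "F", "hs_average": "high", "hs_location": "rural", "success": "yes"},
    {"gender": "M", "hs_average": "high", "hs_location": "urban", "success": "yes"},
    {"gender": "M", "hs_average": "low", "hs_location": "urban", "success": "no"},
    {"gender": "F", "hs_average": "low", "hs_location": "rural", "success": "no"},
    {"gender": "M", "hs_average": "low", "hs_location": "rural", "success": "no"},
    {"gender": "F", "hs_average": "high", "hs_location": "urban", "success": "yes"},
    {"gender": "M", "hs_average": "low", "hs_location": "urban", "success": "no"},
]


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` divided by its split information."""
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])

    total = len(rows)
    # Weighted entropy of the target after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder

    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


attributes = ["gender", "hs_average", "hs_location"]
ranking = sorted(
    attributes,
    key=lambda a: gain_ratio(RECORDS, a, "success"),
    reverse=True,
)

for name in ranking:
    print(f"{name}: {gain_ratio(RECORDS, name, 'success'):.3f}")
```

The attribute printed first is the one a C4.5-style learner would place at the root of the tree for this toy sample; the same computation is then repeated on each resulting subset.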
We hope this research will serve as a guide for selecting students for computer science programs and as a rational tool for predicting academic performance.
Keywords: Data Mining, Decision tree, Math Model, Higher Education, Academic Success, Academic Paths, Computer Science.