A PROPOSAL FOR PREDICTING ACADEMIC ACHIEVEMENT THROUGH A VECTORIAL MODEL OF LEXICAL AVAILABILITY
University of Concepción (CHILE)
About this paper:
Appears in: ICERI2016 Proceedings
Publication year: 2016
Pages: 2799-2806
ISBN: 978-84-617-5895-1
ISSN: 2340-1095
doi: 10.21125/iceri.2016.1603
Conference name: 9th annual International Conference of Education, Research and Innovation
Dates: 14-16 November, 2016
Location: Seville, Spain
Abstract:
Lexical availability is the set of words that a group of people tends to use most readily in a particular communicative situation. Early studies of lexical availability were carried out by UNESCO in the 1950s to facilitate the learning of French by non-native inhabitants of France. One of the most common procedures for measuring lexical availability is to elicit people's mental lexicon by asking each individual to write, in order, the words that first come to mind when given only a particular topic, called a center of interest. Numerous studies have analyzed lexical availability and its relationship with socio-cultural characteristics in areas related to education. Where quantitative analysis is carried out, it is done primarily by summarizing the lexicons obtained from each group with numerical indexes, such as the average number of words per lexicon, the number of different words, the cohesion index (a measure of the homogeneity of the answers) and the index of lexical availability. The latter indicates how available a word is with respect to a center of interest.
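
As a rough illustration of how an availability index of this kind can be computed, the sketch below weights each word by the position at which a participant wrote it and averages over participants. The abstract does not state the formula used, so the geometric weighting, the `decay` parameter and the toy lexicons are assumptions made here for illustration only.

```python
from collections import defaultdict

def availability_index(lexicons, decay=0.9):
    """Position-weighted availability of each word for one center of interest.

    lexicons: list of word lists, one per participant, in the order written.
    decay: hypothetical weight in (0, 1]; a word at 0-based position i
           contributes decay**i, so earlier words count more.
    """
    totals = defaultdict(float)
    for words in lexicons:
        seen = set()
        for position, word in enumerate(words):
            if word in seen:  # count each word once per participant
                continue
            seen.add(word)
            totals[word] += decay ** position
    n = len(lexicons)
    return {word: total / n for word, total in totals.items()}

# Toy data invented for illustration: three participants, center "Geometry".
lexicons = [
    ["triangle", "circle", "angle"],
    ["circle", "triangle"],
    ["triangle", "line", "circle"],
]
index = availability_index(lexicons)
ranked = sorted(index, key=index.get, reverse=True)
```

With this weighting, a word that every participant writes first scores 1, and words mentioned late or by few participants score close to 0.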

The aim of this work is to propose a vector representation model of the mental lexicons obtained in lexical availability studies. The model enables the use of pattern recognition techniques to predict the academic performance of teacher-training students from two Chilean universities in topics related to mathematics teaching.

The study group comprises students of the teaching programs in Mathematics and IT at the University of Concepción and students of the Mathematics Teaching Program at the University of the Bío Bío, in both cases from first to fifth year. The selected centers of interest are the five thematic groupings determined by the Ministry of Education of Chile for teaching mathematics in secondary education: Probability, Number Systems, Calculus, Algebraic Structures and Geometry.

The proposed model represents word frequencies as a matrix, applying weighting functions based on the index of lexical availability, and uses Principal Component Analysis for dimensionality reduction and visualization of the results. Prediction is performed with a Naive Bayes classifier which, given a student's mental lexicon, predicts whether the student is high performing (high grade) or low performing (low grade).
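
The following minimal sketch shows one way such a pipeline could look: each ordered lexicon becomes a weighted vector over a shared vocabulary, and a Naive Bayes classifier is trained on those vectors. The abstract does not specify the weighting function or the Naive Bayes variant, so the geometric position weight (`decay`), the multinomial variant with Laplace smoothing (`alpha`), and the toy lexicons and labels are all assumptions; the dimensionality reduction step is omitted here for brevity.

```python
import math

def weighted_vector(words, vocabulary, decay=0.9):
    """Map one ordered lexicon to a vector over the vocabulary.

    Component j is decay**position of vocabulary[j] in the list (first
    occurrence), or 0.0 if the word is absent. Unknown words are ignored.
    """
    weights = {}
    for position, word in enumerate(words):
        weights.setdefault(word, decay ** position)
    return [weights.get(term, 0.0) for term in vocabulary]

def train_naive_bayes(vectors, labels, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing on weighted features."""
    classes = sorted(set(labels))
    n_features = len(vectors[0])
    model = {}
    for c in classes:
        rows = [v for v, y in zip(vectors, labels) if y == c]
        sums = [sum(r[j] for r in rows) for j in range(n_features)]
        total = sum(sums) + alpha * n_features
        model[c] = {
            "log_prior": math.log(len(rows) / len(vectors)),
            "log_likelihood": [math.log((s + alpha) / total) for s in sums],
        }
    return model

def predict(model, vector):
    """Return the class with the highest posterior log-score."""
    def score(c):
        params = model[c]
        return params["log_prior"] + sum(
            x * ll for x, ll in zip(vector, params["log_likelihood"])
        )
    return max(model, key=score)

# Toy lexicons and performance labels, invented for illustration only.
train = [
    (["derivative", "limit", "integral"], "high"),
    (["limit", "derivative", "series"], "high"),
    (["number", "sum", "limit"], "low"),
    (["sum", "number", "graph"], "low"),
]
vocabulary = sorted({w for words, _ in train for w in words})
X = [weighted_vector(words, vocabulary) for words, _ in train]
y = [label for _, label in train]
model = train_naive_bayes(X, y)
prediction = predict(model, weighted_vector(["derivative", "integral"], vocabulary))
```

A Gaussian variant would be an equally plausible choice for continuous weights; the multinomial form is used here only because it stays short and handles fractional weights directly.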

The results make it possible to visualize the spatial distribution of students' mental lexicons for a center of interest, showing a noticeable difference in lexical availability according to the students' number of years at university, the university to which they belong and their grades. In addition, the proposed model predicted student performance with an accuracy above 68%.
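
To illustrate how lexicon vectors can be placed in a two-dimensional space for visualization, the sketch below projects a small matrix onto its first two principal components. It is not the authors' implementation: the power-iteration routine, the function names and the four toy vectors are assumptions chosen to keep the example dependency-free.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(rows):
    """Subtract the column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(matrix, iterations=200):
    """Dominant eigenvector and eigenvalue by power iteration."""
    d = len(matrix)
    v = [1.0 / (j + 1) for j in range(d)]  # fixed, non-uniform start vector
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # matrix has no remaining variance
            break
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return v, eigenvalue

def pca_2d(rows):
    """Project rows onto the first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    first, var1 = top_component(cov)
    # Deflation: remove the first component, then find the next one.
    deflated = [[cov[i][j] - var1 * first[i] * first[j] for j in range(d)]
                for i in range(d)]
    second, var2 = top_component(deflated)
    coords = [(dot(r, first), dot(r, second)) for r in centered]
    return coords, (first, second), (var1, var2)

# Toy weighted lexicon vectors (invented): rows 0-1 and rows 2-3 form two groups.
rows = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.9, 1.0],
    [0.0, 1.0, 0.9],
]
coords, axes, variances = pca_2d(rows)
```

Plotting `coords` with one marker per group (for example by year, university or grade band) gives the kind of scatter view described above; the sign of each axis is arbitrary.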

Finally, we conclude that a vector representation model can serve as a complementary analysis tool for studies of lexical availability in education. It allows quantitative analysis with pattern recognition techniques to visualize and predict students' academic performance, using their mental lexicon as a predictor. This prediction can support decision-making and the design of remedial measures in education.
Keywords:
Academic Performance, PCA, Vector Model, Naive Bayes, Lexical Availability.