IMPROVING THE SELECTION PROCESS OF STUDENTS IN HIGHER EDUCATION BASED ON DATA WAREHOUSE AND DATA MINING TECHNIQUES

M. Qbadou, M. Hajji, A. Samadi, K. Mansouri

Hassan II University (MOROCCO)
When selecting candidates for engineering schools, officials face the problem of heterogeneity in students' academic scores. This heterogeneity stems from many subjective parameters that often have little bearing on the efficiency of the process. These parameters include the type of institution (Faculty of Science, Faculty of Science and Technology, preparatory classes, higher technology schools, etc.), the type of degree (DEUG, DEUP, DEUST, DTS, Vocational, etc.), the geographical origin of the institution, and many others. To remedy this problem, officials apply approximate weightings to harmonize marks. Even when the selection criteria are respected, significant discrepancies remain among the selected candidates between their academic achievement and their entrance exam scores. Indeed, the weighting coefficients are estimated from a manual and partial analysis, which cannot yield a general analytical model capable of processing large amounts of data over several years and for all candidates.

Our contribution is a digital solution based on Data Warehousing and Data Mining techniques that automates the analysis of the data collected on candidates. The goal is to extract relevant key performance indicators (KPIs) whose analysis by a general inductive method helps uncover the relationships between the various parameters related to the candidates' scores. These relationships are then programmed in order to determine the weighting coefficients that best minimize the deviations.
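
As an illustration of the weighting step described above, the following minimal sketch estimates one coefficient per institution type by least squares, so that the weighted academic score deviates as little as possible from the entrance exam score. The abstract does not specify the actual model; the linear form, the function name `fit_group_weights`, the group labels, and the sample scores are all hypothetical assumptions made for this example.

```python
from collections import defaultdict


def fit_group_weights(records):
    """Estimate one weighting coefficient per group.

    records: iterable of (group, academic_score, exam_score).
    For each group g, the coefficient w_g minimizes
    sum((w_g * academic - exam) ** 2), whose closed-form solution is
    w_g = sum(academic * exam) / sum(academic ** 2).
    This linear model is an assumption, not the paper's stated method.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for group, academic, exam in records:
        numerator[group] += academic * exam
        denominator[group] += academic * academic
    # Skip groups whose academic scores are all zero (undefined weight).
    return {g: numerator[g] / denominator[g]
            for g in numerator if denominator[g] > 0}


# Hypothetical sample data: (institution type, academic score, exam score).
records = [
    ("faculty_of_science", 14.0, 10.5),
    ("faculty_of_science", 12.0, 9.0),
    ("preparatory_classes", 11.0, 11.0),
    ("preparatory_classes", 13.0, 13.0),
]

weights = fit_group_weights(records)
for group, weight in sorted(weights.items()):
    print(f"{group}: {weight:.2f}")
```

In this toy data set the academic scores of the first group systematically overstate the exam scores, so its coefficient comes out below 1, while the second group needs no correction.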

The analysis of the indicators and the calculation of the weighting coefficients go through five main stages: (1) identification of the business requirements related to the selection process and to the synthesis of the entrance exam results; (2) acquisition of data on the candidates, the selection process, and the entrance exam; (3) design and implementation of a data warehouse for storing data and metadata; (4) design and scheduling of the ETL process that loads data into the warehouse; (5) presentation of the results as dashboards based on OLAP technology, and calculation of the weighting coefficients using Data Mining techniques.
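
Stages (3) to (5) above can be sketched on a very small scale with an in-memory SQLite database: a dimension table and a fact table stand in for the warehouse, a loading function stands in for the ETL process, and a grouped query plays the role of an OLAP-style indicator. The schema, table and column names, and sample rows are hypothetical and chosen only for this example; the abstract does not describe the actual warehouse design.

```python
import sqlite3

# Stage 3 (sketch): a minimal star schema with one dimension and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_institution (
    institution_id   INTEGER PRIMARY KEY,
    institution_type TEXT NOT NULL UNIQUE
);
CREATE TABLE fact_candidate_score (
    candidate_id   INTEGER PRIMARY KEY,
    institution_id INTEGER NOT NULL REFERENCES dim_institution(institution_id),
    year           INTEGER NOT NULL,
    academic_score REAL NOT NULL,
    exam_score     REAL NOT NULL
);
""")

# Hypothetical extracted rows:
# (candidate id, institution type, year, academic score, exam score).
raw_rows = [
    (1, "faculty_of_science", 2015, 14.0, 10.0),
    (2, "faculty_of_science", 2015, 12.0, 9.0),
    (3, "preparatory_classes", 2015, 11.0, 12.0),
]


def load(conn, rows):
    """Stage 4 (sketch): resolve the dimension key, then insert the fact row."""
    for candidate_id, inst_type, year, academic, exam in rows:
        conn.execute(
            "INSERT OR IGNORE INTO dim_institution (institution_type) VALUES (?)",
            (inst_type,),
        )
        (inst_id,) = conn.execute(
            "SELECT institution_id FROM dim_institution WHERE institution_type = ?",
            (inst_type,),
        ).fetchone()
        conn.execute(
            "INSERT INTO fact_candidate_score VALUES (?, ?, ?, ?, ?)",
            (candidate_id, inst_id, year, academic, exam),
        )
    conn.commit()


load(conn, raw_rows)

# Stage 5 (sketch): an indicator aggregated along the institution dimension,
# here the mean gap between academic score and entrance exam score.
kpi = conn.execute("""
    SELECT d.institution_type,
           COUNT(*) AS candidates,
           AVG(f.academic_score - f.exam_score) AS mean_deviation
    FROM fact_candidate_score AS f
    JOIN dim_institution AS d USING (institution_id)
    GROUP BY d.institution_type
    ORDER BY d.institution_type
""").fetchall()

for institution_type, candidates, mean_deviation in kpi:
    print(institution_type, candidates, mean_deviation)
```

A production warehouse would add further dimensions (degree type, geographical origin, year) and a scheduled loading job; the sketch keeps a single dimension so that the link between schema, load, and indicator stays visible.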

The resulting model significantly reduced the deviations from the entrance exam results by predicting the characteristics of the candidates most likely to pass. These results encourage us to extend the model to the design of the entrance exam itself, in order to ensure better selectivity.