COMPLEX DATA ANALYSIS IN PREDICTING STUDENT SUCCESS DURING IN-HOUSE VOCATIONAL TRAINING OF TECHNICAL PERSONNEL
1 Technical School of St. Petersburg Metro (RUSSIAN FEDERATION)
2 ITMO University (RUSSIAN FEDERATION)
3 The Herzen State Pedagogical University of Russia (RUSSIAN FEDERATION)
About this paper:
Appears in: INTED2021 Proceedings
Publication year: 2021
Pages: 6525-6534
ISBN: 978-84-09-27666-0
ISSN: 2340-1079
doi: 10.21125/inted.2021.1304
Conference name: 15th International Technology, Education and Development Conference
Dates: 8-9 March, 2021
Location: Online Conference
Abstract:
Introduction:
Public transport operators, especially subway operators, are in great demand in large cities all over the world. This specialty does not require a university education and does not offer strong career prospects, and is therefore not considered prestigious. The authorities are forced to offer financial incentives to maintain a constant influx of students, which can attract "random" people who are neither focused on productive learning nor capable of it. Identifying such people at an early stage of training, and helping students assess their own professional prospects, is an important problem, both financially and socially.

Background and problem statement:
As a rule, vocational training of subway operators is organized as short-term (up to 6 months) on-the-job training that consists of group classes followed by practical training with a personal instructor directly at the workplace. In addition, candidates are tested for professional aptitude with an entrance psychological test. The student groups are heterogeneous in terms of previous education and professional experience, but rather small (up to 10 people), which allows the teacher leading the group classes, and even more so the personal instructor, to form a fairly detailed picture of the psychological characteristics of each student.

Thus, several sources of information are available for identifying the educational and professional prospects of each student: the student's background information, the results of the entrance professional aptitude test, academic performance in the form of current points and grade point average, and the expert opinions of the teacher of theoretical disciplines and of the personal instructor. These sources are currently not coordinated, contradict each other in many respects, and are not used in combination; the information is of different types and contains both objectively measurable and subjectively determined components, and the cohorts of students under consideration are small and statistically non-uniform. All this makes it difficult to apply statistical or machine learning methods directly.

The objective of the article is to build a comprehensive model for identifying and evaluating the parameters that are most significant for predicting student success, through intelligent analysis of data on the vocational training of public transit operators. The work uses the specialty "Metro drivers" in the city of St. Petersburg, Russia, as an example.

Research Methods:
To assess objectively measurable data, records of students in this specialty for the period 2015-2019 (more than 1200 entries, 22 parameters) were collected and preprocessed. To assess subjectively measurable data, the NEO-PI-R personality questionnaire was used (5 parameters, 22 experts). Dimensionality reduction methods were used to reduce the number of parameters, and logistic regression methods were used to rank the significance of the parameters.
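
As an illustration only, the ranking step might look like the sketch below. The abstract does not say how significance is derived from the logistic regression, so ranking by the absolute value of coefficients fitted on standardized features is an assumption here. The feature names, the synthetic data, and the training settings are all hypothetical, and the dimensionality reduction step is omitted.

```python
import math
import random

def standardize(rows):
    """Scale each column to zero mean and unit variance so coefficients are comparable."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, lr=0.5, epochs=500, l2=0.01):
    """Fit L2-regularized logistic regression by batch gradient descent."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rank_features(names, weights):
    """Order features by absolute coefficient size (assumed significance measure)."""
    return sorted(zip(names, weights), key=lambda p: abs(p[1]), reverse=True)

# Synthetic stand-in data; NOT the study's records. Names are hypothetical.
random.seed(0)
names = ["aptitude_score", "gpa", "noise"]
X, y = [], []
for _ in range(300):
    row = [random.gauss(0, 1) for _ in names]
    logit = 0.8 * row[0] + 2.5 * row[1]  # "noise" has no true effect
    X.append(row)
    y.append(1 if random.random() < sigmoid(logit) else 0)

weights, bias = fit_logistic(standardize(X), y)
ranked = rank_features(names, weights)
for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

Standardizing first matters for this kind of ranking: without it, coefficient magnitudes reflect the units of each parameter rather than its influence on the outcome.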

Results and discussion:
The resulting model revealed the most significant of the objectively and subjectively determined parameters for assessing student success. As a preliminary result, it allows us, in particular, to predict current student dropout in the learning process with F1 = 0.82, which is a rather good level of prediction quality.
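
For reference, F1 is the harmonic mean of precision and recall for the positive class. The sketch below shows how such a score is computed, assuming dropout is encoded as the positive label 1. The labels are made up for illustration and are unrelated to the reported F1 = 0.82.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return F1 for the positive class: 2 * P * R / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative labels only (1 = dropped out, 0 = completed training).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

With small and heterogeneous cohorts, F1 is a more informative summary than plain accuracy when dropouts are a minority of the students.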
Keywords:
Vocational training, predictors of student success, metro drivers, methods of dimensionality reduction.