DATA SCIENCE EXPERIMENTAL LAB DEVELOPMENT

L. Gotsev1, K. Rasheva-Yordanova1, S. Siarova1, A. Peshev2, I. Shopova1, B. Jekov1, E. Kovatcheva1, G. Dimitrov1, R. Nikolov1, E. Shoikova1

1University of Library Studies and Information Technologies (BULGARIA)
2Silver Star Retail - Mercedes-Benz (BULGARIA)
Big Data and AI-enabled systems are helping to accelerate digital transformation, where they are often deployed hand in hand to deliver key insights through advanced data analytics. These technologies are evolving rapidly and are being used in a growing number of industries, from finance and healthcare to smart manufacturing, intelligent transport systems and self-driving vehicles. As education systems undergo significant change brought about by emerging reforms in pedagogy and technology, our efforts have sought to close the gap between technology as an educational additive and its effective integration as a means to promote and cultivate student-centred, experimental, research-based learning. Although there is evidence that digital technologies and Big Data-driven experimental research are becoming an essential part of IT training and education, they are still regarded as optional in many cases. Better articulation of how digital technologies can support improved learning outcomes is required.

This paper discusses a vision and steps toward the development of a research environment within the framework of the Center of Excellence in Informatics and ICT. Our university implements scientific research and educational programs to facilitate capacity building, particularly in Data Science. The aim of the new experimental lab is to provide learners and educators with access to world-class experimentation facilities and high-quality learning materials. This is expected to break the boundaries of traditional learning and to enable detection of the learner’s context and level of knowledge and skills, leading to significantly better learning experiences.

The purpose of this paper is to outline a foundational framework for Data Science experimental lab development and to provide models of innovative learning scenarios in a Data Science Master's degree program. The introduction analyses emerging types of jobs related to Big Data, which require future personnel to be well equipped to meet business needs. Section 2 gives an overview of Big Data concepts and their meanings, as well as the differences between Big Data and Data Science, based mainly on the NIST Big Data Interoperability Framework. Section 3 looks at a Data Science process along with its implications for the Big Data Scientist competence profile. Section 4 presents the results of a systematic review of the most widely used Data Science platforms and open source machine learning software. The methodology for comparative analysis and selection of the infrastructure components (Hadoop, WEKA, RapidMiner, Orange, and KNIME) is presented with relevant discussion. As an illustration of experiential learning in Data Science, Section 5 presents case studies of two learning scenarios entitled “Data Science process” and “Comparative analysis of open source Big Data tools”. The defined business and learning objectives focus on a deeper understanding of a fast, efficient, reusable and scalable approach to applying Data Science and machine learning to a problem statement of interest, ranging from experiments to production. The present study demonstrates the potential of our approach for building remote labs and delivering them to students for improved learning outcomes.