ANALYZING SURVEY DATA - USING CLASSIFICATION RESULTS FOR BETTER DATA UNDERSTANDING
Worcester State University (UNITED STATES)
About this paper:
Appears in: ICERI2023 Proceedings
Publication year: 2023
Pages: 9145-9148
ISBN: 978-84-09-55942-8
ISSN: 2340-1095
doi: 10.21125/iceri.2023.2346
Conference name: 16th annual International Conference of Education, Research and Innovation
Dates: 13-15 November, 2023
Location: Seville, Spain
Abstract:
This project was carried out under the Worcester State University STEM Summer research program. Its goal was to analyze youth risk behavior survey data. The dataset contains the results of high school surveys conducted nationwide in the USA, across multiple states and districts, from 1991 to 2019. Because the initial dataset is very large, we focused on the New England states. The dataset was cleaned and preprocessed to prepare it for further analysis.
The major goal of this project was to study the risk behavior of high school students in New England and to analyze its patterns. The dataset was analyzed using a variety of visualization techniques, statistical analysis methods, and machine learning methods. We studied whether risk behavior patterns differ between race groups and between age groups, and whether different risk behavior patterns are correlated. To address most of these questions we used a variety of classification methods with different testing options. We show that even classification models with low accuracy can be used to discover interesting patterns and to answer the questions we asked. The project illustrates how survey data analysis problems can be used for undergraduate research in interdisciplinary fields such as health, sociology, criminal justice, and many others, involving both students from the fields the surveys cover and CS students. Problems like this can be scaled down and used in a variety of data analysis courses at different levels, where students learn by experimenting with data visualization techniques, statistical analysis methods, and machine learning approaches. Individual steps of a project like this one could be included as coursework in data science courses to improve students' critical thinking and problem-solving skills and to give them their first research experience.
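The idea that a low-accuracy classifier can still expose a pattern, and that the testing option affects the reported accuracy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are synthetic (not the survey data), the grade-level probabilities are invented, the one-rule classifier is a stand-in for whichever classification methods were actually used, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_synthetic_survey(n, seed=0):
    """Generate made-up (grade, reports_behavior) records. Not real survey data."""
    rng = random.Random(seed)
    # Hypothetical probabilities, chosen only so that the groups differ slightly.
    p_by_grade = {"9": 0.40, "10": 0.45, "11": 0.55, "12": 0.60}
    grades = list(p_by_grade)
    rows = []
    for _ in range(n):
        grade = rng.choice(grades)
        label = rng.random() < p_by_grade[grade]
        rows.append((grade, label))
    return rows


def fit_one_rule(rows):
    """Learn the majority label for each category of the single feature."""
    counts = defaultdict(Counter)
    for feature, label in rows:
        counts[feature][label] += 1
    rule = {f: c.most_common(1)[0][0] for f, c in counts.items()}
    # Fallback label for categories never seen during training.
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    return rule, default


def accuracy(rule, default, rows):
    """Fraction of rows whose label matches the rule's prediction."""
    hits = sum(rule.get(f, default) == y for f, y in rows)
    return hits / len(rows)


def holdout_accuracy(rows, test_fraction=0.3):
    """Testing option 1: train on the first part, test on the rest."""
    cut = int(len(rows) * (1 - test_fraction))
    rule, default = fit_one_rule(rows[:cut])
    return accuracy(rule, default, rows[cut:])


def cross_val_accuracy(rows, k=10):
    """Testing option 2: k-fold cross-validation, averaged over folds."""
    scores = []
    for i in range(k):
        test = rows[i::k]
        train = [r for j, r in enumerate(rows) if j % k != i]
        rule, default = fit_one_rule(train)
        scores.append(accuracy(rule, default, test))
    return sum(scores) / k


rows = make_synthetic_survey(2000)
rule, default = fit_one_rule(rows)
print("holdout accuracy:", round(holdout_accuracy(rows), 3))
print("10-fold accuracy:", round(cross_val_accuracy(rows), 3))
# The rule table is the pattern: which label each group leans toward.
print("learned rule:", dict(sorted(rule.items(), key=lambda kv: int(kv[0]))))
```

Because the invented group probabilities all sit close to 0.5, no classifier can reach high accuracy on this data, yet the learned rule table still records which way each group leans. That is the sense in which a weak model can remain informative.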
Keywords:
Data Analysis, Classification, STEM Undergraduate Research, Critical Thinking and Problem Solving.