ANALYZING LARGE SETS OF UNSTRUCTURED DATA: CASE STUDIES OF USING NATURAL LANGUAGE PROCESSING IN RESEARCH
Mount St. Joseph University (UNITED STATES)
About this paper:
Appears in: ICERI2020 Proceedings
Publication year: 2020
Pages: 5062-5070
ISBN: 978-84-09-24232-0
ISSN: 2340-1095
doi: 10.21125/iceri.2020.1099
Conference name: 13th annual International Conference of Education, Research and Innovation
Dates: 9-10 November, 2020
Location: Online Conference
Abstract:
Natural Language Processing (NLP) is a fusion of linguistics and computer science that allows for the computerized sorting and categorization of large sets of unstructured textual data from sources such as social media, learning management systems, websites, open-ended survey questions, and transcribed interviews. NLP has certain advantages over traditional, labor-intensive approaches to qualitative data: it is unbiased and can process large data sets that would be unmanageable if analyzed by hand.

In this presentation, we will give a short overview of NLP and its potential to expand and reimagine research. We will provide concise case studies of how our research group has used NLP and related techniques to:
1) predict student dropout, in the second year of a continuing study, based on text messages from a closed social network, and
2) determine the needs of pregnant patients in a midwifery practice.

Researchers who attend this presentation can expect to gain a high-level understanding of NLP and to begin envisioning potential applications within their own fields.
Keywords:
Natural Language Processing, higher education, student retention, infant mortality, health education.