C. Moret-Tatay, A. Cloquell, M. Pérez-Bermejo, F. Arteaga

OAMI-UCV. Universidad Católica de Valencia San Vicente Mártir (SPAIN)
Interest in the computerized analysis of spoken or written language through Natural Language Processing (NLP) has grown over the last decade. Language analysis in educational contexts could quantify many aspects of language that conventional language tests do not explore, and it could reach broad segments of the population with low-cost tools. In this way, NLP could be used to monitor problems in spontaneous speech and, in turn, to reveal alterations in language performance at an early stage. Moreover, language is a cognitive process that underlies other processes, such as executive functions and memory, which could also be assessed through this pathway.

Mood disorders, the main variable of interest in the current study, are common among young people, including university students, and sometimes remain undetected. Because of their serious consequences, mood disorders at this age can be considered an important public health problem. Knowledge of early symptoms, which can be approached through the analysis of language, has major clinical implications for the effective treatment of mood disorders. Thus, the aim of this work was to examine the mood of university students through NLP after the Covid-19 outbreak.

Participants were recruited through snowball sampling. A total of thirty university students volunteered to participate in the study. Participants completed semi-structured qualitative interviews regarding their mood. The main ideas of the resulting text were analyzed with the Orange data mining tool.

Responses were examined in terms of lexical content, spatial relationships, and the ontological organization that relates such expressions to general classes of fixed semantic import. The current results suggest that participants who reported worse mood (by qualitative assessment) gave longer responses with greater expression of sadness and loneliness. Moreover, differences between men and women were found: women were more likely to endorse feeling lonely during the qualitative interview. The current results may provide unique insights into how linguistic features of transcribed speech data reflect mood in university students.
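As an illustration of the kind of lexical features described above (response length and the expression of sadness and loneliness), the following minimal Python sketch counts tokens and lexicon hits in a single response. It is not the pipeline used in the study, which relied on Orange; the word lists, the function names, and the example sentence are hypothetical and serve only to show the idea.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; it is NOT the
# vocabulary used in the study.
EMOTION_LEXICON = {
    "sadness": {"sad", "down", "unhappy", "cry", "crying"},
    "loneliness": {"lonely", "alone", "isolated"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def response_features(text):
    """Return the response length and the number of lexicon hits per category."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    features = {"n_tokens": len(tokens)}
    for category, words in EMOTION_LEXICON.items():
        features[category] = sum(counts[word] for word in words)
    return features


# Invented example response, not taken from the interviews.
example = "I felt lonely and sad. Most days I was alone at home."
features = response_features(example)
print(features)
```

Features of this kind could then be compared across groups, for example between participants reporting better or worse mood, although a fixed word list is only a rough proxy for the semantic analysis described here.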

These results are of interest at both theoretical and applied levels. On the one hand, they allow existing models of mood to be applied to university students. On the other, the information they provide sheds light on the process behind an outcome, and NLP is an optimal tool for this purpose. As has been shown, big data is inconceivable without some mechanism capable of filtering, organizing, and processing the immense amounts of information that exist.

Some limitations of the study must be pointed out. First, this is a pilot study with a sample of thirty participants, which may limit the generalizability of the results. Even so, these results are of interest for future direct and systematic replications. Moreover, the many levels of ambiguity in language pose a significant challenge to researchers developing NLP tools. For this purpose, future lines of research should include underlying markers of spoken language. Specifically, language analysis in clinical contexts may quantify many aspects of language at the segmental and suprasegmental levels, such as prosody and rhythm, that are explored neither in the current study nor by conventional language tests.