1 Institute for the Science of Teaching & Learning, Arizona State University (UNITED STATES)
2 University Politehnica of Bucharest (ROMANIA)
3 Georgia State University, Department of Applied Linguistics/ESL (UNITED STATES)
About this paper:
Appears in: EDULEARN16 Proceedings
Publication year: 2016
Pages: 5269-5279
ISBN: 978-84-608-8860-4
ISSN: 2340-1117
DOI: 10.21125/edulearn.2016.2241
Conference name: 8th International Conference on Education and New Learning Technologies
Dates: 4-6 July, 2016
Location: Barcelona, Spain
The development of strong writing skills is not a simple task, and requires thorough instruction and opportunities for deliberate practice. Unfortunately, deliberate practice relies on the delivery of personalized feedback, which places significant demands on teachers’ time and reduces the number of essays that students can write.

In response to these issues, computer-based writing systems have been developed to provide students with automated feedback on their writing. The essay scores delivered by these automated writing evaluation (AWE) systems have been shown to be highly accurate; however, critics frequently note that these systems rarely consider personal information about their users. Therefore, researchers have recently begun to model information that goes beyond essay scores, such as students' affective states and working memory capacity.

The purpose of the current study is to investigate the degree to which linguistic properties of students’ essays can be used to model individual differences in their vocabulary knowledge and comprehension skills. We calculated linguistic essay features using our framework, ReaderBench, which is an automated text analysis tool that calculates linguistic and rhetorical text indices. The overarching goal of this research is to inform student models in AWE systems, which can help to improve the personalization of their feedback and instruction.

University students (n = 108) produced timed (25 minutes), argumentative essays in response to a standardized test prompt. Additionally, they completed the Gates-MacGinitie Vocabulary and Reading Comprehension tests.

Pearson correlations were calculated between students’ scores on the vocabulary and comprehension measures and the linguistic properties of their essays. The indices that demonstrated a significant correlation with vocabulary knowledge and comprehension scores (p < .05) were retained in the analysis. Next, multicollinearity of the indices was assessed (r > .70). When indices demonstrated multicollinearity, the index that correlated most strongly with vocabulary knowledge and comprehension scores was retained.
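The index-filtering procedure described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the index names and data are hypothetical, and a fixed |r| cutoff stands in for the paper's p < .05 significance test, which would require a t-distribution to compute exactly.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

def filter_indices(indices, outcome, keep_threshold=0.3, collinearity_cutoff=0.70):
    """indices: dict mapping index name -> list of values (one per essay);
    outcome: list of test scores (one per essay).

    Step 1: retain indices related to the outcome (here a simple |r| cutoff
    approximates the paper's p < .05 criterion).
    Step 2: for any pair of retained indices with inter-correlation > .70,
    keep only the index that correlates more strongly with the outcome."""
    retained = {name: vals for name, vals in indices.items()
                if abs(pearson_r(vals, outcome)) >= keep_threshold}
    # Rank by strength with the outcome so stronger indices win collinearity ties.
    ranked = sorted(retained, key=lambda n: -abs(pearson_r(retained[n], outcome)))
    final = []
    for name in ranked:
        if all(abs(pearson_r(retained[name], retained[kept])) <= collinearity_cutoff
               for kept in final):
            final.append(name)
    return final
```

Ranking the retained indices before the pairwise check guarantees that, whenever two indices are collinear, the one more strongly related to vocabulary or comprehension scores survives, as the procedure in the text requires.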

To determine whether individual differences in vocabulary and comprehension scores manifested in the properties of essays, two stepwise regression analyses were conducted. The results of the vocabulary analysis indicated that five indices were able to account for 45.3% of the variance in vocabulary scores [F(5, 102) = 16.927, p < .001; R2 = .453]: average word syllable count, word entropy, average concession connectives per paragraph, average word polysemy count, and average cohesion between the sentences of each paragraph (measured via Latent Semantic Analysis). Similarly, the first three of these indices accounted for 36.3% of the variance in comprehension scores [F(3, 104) = 19.758, p < .001; R2 = .363]. These analyses confirmed that the characteristics of students' essays were, indeed, related to their vocabulary and comprehension skills.

Our results suggest that natural language processing techniques can help to inform models of individual differences among student writers. Importantly, our framework, ReaderBench, supports multi-lingual text analyses, currently covering English, French, Spanish and Romanian languages. Therefore, the results of these analyses can potentially be used to inform computer-based writing systems in a variety of different languages to provide more personalized and adaptive instruction.
Keywords: Automated essay grading, textual complexity, comprehension prediction.