About this paper

Appears in: EDULEARN16 Proceedings
Pages: 5269-5279
Publication year: 2016
ISBN: 978-84-608-8860-4
ISSN: 2340-1117
doi: 10.21125/edulearn.2016.2241

Conference name: 8th International Conference on Education and New Learning Technologies
Dates: 4-6 July, 2016
Location: Barcelona, Spain


L. Allen ¹, M. Dascalu ², D. McNamara ¹, S. Crossley ³, S. Trausan-Matu ²

¹ Institute for the Science of Teaching & Learning, Arizona State University (UNITED STATES)
² University Politehnica of Bucharest (ROMANIA)
³ Georgia State University, Department of Applied Linguistics/ESL (UNITED STATES)
The development of strong writing skills is not a simple task; it requires thorough instruction and opportunities for deliberate practice. Unfortunately, deliberate practice relies on the delivery of personalized feedback, which places significant demands on teachers’ time and reduces the number of essays that students can write.

In response to these issues, computer-based writing systems have been developed to provide students with automated feedback on their writing. The essay scores delivered by these automated writing evaluation (AWE) systems have been shown to be highly accurate; however, critics frequently note that they rarely consider personal information related to the users. Therefore, researchers have recently begun to model information that goes beyond essay scores, such as students’ affective states and working memory capacity.

The purpose of the current study is to investigate the degree to which linguistic properties of students’ essays can be used to model individual differences in their vocabulary knowledge and comprehension skills. We calculated linguistic essay features using our framework, ReaderBench, which is an automated text analysis tool that calculates linguistic and rhetorical text indices. The overarching goal of this research is to inform student models in AWE systems, which can help to improve the personalization of their feedback and instruction.

University students (n = 108) produced timed (25 minutes), argumentative essays in response to a standardized test prompt. Additionally, they completed the Gates-MacGinitie Vocabulary and Reading Comprehension tests.

Pearson correlations were calculated between students’ scores on the vocabulary and comprehension measures and the linguistic properties of their essays. The indices that demonstrated a significant correlation with vocabulary knowledge and comprehension scores (p < .05) were retained in the analysis. Next, multicollinearity of the indices was assessed (r > .70). When indices demonstrated multicollinearity, the index that correlated most strongly with vocabulary knowledge and comprehension scores was retained.
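The two-step filtering procedure above can be sketched as follows. This is a minimal illustration, not the study's actual code: the index names, thresholds as function defaults, and the tie-breaking rule (when two indices are collinear, keep the one with the stronger absolute correlation with the outcome) are assumptions based on the description in the text.

```python
import numpy as np
from scipy import stats

def select_indices(X, y, names, p_cut=0.05, collin_cut=0.70):
    """Keep indices that correlate significantly with the outcome score,
    then drop multicollinear indices (|r| > collin_cut), retaining the
    index that correlates most strongly with the outcome."""
    # Step 1: significance filter against the outcome
    kept = []
    for j, name in enumerate(names):
        r, p = stats.pearsonr(X[:, j], y)
        if p < p_cut:
            kept.append((name, j, abs(r)))
    # Step 2: multicollinearity filter, strongest correlates first
    kept.sort(key=lambda t: -t[2])
    selected = []
    for name, j, r_abs in kept:
        collinear = any(
            abs(stats.pearsonr(X[:, j], X[:, k])[0]) > collin_cut
            for _, k, _ in selected
        )
        if not collinear:
            selected.append((name, j, r_abs))
    return [name for name, _, _ in selected]
```

Applied to a matrix of ReaderBench indices (columns) over essays (rows), this yields the reduced index set that enters the regression analyses.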

To determine whether individual differences in vocabulary and comprehension scores manifested in the properties of essays, two stepwise regression analyses were calculated. The results of the vocabulary analysis indicated that five indices accounted for 45.3% of the variance in vocabulary scores [F(5, 102) = 16.927, p < .001; R2 = .453]: average word syllable count, word entropy, average concession connectives per paragraph, average word polysemy count, and average cohesion between the sentences of each paragraph (measured via Latent Semantic Analysis). Similarly, the first three of these indices accounted for 36.3% of the variance in comprehension scores [F(3, 104) = 19.758, p < .001; R2 = .363]. These analyses confirmed that the characteristics of students’ essays were, indeed, related to their vocabulary and comprehension skills.
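A stepwise regression of the kind reported above can be sketched as a greedy forward selection over the retained indices. This is an illustrative simplification: the paper's exact entry/removal criteria are not specified here, and the `min_gain` stopping rule and variable names below are assumptions.

```python
import numpy as np

def r_squared(X, y):
    # OLS fit with intercept; return the coefficient of determination
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def forward_stepwise(X, y, names, min_gain=0.01):
    """At each step, add the predictor that most improves R^2;
    stop when the best improvement falls below min_gain."""
    chosen, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        r2, j = max((r_squared(X[:, chosen + [j]], y), j) for j in remaining)
        if r2 - best_r2 < min_gain:
            break
        chosen.append(j)
        remaining.remove(j)
        best_r2 = r2
    return [names[j] for j in chosen], best_r2
```

With the essay indices as predictors and Gates-MacGinitie scores as the outcome, the returned R² corresponds to the proportion of variance explained, reported above as .453 for vocabulary and .363 for comprehension.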

Our results suggest that natural language processing techniques can help to inform models of individual differences among student writers. Importantly, our framework, ReaderBench, supports multi-lingual text analyses, currently covering English, French, Spanish, and Romanian. Therefore, the results of these analyses can potentially be used to inform computer-based writing systems in a variety of different languages to provide more personalized and adaptive instruction.
@InProceedings{ALLEN2016MOD,
author = {Allen, L. and Dascalu, M. and McNamara, D. and Crossley, S. and Trausan-Matu, S.},
title = {Modeling Individual Differences among Writers Using ReaderBench},
series = {8th International Conference on Education and New Learning Technologies},
booktitle = {EDULEARN16 Proceedings},
isbn = {978-84-608-8860-4},
issn = {2340-1117},
doi = {10.21125/edulearn.2016.2241},
url = {https://dx.doi.org/10.21125/edulearn.2016.2241},
publisher = {IATED},
location = {Barcelona, Spain},
month = {4-6 July, 2016},
year = {2016},
pages = {5269-5279}}
TY - CONF
TI - Modeling Individual Differences among Writers Using ReaderBench
AU - L. Allen
AU - M. Dascalu
AU - D. McNamara
AU - S. Crossley
AU - S. Trausan-Matu
SN - 978-84-608-8860-4/2340-1117
DO - 10.21125/edulearn.2016.2241
PY - 2016
Y1 - 4-6 July, 2016
CI - Barcelona, Spain
JO - 8th International Conference on Education and New Learning Technologies
JA - EDULEARN16 Proceedings
SP - 5269
EP - 5279
ER -
L. Allen, M. Dascalu, D. McNamara, S. Crossley, S. Trausan-Matu (2016) MODELING INDIVIDUAL DIFFERENCES AMONG WRITERS USING READERBENCH, EDULEARN16 Proceedings, pp. 5269-5279.