About this paper

Appears in: INTED2009 Proceedings
Pages: 3040-3049
Publication year: 2009
ISBN: 978-84-612-7578-6
ISSN: 2340-1079

Conference name: 3rd International Technology, Education and Development Conference
Dates: 9-11 March, 2009
Location: Valencia, Spain

PREDICTING FOREIGN LANGUAGE LEARNERS’ READING PROFICIENCY BASED ON READING TIME AND TEXT COMPLEXITY

K. Kotani¹, T. Yoshimi², T. Kutsumi³, I. Sata³, H. Isahara⁴

¹Kansai Gaidai University (JAPAN)
²Ryukoku University (JAPAN)
³Sharp Corporation (JAPAN)
⁴National Institute of Information and Communications Technology (JAPAN)
Computer-assisted language learning has the advantage of supporting the evaluation of language proficiency. Because foreign language learners vary greatly in proficiency, it is important for a teacher to identify the problems each learner has. In a reading class, a teacher can evaluate reading proficiency with the comprehension questions in a reading textbook. Authentic texts, however, come with no such questions, so a teacher who uses them as reading materials has to prepare the questions. Evaluating learners’ comprehension therefore places a heavy burden on teachers.

Natural language processing technology can assist teachers in evaluating learners’ reading proficiency, because it provides an evaluation method that does not rely solely on comprehension questions. We propose a Reading Proficiency Model (RPM) that computes a learner’s reading proficiency as a score on the Test of English for International Communication (TOEIC). We constructed the RPM by regression, taking the TOEIC score as the dependent variable and the linguistic properties of a text and the learner’s reading time as the independent variables.

Linguistic properties refer to the text complexity arising from the lexical, syntactic and discourse properties of a text. Lexical difficulty is measured with a morpho-lexical analyzer [4]. Syntactic complexity is derived with a syntactic parser [3], which produces a syntactic tree for an input text. Discourse complexity is defined as the number of anaphoric expressions. Reading time data were collected from 64 learners of English as a foreign language (EFL) who reported their TOEIC scores. Each learner read 7 or 14 texts selected from a TOEIC textbook. As a result, 451 instances of text reading time data were obtained.
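As a rough illustration of how one reading-time instance might be laid out for the regression, the sketch below bundles the three complexity measures with a reading time and a TOEIC score. Every name in it (`ReadingInstance`, `count_anaphoric_expressions`, the pronoun list, the field units) is hypothetical and not taken from the paper. In particular, the pronoun count is only a crude stand-in for the number of anaphoric expressions, and the outputs of the analyzer [4] and parser [3] are assumed to arrive as plain numbers.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumption: a small pronoun list used as a crude proxy for anaphoric
# expressions. A real system would need proper anaphora detection.
PRONOUNS = {
    "he", "she", "it", "they", "him", "her", "them",
    "his", "its", "their", "this", "that", "these", "those",
}


def count_anaphoric_expressions(text: str) -> int:
    """Count tokens that appear in the (hypothetical) pronoun list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for token in tokens if token in PRONOUNS)


@dataclass
class ReadingInstance:
    """One learner reading one text (all field names are illustrative)."""
    lexical_difficulty: float    # assumed to come from a morpho-lexical analyzer
    syntactic_complexity: float  # assumed to come from a syntactic parser
    anaphora_count: int          # discourse complexity
    reading_time_sec: float      # the learner's reading time for the text
    toeic_score: int             # dependent variable of the regression

    def features(self) -> List[float]:
        """Independent variables, in a fixed order, as floats."""
        return [
            self.lexical_difficulty,
            self.syntactic_complexity,
            float(self.anaphora_count),
            self.reading_time_sec,
        ]


if __name__ == "__main__":
    text = "The manager called. She said it was late."
    instance = ReadingInstance(
        lexical_difficulty=1.0,      # placeholder value
        syntactic_complexity=2.0,    # placeholder value
        anaphora_count=count_anaphoric_expressions(text),
        reading_time_sec=30.0,       # placeholder value
        toeic_score=600,             # placeholder value
    )
    print(instance.features(), instance.toeic_score)
```

Keeping the TOEIC score outside `features()` separates the dependent variable from the independent ones, which is the shape most regression routines expect.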

The RPM was constructed with 361 instances as training data for regression with Support Vector Machines (well-known machine learning algorithms with high generalization performance) and verified with the remaining 90 instances. The RPM yielded an error rate of 17.5% in our experiment. We further examined our model by comparing it with other RPMs (N-model and S-model) that employed linguistic features proposed in previous studies [1], [2]. The N-model was developed based on lexical items in particular constructions such as relative clauses [1]. The S-model was constructed based on syntactic and lexical features such as the height of a syntactic tree and the number of conjunctions [2]. The error rates of these models were 18.7% for the N-model and 18.4% for the S-model. In terms of error reduction rate, the error rate of our model is lower than that of the N-model by 6.4% (=(18.7-17.5)/18.7*100) and that of the S-model by 4.9% (=(18.4-17.5)/18.4*100).
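The error reduction rates can be recomputed directly from the three reported error rates. The short sketch below does only that arithmetic; the function and constant names are illustrative, and the formula is the one given in parentheses in the text, (baseline - model) / baseline * 100.

```python
def error_reduction_rate(baseline_error: float, model_error: float) -> float:
    """Relative reduction, in percent, of model_error against baseline_error."""
    return (baseline_error - model_error) / baseline_error * 100


# Error rates (in percent) as reported in the abstract.
RPM_ERROR = 17.5
N_MODEL_ERROR = 18.7
S_MODEL_ERROR = 18.4

vs_n_model = error_reduction_rate(N_MODEL_ERROR, RPM_ERROR)
vs_s_model = error_reduction_rate(S_MODEL_ERROR, RPM_ERROR)

print(f"Reduction vs. N-model: {vs_n_model:.1f}%")  # prints 6.4%
print(f"Reduction vs. S-model: {vs_s_model:.1f}%")  # prints 4.9%
```

Evaluating the two expressions gives roughly 6.4% against the N-model and 4.9% against the S-model.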

From these experimental results, we conclude that our RPM can help teachers evaluate EFL learners’ reading proficiency.

References
[1] Nagata, R. et al. 2002. A method of rating English reading skill automatically: Rating English reading skill using reading speed. Computer & Education, Vol. 12. 99-103.
[2] Schwarm, S. E. et al. 2005. Reading level assessment using support vector machines and statistical language models. Proc. of the 43rd Annual Meeting of the Association for Computational Linguistics. 523-530.
[3] Sekine, S. et al. 1995. A corpus-based probabilistic grammar with only two non-terminals. Proc. of the 4th International Workshop on Parsing Technologies. 216-223.
[4] Someya, Y. 2000. Word Level Checker: Vocabulary Profiling Program by AWK, Ver. 1.5.
@InProceedings{KOTANI2009PRE,
author = {Kotani, K. and Yoshimi, T. and Kutsumi, T. and Sata, I. and Isahara, H.},
title = {PREDICTING FOREIGN LANGUAGE LEARNERS’ READING PROFICIENCY BASED ON READING TIME AND TEXT COMPLEXITY},
series = {3rd International Technology, Education and Development Conference},
booktitle = {INTED2009 Proceedings},
isbn = {978-84-612-7578-6},
issn = {2340-1079},
publisher = {IATED},
location = {Valencia, Spain},
month = {9-11 March, 2009},
year = {2009},
pages = {3040-3049}}
TY - CONF
AU - K. Kotani
AU - T. Yoshimi
AU - T. Kutsumi
AU - I. Sata
AU - H. Isahara
TI - PREDICTING FOREIGN LANGUAGE LEARNERS’ READING PROFICIENCY BASED ON READING TIME AND TEXT COMPLEXITY
SN - 978-84-612-7578-6/2340-1079
PY - 2009
Y1 - 9-11 March, 2009
CI - Valencia, Spain
JO - 3rd International Technology, Education and Development Conference
JA - INTED2009 Proceedings
SP - 3040
EP - 3049
ER -
K. Kotani, T. Yoshimi, T. Kutsumi, I. Sata, H. Isahara (2009) PREDICTING FOREIGN LANGUAGE LEARNERS’ READING PROFICIENCY BASED ON READING TIME AND TEXT COMPLEXITY, INTED2009 Proceedings, pp. 3040-3049.