VOCABULARY RICHNESS GRAPHS SERVING LANGUAGE ACQUISITION
University of Debrecen (HUNGARY)
About this paper:
Appears in: ICERI2012 Proceedings
Publication year: 2012
Conference name: 5th International Conference of Education, Research and Innovation
Dates: 19-21 November, 2012
Location: Madrid, Spain
Abstract: Vocabulary richness graphs (VRG) are computer-generated statistical functions that represent the frequency distribution of newly introduced words over the sequence of fixed-length segments of a written text. The main features of VRGs are as follows:
- They show the number (and location) of newly introduced words so the vocabulary changes of selected texts can be traced.
- VRGs show a representative pattern of the vocabulary and to some extent the stylistic structure of the analysed texts.
- Since VRGs are unique to each text, they can also serve as identification graphs, a kind of “fingerprint” for written texts.
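The counting step behind such a graph can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `new_words_per_segment`, the default segment length of 100 tokens and the simple letter-based tokenisation are all assumptions, and the sketch returns raw counts only, without the comparison against a statistical model that the abstract describes.

```python
import re


def new_words_per_segment(text, segment_length=100):
    """Count, for each fixed-length segment, the words not seen earlier in the text.

    Hypothetical helper: the tokenisation and the segment length are assumptions.
    """
    # Lower-case the text and keep runs of letters as tokens.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    seen = set()
    counts = []
    for start in range(0, len(tokens), segment_length):
        new_in_segment = 0
        for token in tokens[start:start + segment_length]:
            if token not in seen:
                seen.add(token)
                new_in_segment += 1
        counts.append(new_in_segment)
    return counts
```

For example, `new_words_per_segment("a b c a b d e a f g", segment_length=5)` splits the ten tokens into two segments and counts three new words in the first and four in the second.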
The protuberances, i.e. the local maxima of the graphs, represent those segments of the texts which contain more vocabulary items than expected under a first-order statistical model. In turn, the inverted protuberances, i.e. the local minima of the graphs, can be interpreted in two ways: on the one hand, at the beginning of the texts they reflect an overestimation by the statistical model of the number of vocabulary items to be expected; on the other hand, they mark those segments which carry fewer vocabulary items than expected.
In our project, selected literary works (mainly novels, concatenated short stories and their different adaptations, including translations and condensed versions) and textbooks for second language acquisition were analysed. We generated a unique VRG in each case and used it to explore the vocabulary changes of the analysed works, often in comparison with their different adaptations and versions where these were available.
VRGs make different applications possible, especially if potential users can generate them interactively (e.g. via the internet) for selected texts.
In terms of vocabulary structure and distribution, foreign language translations could be considered reliable if they follow the vocabulary pattern of the original work. This means that by viewing and analysing the VRGs of the original work and its different translations we can obtain one important and exact measure of the reliability of a translation.
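One way such a comparison could be made concrete is sketched below. Everything here is an assumption made for illustration: the abstract does not say how two graphs are compared, so the sketch uses a plain Pearson correlation, and because an original and its translation rarely have the same number of words it divides each text into the same number of equal-share segments (hypothetical parameter `n_segments`) instead of fixed-length ones.

```python
import math
import re


def new_word_profile(text, n_segments=20):
    """New-word counts over n_segments equal shares of the text (assumed scheme)."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    seen = set()
    profile = [0] * n_segments
    for i, token in enumerate(tokens):
        if token not in seen:
            seen.add(token)
            # Map the token position to its relative segment index.
            profile[i * n_segments // len(tokens)] += 1
    return profile


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)
```

A call such as `pearson(new_word_profile(original), new_word_profile(translation))` would then give a value near 1 when the two new-word patterns rise and fall together. How well such a number reflects translation quality is a separate question that the sketch does not settle.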
Considering adaptations in general, the authors of an adaptation could use its VRG throughout the whole adaptation process by comparing it to the VRG of the original work. Depending on whether the adaptation follows the original pattern or not, further modifications can be applied to the current adaptation before finalizing (and publishing) it.
VRGs can also be built for textbooks for second language acquisition. These graphs would reveal how evenly (or gradually) the newly introduced words are dispersed in selected textbooks. Based on VRGs, teachers can develop different teaching and learning strategies, or they could select those textbooks which best suit their students’ vocabulary development, language skills, etc. In the latter case, if teachers want to select texts which carry a lot of segments with rich vocabulary, they have to look for texts whose VRGs show wide and intensive protuberances. On the other hand, shallow vocabulary richness graphs indicate texts which reuse words frequently (which means that they are relatively easy for students to process and learn). Therefore second language teachers and teachers of students in need of special care would greatly benefit from using VRGs, especially if they want to lighten the reading burden on students for special reasons.
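As a rough illustration of how "shallow" and "peaked" graphs might be told apart numerically, the sketch below summarises a list of per-segment new-word counts. The function name `dispersion_summary` and the choice of the coefficient of variation as an evenness indicator are assumptions made here, not measures named in the abstract. The input must be a non-empty list of counts.

```python
import statistics


def dispersion_summary(counts):
    """Summarise how evenly new words are spread over segments (assumed indicators)."""
    mean = statistics.fmean(counts)
    spread = statistics.pstdev(counts)
    return {
        "mean": mean,                             # average new words per segment
        "stdev": spread,                          # population standard deviation
        "cv": spread / mean if mean else 0.0,     # relative spread; 0 means perfectly even
        "peak": max(counts),                      # largest single-segment count
    }
```

Under this reading, a textbook whose counts give a low `cv` introduces new words at a steady pace, while a high `cv` together with a high `peak` points to segments with dense new vocabulary.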
Keywords: Vocabulary richness, language acquisition.