MULTIMODAL CONTEXTUALIZING AND TARGETING EXERCISES IN ICAPT SYSTEMS
1 University of Aizu (JAPAN)
2 Eyes, JAPAN Co. Ltd (JAPAN)
3 Speech Technology Center (RUSSIAN FEDERATION)
4 Peter the Great St. Petersburg Polytechnic University (RUSSIAN FEDERATION)
About this paper:
Appears in: INTED2024 Proceedings
Publication year: 2024
Pages: 438-448
ISBN: 978-84-09-59215-9
ISSN: 2340-1079
doi: 10.21125/inted.2024.0164
Conference name: 18th International Technology, Education and Development Conference
Dates: 4-6 March, 2024
Location: Valencia, Spain
Abstract:
Intelligent computer-assisted pronunciation training (iCAPT) systems are the result of some 50 years of development and maturation, starting from straightforward implementations providing digitized access to traditional learning materials such as textbooks or audio and video recordings, up to contemporary AI- and NLP-driven solutions within the digitalization paradigm. These solutions provide better-tailored CAPT feedback to learners through a complex combination of descriptive, evaluative, instructive, and actionable interfaces and through access to web and mobile platforms. The novel features delivered by present-day intelligent teaching and learning instruments are grounded in models and algorithms coming from a wide range of connected but relatively independent areas, including signal and speech processing, spoken language visualization techniques, machine learning, and accent recognition and classification, to name a few. Furthermore, the knowledge-based intelligence of such instruments assumes active use of metaphors and of visual, auditory, and even kinesthetic models taken from non-technical domains such as art, music, fitness, or even choreography. The cross-disciplinary design of present-day multimodal iCAPT interfaces creates unique possibilities for exercise contextualization, targeting, and personalization depending on learners' known language backgrounds, on predictions made through dynamic assessment of learning progress, and also on learners' areas of interest outside the scope of language study. This research discusses the lessons learned and knowledge acquired in the process of assessing the StudyIntonation CAPT system, a research implementation of a prosody-centric multi-language learning environment originally built around CAPT feedback generated as interactive pitch contours displayed in a mobile app, along with numerical scores of pitch quality, and later extended with a number of features and interfaces aimed at improving its multimodal tailored feedback according to learners' preferences, backgrounds, and achievements. The experience we gained is contrasted with a number of other successful state-of-the-art iCAPT systems addressing spoken language learning contextualization and targeting.
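The abstract's notion of interactive pitch contours paired with numerical pitch-quality scores can be illustrated with a short sketch. The following is a minimal, hypothetical example, not the StudyIntonation implementation, of how a prosody-centric CAPT system might extract a learner's pitch contour, align it with a reference utterance, and reduce the alignment cost to a single score. It assumes Python with librosa and numpy; the function names and file names are placeholders introduced here for illustration.

# Hypothetical sketch of pitch-contour scoring for a prosody-centric CAPT
# system; not the authors' method, just one plausible pipeline.
import numpy as np
import librosa

def pitch_contour(path, sr=16000):
    # Extract an F0 contour and express it in semitones relative to the
    # speaker's median pitch, so learner and reference are comparable.
    y, sr = librosa.load(path, sr=sr)
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr)
    f0 = f0[~np.isnan(f0)]                   # keep voiced frames only
    return 12 * np.log2(f0 / np.median(f0))  # speaker-normalized semitones

def pitch_score(ref_path, learner_path):
    # Align the two contours with dynamic time warping and map the mean
    # per-step alignment cost onto a (0, 1] quality score.
    ref = pitch_contour(ref_path)
    stu = pitch_contour(learner_path)
    D, wp = librosa.sequence.dtw(X=ref[np.newaxis, :], Y=stu[np.newaxis, :])
    cost = D[-1, -1] / len(wp)
    return 1.0 / (1.0 + cost)

# Hypothetical usage: compare a learner's recording with a model utterance.
# print(f"pitch quality: {pitch_score('reference.wav', 'learner.wav'):.2f}")

A mobile app in this style would plot both contours for visual feedback and show the resulting score alongside them; the cost-to-score mapping above is arbitrary and would in practice be calibrated against teacher ratings.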
Keywords:
Computer-assisted prosody training, L2 education, mobile technology, multimodal interface, tailored feedback, speech processing, learning personalization.