AUTOMATED GENERATION OF MANDARIN EDUCATIONAL MULTIMEDIA CONTENT FROM EXISTING ENGLISH CONTENT
Santa Clara University (UNITED STATES)
About this paper:
Appears in: ICERI2022 Proceedings
Publication year: 2022
Pages: 963-972
ISBN: 978-84-09-45476-1
ISSN: 2340-1095
doi: 10.21125/iceri.2022.0280
Conference name: 15th annual International Conference of Education, Research and Innovation
Dates: 7-9 November, 2022
Location: Seville, Spain
Abstract:
With the advancement of digital technology, multimedia content has proliferated in recent years. Large collections of such content have accumulated both publicly on the internet and privately in enterprise settings. However, the vast majority of freely available content is in English, which restricts its wider use. This has motivated research into techniques and technologies that facilitate reliable conversion of multimedia content into other languages. Initially, this research focused on the translation of planned spoken content structured into document-like units, but it has gradually shifted to informal spoken content produced spontaneously in conversational settings.
This work focuses on the automated translation of multimedia content, specifically video content designed for formal education in Chinese contexts, using natural language processing (NLP) and machine learning (ML) techniques. The paper demonstrates the feasibility of automating the machine translation of English video content into Mandarin with an accuracy of ninety percent. The results of our experiments show that translating specialized content such as educational materials requires a rich, context-aware vocabulary that is not yet readily available. This research enhances machine translation by building a pipeline for richer vocabulary datasets along with viable NLP/ML models.
Keywords:
Automated Machine Translation, AI in multimedia content, Speech Recognition System, Chinese Language Translation.