THE CONVERSION FROM SIGN LANGUAGE TO EMOTIONAL SPEECH
Northwest Normal University (CHINA)
About this paper:
Appears in: INTED2020 Proceedings
Publication year: 2020
Pages: 1772-1779
ISBN: 978-84-09-17939-8
ISSN: 2340-1079
doi: 10.21125/inted.2020.0565
Conference name: 14th International Technology, Education and Development Conference
Dates: 2-4 March, 2020
Location: Valencia, Spain
Abstract:
This paper proposes a method to convert sign language into emotional speech, addressing the problem of communication between people with speech and hearing impairments and others. We first use a deep neural network to extract gesture features from 30 kinds of gesture images, and a convolutional neural network to obtain facial features from six kinds of facial expression images. We then adopt a support vector machine to recognize the gesture categories and the emotional labels. Context-dependent labels are generated from the sign-language text obtained from the recognized gesture information. We also develop a deep neural network-based Mandarin emotional speech synthesizer trained with speaker adaptive training, and Mandarin emotional speech is synthesized from the context-dependent labels and the emotional labels. The experimental results show that the accuracy of static sign language recognition is 94.5% and that of facial expression recognition is 96.3%. The average mean opinion score of the synthesized Mandarin emotional speech is 4.3, and the accuracy of the sentences converted from sign language to emotional speech is 92.8%. Further work will apply the proposed method to teaching people with speech and hearing impairments, replacing the traditional mouse and keyboard. The proposed method can enable barrier-free two-way communication between teaching and learning and improve teaching quality for learners with speech and hearing impairments.
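For concreteness, the following is a minimal Python sketch of the pipeline the abstract describes, not the authors' implementation: the feature extractors (extract_gesture_features, extract_face_features), the gesture-to-text mapping (GESTURE_TEXTS), and the training data are hypothetical stand-ins for the paper's DNN/CNN front ends, and the DNN-based Mandarin synthesizer is represented only by the (text, emotion) pair that would drive it.

# Sketch of the sign-language-to-emotional-speech pipeline from the abstract.
# All names and data here are hypothetical stand-ins, not the paper's code.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-ins for the learned feature extractors (DNN for gesture images,
# CNN for facial expression images); here they return random vectors.
def extract_gesture_features(image):
    return rng.normal(size=128)

def extract_face_features(image):
    return rng.normal(size=64)

# Hypothetical mapping from the 30 gesture classes to sign-language words,
# and the six emotion categories used for the facial expressions.
GESTURE_TEXTS = {i: f"word_{i}" for i in range(30)}
EMOTIONS = ["neutral", "happy", "sad", "angry", "fear", "surprise"]

# One SVM per modality, as in the paper: features -> class labels.
# The random training sets below only make the sketch self-contained.
X_g = np.stack([extract_gesture_features(None) for _ in range(300)])
y_g = rng.integers(0, 30, size=300)
gesture_svm = SVC().fit(X_g, y_g)

X_f = np.stack([extract_face_features(None) for _ in range(120)])
y_f = rng.integers(0, 6, size=120)
emotion_svm = SVC().fit(X_f, y_f)

def sign_to_emotional_speech(gesture_image, face_image):
    word = GESTURE_TEXTS[int(gesture_svm.predict(
        [extract_gesture_features(gesture_image)])[0])]
    emotion = EMOTIONS[int(emotion_svm.predict(
        [extract_face_features(face_image)])[0])]
    # The paper converts the recognized text into context-dependent labels
    # and feeds them, with the emotion label, to a DNN-based Mandarin
    # synthesizer; here we return the (text, emotion) pair that would drive it.
    return word, emotion

print(sign_to_emotional_speech(None, None))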
Keywords:
Deep learning, gesture recognition, emotional speech synthesis, computer-aided instruction.