COMPARISONS OF VISUAL FEATURES EXTRACTION TOWARDS AUTOMATIC LIP READING
University of Pavia (ITALY)
About this paper:
Appears in:
EDULEARN13 Proceedings
Publication year: 2013
Pages: 2188-2196
ISBN: 978-84-616-3822-2
ISSN: 2340-1117
Conference name: 5th International Conference on Education and New Learning Technologies
Dates: 1-3 July, 2013
Location: Barcelona, Spain
Abstract:
The human face is a dynamic object with a high degree of variability in its appearance, which makes feature extraction a difficult problem in computer vision, and a wide variety of techniques have been proposed. Lip reading plays an important role in automatic speech recognition, and for automatic lip reading there are many competing methods for feature extraction. Often, because of the complexity of the task, these methods are tested only on quite restricted datasets.
Visual information derived from visual features, most importantly accurate lip extraction, can provide features that are invariant to acoustic noise for speech recognition systems and can also be used in a wide variety of other applications.
There are many techniques available to extract visual features, but lip reading performance degrades dramatically due to the speaker variability encoded in those features. In this paper we compare strategies from both high-level and low-level analysis. For high-level visual speech analysis we describe the Active Shape Model (ASM) and the Active Appearance Model (AAM); for low-level analysis we describe pixel-based methods for identifying and extracting salient visual features, and we compare the two approaches. We also describe the Active Contour Model (ACM), known as the "snake", and compare model-based and image-based methods. This paper puts forward a way to select and extract visual features effectively for automatic lip reading.
Keywords: Lip Reading, Visual Feature Extraction, Active Shape Model (ASM), Active Appearance Model (AAM), Snakes.
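As an illustration of the low-level, pixel-based family of methods the abstract mentions, the sketch below computes 2-D DCT coefficients of a grayscale mouth region of interest and keeps the low-frequency ones as a compact feature vector. This is a minimal sketch, not the paper's own implementation: the function names, the orthonormal DCT-II construction, and the top-left block truncation (a common stand-in for a zig-zag scan) are all assumptions made for this example.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix: row k is the k-th cosine basis vector.
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= 1 / np.sqrt(2)          # DC row scaling for orthonormality
    return C * np.sqrt(2 / n)

def dct2_features(roi, num_coeffs=10):
    # roi: 2-D grayscale mouth region of interest (any rectangular shape).
    # Returns the first `num_coeffs` low-frequency 2-D DCT coefficients,
    # taken from the top-left block (a simple stand-in for a zig-zag scan).
    roi = np.asarray(roi, dtype=float)
    n, m = roi.shape
    D = dct_matrix(n) @ roi @ dct_matrix(m).T   # separable 2-D DCT
    b = int(np.ceil(np.sqrt(num_coeffs)))       # side of the low-frequency block
    return D[:b, :b].flatten()[:num_coeffs]

# Example: a constant-intensity ROI concentrates all energy in the DC term.
features = dct2_features(np.ones((8, 8)), num_coeffs=4)
```

In a lip-reading pipeline of this kind, such coefficients would typically be computed per video frame on a tracked mouth ROI and fed to a classifier; speaker normalization of the ROI is the usual countermeasure to the speaker-variability problem noted above.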