About this paper:
Appears in: INTED2015 Proceedings
Publication year: 2015
Pages: 7906-7909
ISBN: 978-84-606-5763-7
ISSN: 2340-1079
Conference name: 9th International Technology, Education and Development Conference
Dates: 2-4 March, 2015
Location: Madrid, Spain
Current progress in the development of devices and sensors allows data to enter and leave computational systems in several ways. Multimodal interaction systems (MIS) handle these inputs in order to provide a friendlier user experience. An MIS architecture comprises a component that merges input information from different sources, a dialogue management module, and components for data output.
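
The three architectural stages named above (input merging, dialogue management, output) can be illustrated with a minimal sketch. It is not the implementation described in the paper: the `InputEvent` structure, the one-second grouping window, and the function names `fuse`, `manage_dialogue` and `render` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InputEvent:
    """One observation from a single modality (hypothetical structure)."""
    modality: str      # e.g. "speech" or "gesture"
    content: str
    timestamp: float   # seconds


def fuse(events, window=1.0):
    """Input merging: group events that start within `window` seconds of the
    first event of the current group. The window size is an assumption."""
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if groups and event.timestamp - groups[-1][0].timestamp <= window:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


def manage_dialogue(group):
    """Toy dialogue manager: combine the modalities of one user act."""
    parts = {e.modality: e.content for e in group}
    return "Understood: " + ", ".join(
        f"{modality}={content}" for modality, content in sorted(parts.items())
    )


def render(reply, channels=("text",)):
    """Output stage: pair the reply with each requested output channel."""
    return [(channel, reply) for channel in channels]
```

In this sketch, a spoken command and a pointing gesture that arrive within the same window are treated as a single user act before the dialogue manager sees them.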

Based on the literature (Dumas, Lalanne, & Oviatt, 2009; Turk, 2014), previously proposed MIS models have difficulty providing a more natural interaction with users. Aspects such as body language, continuous conversation and context awareness are neglected in some approaches. Human-computer interaction can be improved by drawing on cognitive psychology studies and multimodal communication. This kind of communication is important for conveying meaningful, clear and unambiguous information. It goes beyond speech or writing, using images, emotions and feelings.

Research on multimodal interaction has been increasing in the human-computer interaction area (Neto et al., 2009). Multimodal interaction systems offer users the possibility to act more naturally, as they do when interacting with another person. These systems are used in many areas, such as virtual agents, robotics, intelligent kiosks, question answering systems and mobile applications (Jaimes & Sebe, 2007; Davidsson, 2012), all of which could benefit from more natural interaction features.

Human memory performs a complex task and is one of the main elements supporting each person's consciousness. In this regard, working memory is a cognitive model that describes the human ability to maintain and manipulate information over a short period of time, and it is of great importance to our mental activities. In psychology, the working memory model proposed by Baddeley and Hitch (1974) is divided into two parts: the central executive and its subsystems. The central executive is responsible for coordinating the subsystems and for maintaining focus and attention. The subsystems are: the phonological loop, responsible for handling sounds; the visuo-spatial sketchpad, responsible for visual and spatial information; and the episodic buffer, a later addition to the original model, responsible for integrating information from the other components with long-term memory. According to Baddeley and Hitch (1974), working memory provides temporary storage of information and supports reasoning, comprehension and learning.
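
The structure of this model can be made concrete with a short sketch: a central executive that routes items to limited-capacity subsystems, and an episodic buffer that binds their contents and passes them to long-term memory. This is only an illustration under stated assumptions, not the computational model developed in the paper; the class names, the default capacity of four items, and the `"sound"`/`"visual"` routing labels are hypothetical.

```python
from collections import deque


class Buffer:
    """Short-term store with limited capacity; oldest items are dropped."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def store(self, item):
        self.items.append(item)

    def recall(self):
        return list(self.items)


class WorkingMemory:
    """Central executive coordinating three subsystems (illustrative only)."""

    def __init__(self, capacity=4):  # capacity is an assumed value
        self.phonological_loop = Buffer(capacity)
        self.visuospatial_sketchpad = Buffer(capacity)
        self.episodic_buffer = Buffer(capacity)
        self.long_term_memory = []

    def attend(self, kind, item):
        """Route an incoming item to the subsystem that handles its kind."""
        if kind == "sound":
            self.phonological_loop.store(item)
        elif kind == "visual":
            self.visuospatial_sketchpad.store(item)
        else:
            raise ValueError(f"unknown kind: {kind}")

    def integrate(self):
        """Bind the current contents of both subsystems into one episode,
        hold it in the episodic buffer and copy it to long-term memory."""
        episode = {
            "sounds": self.phonological_loop.recall(),
            "visuals": self.visuospatial_sketchpad.recall(),
        }
        self.episodic_buffer.store(episode)
        self.long_term_memory.append(episode)
        return episode
```

With a small capacity, older items fall out of a subsystem before integration, which mirrors the temporary nature of the storage described above.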

The motivation for this work is to explore the potential advantage of adopting a computational model of working memory in multimodal interaction systems. We believe that implementing this cognitive model may result in a more natural interaction with these systems.

This paper presents an approach to improving dialogue management based on a computational model of the working memory proposed by Baddeley and Hitch. The approach rests on the assumption that a model based on human factors can lead users to perceive the interaction more favourably. We also present studies that support this proposal, the rationale for the model, and the results obtained by implementing the model in a prototype system, which indicate interesting possibilities.
Keywords: Multimodal interaction systems, working memory.