Incorporating Laughter into Human Avatar Interactions
ILHAIRE is a European project (FP7) that started in 2011 and will last 36 months. Laughter is a significant feature of human communication. The objectives of ILHAIRE are to help the scientific and industrial community bridge the gap between knowledge of human laughter and its use by avatars, thus enabling sociable conversational agents to be designed using natural-looking and natural-sounding laughter. The project will gather data on laughter using high-quality sound, video, and facial and upper-body motion capture. The database collection process will be grounded in psychological foundations, and the data will be used to validate computational and theoretical models of analysis and synthesis of audio-visual laughter. Dialogue systems between humans and avatars will be developed, and studies will be conducted to capture the qualitative experience evoked by a laughing avatar. The work is split into 7 work packages, each led by one of the consortium partners.
The team of the University of Zurich will address the psychological foundations of laughter, its expression (focusing on facial features), and experiments evaluating the avatars from a personality perspective. Willibald Ruch also leads Work Package 5.
WP1: Incremental Database
WP1 provides the resources required for most of the other work packages by assembling existing resources and generating new ones to create an incremental database. Initially existing resources containing audio-visual records of laughter will be assembled. These resources will be used to construct an annotated database showing multimodal records of laughter in naturalistic interactions, and will incorporate a system of labels distinguishing the main kinds and functions of laughter represented in it. WP1 will also use focus groups and large-scale questionnaires to identify a range of material that reliably makes different kinds of people laugh. In the early stages this will focus on material in English; at a later stage it will be extended to other languages and cultures. The laughter-inducing material will also inform experiments designed to induce and record naturalistic laughter using both audio-visual and motion capture recording techniques.
WP2: Multimodal Analysis and Recognition of Laughter
WP2 concerns the multimodal analysis and modelling of laughter, covering both hilarious and conversational (social) laughter. Its objectives are:
- to infer the sequences of phonemes and facial action units of laughs;
- to define a set of expressive gesture features for analysing gesture during laughter;
- to develop novel fusion algorithms, based on the current signal context, for integrating information from the auditory, facial, and gestural channels;
- to automatically detect and classify laughter based on such integrated multimodal information.
Moreover, WP2 aims to investigate the influence of culture and gender in order to improve laughter detection. The models and techniques developed here will form the basis for the adaptive models to be employed for laughter synthesis. The work is organised in steps, following a spiral research and development approach designed to converge toward the final results. A first step uses existing material to produce baseline methods; these methods are then refined for hilarious laughter and finally integrated for specific analyses of social laughter.
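As an illustration of the multimodal fusion idea described above, the toy sketch below combines per-channel laugh probabilities with a simple weighted late-fusion rule. The channel names, weights, and threshold are hypothetical and not taken from the project.

```python
# Hedged sketch only: WP2's actual fusion algorithms are context-dependent
# and not specified at this level of detail. This shows plain weighted
# late fusion of per-modality laugh probabilities.

def fuse_laugh_scores(audio_p, face_p, gesture_p, weights=(0.5, 0.3, 0.2)):
    """Combine per-modality laugh probabilities into a single fused score."""
    w_audio, w_face, w_gesture = weights
    return w_audio * audio_p + w_face * face_p + w_gesture * gesture_p

def classify_laugh(audio_p, face_p, gesture_p, threshold=0.5):
    """Label a segment as laughter when the fused score passes a threshold."""
    return fuse_laugh_scores(audio_p, face_p, gesture_p) >= threshold
```

In practice the project would learn such weights (or a richer fusion model) from the annotated WP1 data rather than fixing them by hand.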
WP3: Multimodal Generation and Synthesis of Laughter
WP3 deals with the development of models for the generation and synthesis of audiovisual laughter. The task of the Generation stage is to describe laughter episodes at the behavioural level (audio structure, body posture, facial action unit sequences) so as to make them appropriate in a given conversational context. The input of the Generation stage is provided by the Dialogue module (WP4), which makes decisions on the timing, duration, and style of laughter as a function of the application scenario and according to user interaction. The Synthesis stage then produces the acoustic and visual representation of laughter from this behavioural description. Audio synthesis will be based on statistical parametric synthesis (HMMs), adapted to the specifically inarticulate nature of laughter. Visual synthesis is based on a Finite State Machine in which states are mostly associated with body, head, or facial postures, and transitions provide natural-looking movements from/to these postures. Both stages rely heavily on the annotated laughter databases obtained from WP1 and analysed in WP2: first, for training the HMM models involved in audio synthesis and for providing a large collection of examples of visual laughter to choose from for visual synthesis; second, for use in copy-synthesis experiments that check each stage separately.
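The Finite State Machine idea for visual synthesis can be sketched as follows. The posture states and allowed transitions here are invented for illustration and do not reflect the project's actual model; in the real system, transitions would drive interpolated, natural-looking movements between postures.

```python
# Hypothetical FSM sketch for visual laughter synthesis: each state stands
# for a body/head/facial posture, and edges list the postures reachable
# from it. State names are illustrative only.

LAUGH_FSM = {
    "neutral":     ["smile_onset"],
    "smile_onset": ["laugh_apex", "neutral"],
    "laugh_apex":  ["laugh_decay"],
    "laugh_decay": ["neutral", "smile_onset"],
}

def next_states(state):
    """Postures reachable in one transition from the given posture."""
    return LAUGH_FSM.get(state, [])

def is_valid_sequence(seq):
    """Check that a posture sequence only uses transitions the FSM allows."""
    return all(b in LAUGH_FSM.get(a, []) for a, b in zip(seq, seq[1:]))
```

A generation module would walk such a graph to assemble a laughter episode, with the synthesis stage rendering the movement along each transition.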
WP4: Laughter-enabled Dialog Modelling
The objective of WP4 is
to design an adaptive and data-driven multimodal dialogue management system for
achieving natural and user-friendly interactions integrating laughter. The
dialogue management task will focus on non-linguistic information to generate
acceptable and appropriate types of laughter at appropriate times. Machine
learning methods such as reinforcement learning will serve to optimise the
interaction strategy and to decide the optimal moments at which to generate
laughter. The system will first use existing data and data collected in WP1 to generate
baseline dialogue strategies, and inputs from WP2 for adapting the strategy
online afterwards. It will provide information to the laughter synthesis system
developed in WP3.
More specifically, the goals to achieve in this WP will be to:
- Develop and evaluate a robust and adaptive dialogue manager capable of integrating information from the multimodal analysis of the user inputs and laughter recognition (WP2);
- Develop data-driven Reinforcement Learning methods able to handle large state spaces for optimising dialogue management, based on the current context, user inputs, and laughter gathered from WP2 and data collected in WP1;
- Develop imitation-based algorithms based on rules or learning from data to artificially expand data sets and to mimic human users.
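A minimal tabular Q-learning sketch of the reinforcement-learning idea above follows; the states, actions, and reward values are invented for illustration, and ILHAIRE's actual dialogue manager handles far larger, data-driven state spaces.

```python
from collections import defaultdict

# Hedged sketch of reinforcement learning for laughter timing.
# "user_laughed", the two actions, and the rewards are hypothetical.

ACTIONS = ["stay_silent", "laugh"]

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

Q = defaultdict(float)
# Toy episodes: laughing right after the user laughs earns a positive reward,
# staying silent a small penalty.
for _ in range(200):
    q_update(Q, "user_laughed", "laugh", 1.0, "end")
    q_update(Q, "user_laughed", "stay_silent", -0.5, "end")
```

After these toy updates, the learned values favour laughing in the "user_laughed" state, which is the kind of timing decision the dialogue manager must optimise from real interaction data.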
WP5: Psychological Foundations of Laughter
Work Package 5 will lay the psychological foundations of laughter. The goals are to understand the factors affecting the conveyance and perception of laughter expressions. Psychologically appropriate methods (quantitative and qualitative) for assessing affective and cognitive responses to laughing avatars will be developed. In experimental settings, personality characteristics such as trait cheerfulness or dispositions towards being laughed at and ridicule (e.g., gelotophobia, the fear of being laughed at) will be considered, as they predispose individuals to different responses to laughter and laughter-related stimuli. It is expected that persons with different personalities will respond differently to avatars and their laughter. Knowledge of those differences will help identify the factors that make an avatar's laughter contagious and positively valued (sound, facial expression, intensity). A further goal is to develop a model of multimodal mimicry, counter-mimicry, and emotional contagion responses, also with a focus on cultural differences.
WP6: Integration and Evaluations
This WP has two main tasks, one related to the integration of the various technologies developed within ILHAIRE, and one regarding the evaluation of the integrated system and of its components. The first task is concerned with integrating the various components developed within the ILHAIRE project (laughter recognition, dialogue manager, laughing conversational model, contagion model, etc.). There will be 3 phases for the system integration, one per year. Regarding the evaluation studies, we will verify the contribution of expressive features in laughter through experimental procedures. Congruent and incongruent combinations of facial expression and body movement arising during acted and spontaneous laughter will be shown to participants. Both quantitative and qualitative approaches (e.g., semi-structured interviews, effects on task performance) will be used to evaluate the level of emotional contagion triggered by the avatar's laughter expressions. The results will be used to refine the probabilistic emotional contagion model driving the embodied conversational agent (ECA).
WP7: Dissemination and Exploitation
This work package includes the creation of this website for the dissemination of project results. Other communication channels will also be used, through participation in conferences and the creation of showcases such as a laughter authenticity detector, a hilarious contagious laughter machine, and a laughter-driven virtual agent. These showcases will be inspired by the evaluation WP. Other actions will be carried out to maximize the reuse potential of project knowledge and software, in connection with other projects of the partners. The WP will also establish the dimensions along which the evaluation will be conducted, and perform those evaluations, in particular in terms of acceptability, believability, added value and impact.
Publications
McKeown, G., Cowie, R., Curran, W., Ruch, W., & Douglas-Cowie, E. (2012). ILHAIRE Laughter Database. 4th International Workshop on Emotion Sentiment & Social Signals (ES³ 2012) - Corpora for Research on Emotion, Sentiment & Social Signals, held in conjunction with LREC 2012, ELRA, Istanbul, Turkey, pp. 32-35.
Hofmann, J., Ruch, W., & Platt, T. (2012). The en- and decoding of schadenfreude laughter. Sheer joy expressed by a Duchenne laugh or emotional blend with a distinct morphological expression? Interdisciplinary Workshop on Laughter and other Non-Verbal Vocalisations in Speech Proceedings, Dublin, Ireland, 26-27 of October, pp. 8-10.
Niewiadomski, R., Pammi, S., Sharma, A., Hofmann, J., Platt, T., Cruz, R. T., & Qu, B. (2012). Visual laughter synthesis: Initial approaches. Interdisciplinary Workshop on Laughter and other Non-Verbal Vocalisations in Speech Proceedings, Dublin, Ireland, 26-27 of October, pp. 10-12.
Ruch, W. (2012). Towards a New Structural Model of the Sense of Humor: Preliminary Findings. AAI Symposium on Artificial Intelligence of Humor Proceedings, Washington, USA, 2-4 of November.
Niewiadomski, R., Hofmann, J., Urbain, J., Platt, T., Wagner, J., Piot, B., Cakmak, H., Pammi, S., Baur, T., Dupont, S., Geist, M., Lingenfelser, F., McKeown, G., Pietquin, O., & Ruch, W. (in press). Laugh-aware virtual agent and its impact on user amusement. In T. Ito, C. Jonker, M. Gini, and O. Shehory (Eds.), Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2013). May, 6–10, 2013, Saint Paul, Minnesota, USA.
Platt, T., Hofmann, J., Ruch, W., Niewiadomski, R., & Urbain, J. (2012). Experimental standards in research on AI and humor when considering psychology. Fall Symposium on Artificial Intelligence of Humor Proceedings, Washington, USA, 2-4 of November, pp. 54-61.
Urbain, J., Niewiadomski, R., Hofmann, J., Bantegnie, E., Baur, T., Berthouze, N., Cakmak, H., Cruz, R. T., Dupont, S., Geist, M., Griffin, H., Lingenfelser, F., Mancini, M., Miranda, M., McKeown, G., Pammi, S., Pietquin, O., Piot, B., Platt, T., Ruch, W., Sharma, A., Volpe, G., & Wagner, J. (2013). Laugh Machine. Proceedings eNTERFACE’12. The 8th International Summer Workshop on Multimodal Interfaces, 2nd – 27th of July 2012 (pp. 13-34). Metz, France: Supélec.