Speech Emotion Recognition among Elderly Individuals using Multimodal Fusion and Transfer Learning

Item Type Conference or Workshop Item (Paper)

Recognizing the emotions of the elderly is important because it can give insight into their mental health. Emotion recognition systems that work well for the elderly could be used to assess their emotions in settings such as nursing homes and could inform the development of activities and interventions to improve their mental health. However, several emotion recognition systems are developed using data from younger adults. In this work, we trained machine learning models to recognize the emotions of elderly individuals by performing a 3-class classification of valence and arousal as part of the INTERSPEECH 2020 Computational Paralinguistics Challenge (COMPARE). We used speech data from 87 participants who gave spontaneous personal narratives. We took a transfer learning approach in which we used pretrained CNN and BERT models to extract acoustic and linguistic features, respectively, and fed them into separate machine learning models. We also fused the two modalities in a multimodal approach. Our best model used a linguistic approach and outperformed the official competition baseline in unweighted average recall (UAR) by 8.8% for valence and by 3.2% for the mean of valence and arousal. We also showed that feature engineering is not necessary, as transfer learning without fine-tuning performs as well or better and could be leveraged for recognizing the emotions of elderly individuals. This work is a step towards better recognition of the emotions of the elderly, which could eventually inform the development of interventions to manage their mental health.
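To make the evaluation metric and the fusion idea concrete, the following is a minimal sketch, not the authors' implementation. It computes unweighted average recall (UAR) as the mean of per-class recalls, and shows one simple way two modalities might be combined, namely feature-level concatenation. The function names, the toy labels, and the choice of concatenation as the fusion scheme are all assumptions made for illustration; the abstract does not specify how the fusion was done.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally
    regardless of how many samples it has."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def fuse_features(acoustic, linguistic):
    """Hypothetical feature-level fusion: concatenate the acoustic and
    linguistic feature vectors of each sample (an assumed scheme)."""
    return [list(a) + list(l) for a, l in zip(acoustic, linguistic)]


# Toy, made-up labels for a 3-class task with an imbalanced class distribution.
y_true = ["low"] * 6 + ["medium", "high"]
y_pred = ["low"] * 6 + ["low", "high"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy feature vectors standing in for CNN (acoustic) and BERT (linguistic) outputs.
fused = fuse_features([[0.1, 0.2]], [[1.0]])
```

In this toy case the accuracy is 7/8 while the UAR is 2/3, because the single missed "medium" sample drags that class's recall to zero. This is why UAR is a common choice when classes are imbalanced.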

Authors Boateng, George & Kowatsch, Tobias
Language English
Subjects computer science, information management, social sciences, health sciences
HSG Classification contribution to scientific community
HSG Profile Area SoM - Business Innovation
Date 25 October 2020
Publisher ACM
Place of Publication New York, NY, USA
Page Range 12-16
Title of Book ICMI 2020 Late Breaking Results, Companion
Event Title 22nd ACM International Conference on Multimodal Interaction (ICMI)
Event Location virtual
Event Dates October 25-29
Official URL https://doi.org/10.1145/3395035.3425255
Depositing User Prof. Dr. Tobias Kowatsch
Date Deposited 04 Jan 2021 08:32
Last Modified 04 Jan 2021 08:42
URI: https://www.alexandria.unisg.ch/publications/261890



Boateng, George & Kowatsch, Tobias: Speech Emotion Recognition among Elderly Individuals using Multimodal Fusion and Transfer Learning. 2020. - 22nd ACM International Conference on Multimodal Interaction (ICMI). - virtual.

