Item Type | Conference or Workshop Item (Paper) |
Abstract | Recognizing the emotions of the elderly is important as it could give an insight into their mental health. Emotion recognition systems that work well on the elderly could be used to assess their emotions in places such as nursing homes and could inform the development of various activities and interventions to improve their mental health. However, several emotion recognition systems are developed using data from younger adults. In this work, we train machine learning models to recognize the emotions of elderly individuals by performing a 3-class classification of valence and arousal as part of the INTERSPEECH 2020 Computational Paralinguistics Challenge (COMPARE). We used speech data from 87 participants who gave spontaneous personal narratives. We leveraged a transfer learning approach in which we used pretrained CNN and BERT models to extract acoustic and linguistic features, respectively, and fed them into separate machine learning models. We also fused these two modalities in a multimodal approach. Our best model used a linguistic approach and outperformed the official competition baseline in unweighted average recall (UAR) by 8.8% for valence and by 3.2% for the mean of valence and arousal. We also showed that feature engineering is not necessary, as transfer learning without fine-tuning performs as well as or better and could be leveraged for the task of recognizing the emotions of elderly individuals. This work is a step towards better recognition of the emotions of the elderly, which could eventually inform the development of interventions to manage their mental health. |
Authors | Boateng, George & Kowatsch, Tobias |
Language | English |
Subjects | computer science, information management, social sciences, health sciences |
HSG Classification | contribution to scientific community |
HSG Profile Area | SoM - Business Innovation |
Date | 25 October 2020 |
Publisher | ACM |
Place of Publication | New York, NY, USA |
Page Range | 12-16 |
Title of Book | ICMI 2020 Late Breaking Results, Companion |
Event Title | 22nd ACM International Conference on Multimodal Interaction (ICMI) |
Event Location | virtual |
Event Dates | October 25-29 |
Official URL | https://doi.org/10.1145/3395035.3425255 |
Depositing User | Prof. Dr. Tobias Kowatsch |
Date Deposited | 04 Jan 2021 08:32 |
Last Modified | 04 Jan 2021 08:42 |
URI | https://www.alexandria.unisg.ch/publications/261890 |
Citation | Boateng, George & Kowatsch, Tobias: Speech Emotion Recognition among Elderly Individuals using Multimodal Fusion and Transfer Learning. 2020. - 22nd ACM International Conference on Multimodal Interaction (ICMI). - virtual. |
Statistics | https://www.alexandria.unisg.ch/id/eprint/261890 |