Speech Emotion Recognition with Multi-task Learning

Cited by: 23

Authors:

Cai, Xingyu [1]

Yuan, Jiahong [1]

Zheng, Renjie [1]

Huang, Liang [1]

Church, Kenneth [1]

Affiliation:

[1] Baidu Res, Sunnyvale, CA 94089 USA

Source:

INTERSPEECH 2021 | 2021

Keywords:

speech emotion recognition; multi-task learning; MODELS;

DOI:

10.21437/Interspeech.2021-1852

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

Speech emotion recognition (SER) classifies speech into emotion categories such as Happy, Angry, Sad, and Neutral. Recently, deep learning has been applied to the SER task. This paper proposes a multi-task learning (MTL) framework that simultaneously performs speech-to-text recognition and emotion classification, using an end-to-end deep neural model based on wav2vec-2.0. Experiments on the IEMOCAP benchmark show that the proposed method achieves state-of-the-art performance on the SER task. In addition, an ablation study establishes the effectiveness of the proposed MTL framework.
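
The abstract does not spell out how the two tasks are combined during training. One common way to train a shared encoder on two tasks is a weighted sum of the per-task losses, and the sketch below illustrates only that general idea. The names `multitask_loss`, `asr_loss`, and `alpha`, and the value 0.1, are assumptions made for illustration and are not taken from the paper.

```python
import math

EMOTIONS = ["Happy", "Angry", "Sad", "Neutral"]  # categories named in the abstract


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one utterance (numerically stable)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target_index]


def multitask_loss(emotion_logits, emotion_label, asr_loss, alpha=0.1):
    """Hypothetical joint objective: emotion loss plus alpha-weighted ASR loss.

    asr_loss stands in for whatever speech-to-text loss the shared encoder
    is trained with; alpha is an assumed weighting knob, not a value
    taken from the paper.
    """
    emotion_loss = cross_entropy(emotion_logits, EMOTIONS.index(emotion_label))
    return emotion_loss + alpha * asr_loss


# Setting alpha to 0 drops the auxiliary speech-to-text task, which is the
# kind of comparison an ablation study could make.
logits = [2.0, 0.5, 0.1, 0.3]  # made-up emotion scores for one utterance
joint = multitask_loss(logits, "Happy", asr_loss=1.5, alpha=0.1)
emotion_only = multitask_loss(logits, "Happy", asr_loss=1.5, alpha=0.0)
```

In a real system the emotion scores and the speech-to-text loss would both be computed from the same wav2vec-2.0 representations, so that gradients from both tasks update the shared encoder.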


Pages: 4508-4512

Page count: 5

