Research on speech emotion recognition algorithm for unbalanced data set

Cited by: 0

Authors:

Liang Z. [1]

Li X. [1]

Song W. [1]

Affiliations:

[1] Electronic Information Engineering, Changchun University of Science and Technology, Jilin Province

Source:

Journal of Intelligent and Fuzzy Systems | 2020, Vol. 39, No. 3

Keywords:

CRNN; focal loss; spectrograms; speech emotion recognition

DOI:

10.3233/JIFS-191129

Abstract:

In speech emotion recognition, most emotional corpora suffer from inconsistent sample lengths and imbalanced sample categories. To address these problems, this paper proposes a variable-length-input CRNN deep learning model based on focal loss for recognizing anger, happiness, neutrality, and sadness in the IEMOCAP emotional corpus. First, a variable-length strategy is introduced to feed the spectrograms of the padded speech samples into a CNN. Second, a masking matrix together with the convolution layers preserves and outputs only the valid part of the input sequence. Third, this valid output is fed into a BiGRU network for learning. Finally, focal loss is used for network training to control and adjust the contribution of different samples to the total loss. Simulations show that, compared with traditional speech emotion recognition models, the method effectively improves the accuracy and performance of emotion recognition. © 2020 - IOS Press and the authors. All rights reserved.
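
The focal loss mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the standard multi-class form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), and the function names, the default gamma = 2, and the optional per-class weights alpha are illustrative choices that the record does not specify.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def focal_loss(logits, labels, gamma=2.0, alpha=None):
    """Per-sample focal loss (hypothetical helper, standard formulation).

    logits: array of shape (n_samples, n_classes)
    labels: integer class indices of shape (n_samples,)
    gamma:  focusing parameter; gamma = 0 gives plain cross-entropy
    alpha:  optional per-class weights of shape (n_classes,)
    """
    probs = softmax(np.asarray(logits, dtype=float))
    labels = np.asarray(labels)
    # Probability the model assigns to the true class of each sample.
    p_t = probs[np.arange(len(labels)), labels]
    # Down-weight samples that are already classified with high confidence.
    weights = (1.0 - p_t) ** gamma
    if alpha is not None:
        weights = weights * np.asarray(alpha, dtype=float)[labels]
    return -weights * np.log(p_t)


# Four classes, as in the abstract (anger, happiness, neutrality, sadness).
# The logits below are made-up numbers: one easy sample and one hard sample.
example_logits = np.array([[4.0, 0.0, 0.0, 0.0],
                           [0.5, 0.4, 0.3, 0.2]])
example_labels = np.array([0, 0])

cross_entropy = focal_loss(example_logits, example_labels, gamma=0.0)
focal = focal_loss(example_logits, example_labels, gamma=2.0)
```

With gamma = 0 the expression reduces to ordinary cross-entropy. With gamma > 0 the easy sample's loss shrinks far more than the hard sample's, which is how focal loss shifts the total loss toward poorly classified (often minority-class) samples.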

Pages: 2791-2796

Page count: 5
