Speaker-Independent Speech Emotion Recognition Based on CNN-BLSTM and Multiple SVMs

Cited by: 4
Authors
Liu, Zhen-Tao [1 ,2 ]
Xiao, Peng [1 ,2 ]
Li, Dan-Yun [1 ,2 ]
Hao, Man [1 ,2 ]
Affiliations
[1] China Univ Geosci, Sch Automat, Wuhan 430074, Hubei, Peoples R China
[2] Hubei Key Lab Adv Control & Intelligent Automat C, Wuhan 430074, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Speech emotion recognition; Speaker-independent; Long short-term memory network; Support vector machine; FEATURES; MODEL;
DOI
10.1007/978-3-030-27535-8_43
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speaker-independent speech emotion recognition (SER) is a complex task because of variations among speakers, such as gender, age, and other emotion-irrelevant factors, which may lead to large differences in the distribution of emotional features. To alleviate the adverse effects of these emotion-irrelevant factors, we propose an SER model that consists of a convolutional neural network (CNN), an attention-based bidirectional long short-term memory network (BLSTM), and multiple linear support vector machines (SVMs). The log Mel-spectrogram, together with its velocity (delta) and acceleration (double-delta) coefficients, is adopted as the input of our model, since these provide sufficient information for feature learning. Several groups of speaker-independent SER experiments are performed on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database to improve the credibility of the results. Experimental results show that our method obtains an unweighted average recall of 61.50% and a weighted average recall of 62.31% for speaker-independent SER on the IEMOCAP database.
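The abstract does not give implementation details for the input features. A minimal sketch of the static + delta + double-delta input stack, assuming the standard regression-based delta formula with half-window N = 2 (the window size and channel layout are assumptions, not taken from the paper), could look like:

```python
import numpy as np

def delta(feat: np.ndarray, N: int = 2) -> np.ndarray:
    """Regression-based delta coefficients along the time axis (axis 0).

    feat: (T, n_mels) log-Mel spectrogram; N: half-window size.
    Edge frames are handled by repeating the first/last frame.
    """
    T = feat.shape[0]
    denom = 2 * sum(n * n for n in range(1, N + 1))
    padded = np.pad(feat, ((N, N), (0, 0)), mode="edge")
    out = np.zeros(feat.shape, dtype=float)
    for n in range(1, N + 1):
        out += n * (padded[N + n : N + n + T] - padded[N - n : N - n + T])
    return out / denom

# Toy "log-Mel" matrix: 100 frames x 40 Mel bands.
logmel = np.random.default_rng(0).standard_normal((100, 40))
d1 = delta(logmel)   # velocity (delta)
d2 = delta(d1)       # acceleration (double delta)
# Stack as a 3-channel image, a common CNN input layout for such models.
x = np.stack([logmel, d1, d2], axis=0)   # shape (3, 100, 40)
```

A constant spectrogram yields zero deltas, which is a quick sanity check for the regression formula.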
Pages: 481-491
Page count: 11
Related papers (50 in total)
  • [1] Gender-Aware CNN-BLSTM for Speech Emotion Recognition
    Zhang, Linjuan
    Wang, Longbiao
    Dang, Jianwu
    Guo, Lili
    Yu, Qiang
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT I, 2018, 11139 : 782 - 790
  • [2] Speaker-Independent Speech Emotion Recognition Based Multiple Kernel Learning of Collaborative Representation
    Zha, Cheng
    Zhang, Xinrang
    Zhao, Li
    Liang, Ruiyu
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (03) : 756 - 759
  • [3] Speaker-Independent Speech Emotion Recognition Based on Two-Layer Multiple Kernel Learning
    Jin, Yun
    Song, Peng
    Zheng, Wenming
    Zhao, Li
    Xin, Minghai
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2013, E96D (10): : 2286 - 2289
  • [4] Speaker Adversarial Neural Network (SANN) for Speaker-independent Speech Emotion Recognition
    Md Shah Fahad
    Ashish Ranjan
    Akshay Deepak
    Gayadhar Pradhan
    [J]. Circuits, Systems, and Signal Processing, 2022, 41 : 6113 - 6135
  • [6] Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition
    Lu, Cheng
    Zong, Yuan
    Zheng, Wenming
    Li, Yang
    Tang, Chuangao
    Schuller, Bjoern W.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2217 - 2230
  • [7] Speaker-independent Speech Emotion Recognition Based on Random Forest Feature Selection Algorithm
    Cao, Wei-Hua
    Xu, Jian-Ping
    Liu, Zhen-Tao
    [J]. PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 10995 - 10998
  • [9] A lightweight 2D CNN based approach for speaker-independent emotion recognition from speech with new Indian Emotional Speech Corpora
    Singh, Youddha Beer
    Goel, Shivani
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (15) : 23055 - 23073