Speech Emotion Recognition by Combining Amplitude and Phase Information Using Convolutional Neural Network

Cited by: 27
Authors
Guo, Lili [1]
Wang, Longbiao [1]
Dang, Jianwu [1, 2]
Zhang, Linjuan [1]
Guan, Haotian [3]
Li, Xiangang [4]
Affiliations
[1] Tianjin Univ, Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China
[2] Japan Adv Inst Sci & Technol, Nomi, Ishikawa, Japan
[3] Intelligent Spoken Language Technol Tianjin Co, Tianjin, Peoples R China
[4] Didi Chuxing, AI Labs, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
speech emotion recognition; amplitude; phase information; convolutional neural network;
DOI
10.21437/Interspeech.2018-2156
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous studies of speech emotion recognition apply a convolutional neural network (CNN) directly to the amplitude spectrogram to extract features. A CNN combined with bidirectional long short-term memory (BLSTM) has become the state-of-the-art model. However, this model ignores phase information, whose importance in speech processing is attracting growing attention. In this paper, we propose extracting features from both the amplitude spectrogram and phase information using a CNN for speech emotion recognition. The modified group delay cepstral coefficient (MGDCC) and relative phase are used as phase information. First, we analyze the influence of phase information on speech emotion recognition. Then we design a CNN-based feature representation using amplitude and phase information. Finally, experiments are conducted on EmoDB to validate the effectiveness of phase information. By integrating the amplitude spectrogram with phase information, the relative error rate of emotion recognition is reduced by over 33% compared with using only amplitude-based features.
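As a rough illustration of the quantities the abstract refers to, the sketch below splits a short-time Fourier transform into an amplitude spectrogram and a raw phase spectrogram, and computes a relative error reduction. It is not the authors' pipeline: it does not implement MGDCC or relative phase, and the frame length, hop size, FFT size, and the function names `stft`, `amplitude_and_phase`, and `relative_error_reduction` are assumptions chosen for the example.

```python
import numpy as np


def stft(signal, frame_len=400, hop=160, n_fft=512):
    """Short-time Fourier transform with a Hamming window.

    frame_len, hop, and n_fft are illustrative values (25 ms / 10 ms at
    16 kHz), not settings taken from the paper.
    """
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided spectrum: shape (n_frames, n_fft // 2 + 1), complex-valued.
    return np.fft.rfft(frames, n=n_fft, axis=1)


def amplitude_and_phase(signal):
    """Split the complex spectrogram into its magnitude and raw phase."""
    spec = stft(signal)
    amplitude = np.abs(spec)   # what an amplitude-only model sees
    phase = np.angle(spec)     # raw phase in [-pi, pi], discarded by such a model
    return amplitude, phase


def relative_error_reduction(baseline_error, new_error):
    """Fraction by which the error rate drops relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


# One second of a 440 Hz tone sampled at 16 kHz, used as a stand-in for speech.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
amplitude, phase = amplitude_and_phase(tone)
```

With made-up error rates, `relative_error_reduction(0.30, 0.20)` gives about 0.33, which is the kind of figure a "relative reduction of over 33%" describes. These numbers are hypothetical and are not results from the paper.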
Pages: 1611-1615
Number of pages: 5