Research on Speech Emotion Recognition Based on the Fractional Fourier Transform

Cited by: 10
Authors
Huang, Lirong [1]
Shen, Xizhong [1]
Affiliations
[1] Shanghai Inst Technol, Sch Elect & Elect Engn, Shanghai 201418, Peoples R China
Keywords
speech emotion recognition; fractional Fourier transform; MFCC; LSTM; RAVDESS; ambiguity function; order
DOI
10.3390/electronics11203393
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speech emotion recognition is an important part of human-computer interaction, and extracting speech emotion features that yield high recognition rates is a key step. We used the fractional Fourier transform (FrFT) to extract Mel-frequency cepstral coefficients (MFCC) and combined them with a deep learning method for speech emotion recognition. Because the performance of the FrFT depends on the transform order p, we used an ambiguity function to determine the optimal order for each frame of speech, and the MFCC were extracted under that optimal order. The features were then combined with the LSTM deep learning network for speech emotion recognition. Our experiments were conducted on the RAVDESS dataset, and detailed confusion matrices and accuracies are given for analysis. The MFCC extracted using the FrFT performed better than those extracted using the ordinary Fourier transform (FT), and the proposed model achieved a weighted accuracy of 79.86%.
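For readers unfamiliar with the transform order p mentioned in the abstract, the commonly used textbook definition of the continuous FrFT is sketched below. This is the standard form, not necessarily the notation used in the paper itself; the symbols X_p, K_p, u, and the rotation angle α are assumptions introduced here for illustration.

```latex
\[
  X_p(u) = \int_{-\infty}^{\infty} K_p(u,t)\, x(t)\, dt,
  \qquad \alpha = \frac{p\pi}{2},
\]
\[
  K_p(u,t) =
  \begin{cases}
    \sqrt{1 - j\cot\alpha}\;
      \exp\!\bigl( j\pi \bigl[ (t^2 + u^2)\cot\alpha - 2ut\csc\alpha \bigr] \bigr),
      & \alpha \neq n\pi, \\[4pt]
    \delta(u - t), & \alpha = 2n\pi, \\[4pt]
    \delta(u + t), & \alpha = (2n+1)\pi .
  \end{cases}
\]
```

Under this definition, p = 1 gives cot α = 0 and csc α = 1, so the kernel reduces to exp(−j2πut) and the FrFT coincides with the ordinary Fourier transform; p = 0 returns the signal unchanged. Intermediate orders rotate the signal in the time-frequency plane, which is why the choice of p affects the extracted features.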
Pages: 11
Related papers
50 in total
  • [41] Noise Removal in Speech signal using Fractional Fourier Transform
    Kumar, Prafulla
    Kansal, Sarita
    2017 IEEE INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION, INSTRUMENTATION AND CONTROL (ICICIC), 2017
  • [42] English speech emotion recognition method based on speech recognition
    Liu, Man
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2022, 25 (2) : 391 - 398
  • [44] Speech emotion recognition research: an analysis of research focus
    Mustafa, Mumtaz Begum
    Yusoof, Mansoor A. M.
    Don, Zuraidah M.
    Malekzadeh, Mehdi
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (01) : 137 - 156
  • [45] Research on Teacher Classroom Teaching Speech Emotion Recognition Based on LSTM
    He, Yimin
    Lu, Xiaoyong
    Sun, Dan
    Pan, Tao
    Qiu, Yuqing
    Liu, Jiahong
    2024 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, IALP 2024, 2024 : 326 - 331
  • [46] Research on Speech Emotion Recognition Based on AA-CBGRU Network
    Yan, Yu
    Shen, Xizhong
    ELECTRONICS, 2022, 11 (09)
  • [47] Research on speech emotion recognition based on deep auto-encoder
    Wang, Fei
    Ye, Xiaofeng
    Sun, Zhaoyu
    Huang, Yujia
    Zhang, Xing
    Shang, Shengxing
    2016 IEEE INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (CYBER), 2016 : 308 - 312
  • [48] Research on Mandarin Chinese in Speech Emotion Recognition
    Wang, Ziyun
    Guo, Xiao
    2022 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING, MLNLP 2022, 2022 : 99 - 103
  • [49] A Research of Speech Emotion Recognition Based on Deep Belief Network and SVM
    Huang, Chenchen
    Gong, Wei
    Fu, Wenlong
    Feng, Dongyu
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [50] Palmprint recognition based on Fourier transform
    Li, Wen-Xin
    Zhang, David
    Xu, Zhuo-Qun
    Ruan Jian Xue Bao/Journal of Software, 2002, 13 (05) : 879 - 886