Research on Speech Emotion Recognition Based on the Fractional Fourier Transform

Cited by: 10
Authors
Huang, Lirong [1]
Shen, Xizhong [1]
Affiliation
[1] Shanghai Inst Technol, Sch Elect & Elect Engn, Shanghai 201418, Peoples R China
Keywords
speech emotion recognition; fractional Fourier transform; MFCC; LSTM; RAVDESS; ambiguity function; ORDER
DOI
10.3390/electronics11203393
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speech emotion recognition is an important part of human-computer interaction, and a key step is using computers to analyze emotions and extract speech emotion features that yield high recognition rates. We applied the Fractional Fourier Transform (FrFT) to extract Mel-frequency cepstral coefficients (MFCC) and combined them with a deep learning method for speech emotion recognition. Because the performance of the FrFT depends on the transform order p, we used an ambiguity function to determine the optimal order for each speech frame, and the MFCC was extracted at that optimal order. The resulting features were then passed to a long short-term memory (LSTM) network for emotion recognition. Experiments were conducted on the RAVDESS dataset, and detailed confusion matrices and accuracy figures are reported for analysis. The MFCC extracted with the FrFT performed better than the MFCC extracted with the ordinary Fourier transform (FT), and the proposed model achieved a weighted accuracy of 79.86%.
Pages: 11