Spectral and Cepstral Audio Noise Reduction Techniques in Speech Emotion Recognition

Cited by: 24
Authors
Pohjalainen, Jouni [1]
Ringeval, Fabien [1, 2]
Zhang, Zixing [1]
Schuller, Bjoern [1, 3]
Affiliations
[1] Univ Passau, Chair Complex & Intelligent Syst, Passau, Germany
[2] Univ Grenoble Alpes, Lab Informat Grenoble, Grenoble, France
[3] Imperial Coll London, Dept Comp, London, England
Keywords
noise reduction; denoising; speech emotion recognition; LINEAR PREDICTION; ENHANCEMENT; ESTIMATOR;
DOI
10.1145/2964284.2967306
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Signal noise reduction can improve the performance of machine learning systems dealing with time signals such as audio. Real-life applicability of these recognition technologies requires the system to uphold its performance level in variable, challenging conditions such as noisy environments. In this contribution, we investigate audio signal denoising methods in cepstral and log-spectral domains and compare them with common implementations of standard techniques. The different approaches are first compared generally using averaged acoustic distance metrics. They are then applied to automatic recognition of spontaneous and natural emotions under simulated smartphone-recorded noisy conditions. Emotion recognition is implemented as support vector regression for continuous-valued prediction of arousal and valence on a realistic multimodal database. In the experiments, the proposed methods are found to generally outperform standard noise reduction algorithms.
Pages: 670-674
Page count: 5
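
Illustrative sketch: the abstract compares the proposed cepstral and log-spectral denoising methods against "standard techniques" but does not name them. As a point of orientation only, the following is a minimal textbook magnitude spectral subtraction in NumPy, one widely used baseline of that kind. It is not the authors' method. The function name `spectral_subtraction`, its parameters (`frame_len`, `hop`, `noise_frames`, `floor`), and the assumption that the first few frames contain noise only are all hypothetical choices made for this example.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=256, hop=128, noise_frames=5, floor=0.01):
    """Textbook magnitude spectral subtraction (illustrative baseline only)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop
    # Short-time Fourier transform: one windowed frame per row.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: the first `noise_frames` frames contain noise only.
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the noise estimate, keeping a small spectral floor
    # so that no bin is driven to zero or below.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase),
                                n=frame_len, axis=1)
    # Weighted overlap-add resynthesis with the same window.
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += clean_frames[i] * window
        norm[start:start + frame_len] += window ** 2
    # The lower bound on `norm` avoids amplifying the signal edges,
    # where the window weights are close to zero.
    return out / np.maximum(norm, 1e-3)

# Usage: white noise with a sine tone starting after a noise-only lead-in.
rng = np.random.default_rng(0)
noisy = 0.1 * rng.standard_normal(4096)
noisy[1024:] += np.sin(2 * np.pi * 0.05 * np.arange(3072))
denoised = spectral_subtraction(noisy)
```

The methods studied in the paper operate in the cepstral and log-spectral domains rather than directly on the magnitude spectrum, so this sketch should be read only as an example of the baseline category, under the stated assumptions.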