Investigating Speech Enhancement and Perceptual Quality for Speech Emotion Recognition

Cited by: 10
Authors
Avila, Anderson R. [1, 2]
Alam, Jahangir [2]
O'Shaughnessy, Douglas [1]
Falk, Tiago H. [1]
Affiliations
[1] Univ Quebec, INRS EMT, Ste Foy, PQ, Canada
[2] CRIM, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
speech recognition; human-computer interaction; computational paralinguistics
DOI
10.21437/Interspeech.2018-2350
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study investigates two speech enhancement algorithms in terms of perceptual quality and of their impact on speech emotion recognition (SER). The SER system is based on the benchmark system provided for the AVEC Challenge 2016. Three objective measures are adopted: the speech-to-reverberation modulation energy ratio (SRMR), the perceptual evaluation of speech quality (PESQ), and the perceptual objective listening quality assessment (POLQA). Evaluations are conducted on speech files from the RECOLA dataset, which provides spontaneous interactions in French from 27 subjects. Clean speech files are corrupted with different levels of background noise and reverberation. Results show that applying enhancement prior to the SER task can improve SER performance in the more degraded scenarios. We also show that quality measures can be an important asset as indicators of how well enhancement algorithms serve SER, with SRMR and POLQA providing the most reliable results.
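The abstract does not say how the clean files were corrupted or which noise levels were used. As a rough illustration only, the sketch below shows one common way to add background noise at a chosen signal-to-noise ratio (SNR). The function names `mix_at_snr` and `measure_snr_db`, the synthetic sine and Gaussian signals, and the SNR values 0, 10, and 20 dB are all assumptions made for this example; none of them come from the paper, and reverberation is not covered.

```python
import math
import random


def mix_at_snr(clean, noise, target_snr_db):
    """Add noise to a clean signal so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not the procedure used in the paper.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    if clean_power == 0.0 or noise_power == 0.0:
        raise ValueError("signals must have non-zero power")
    # Gain that brings the noise to the power implied by the target SNR.
    gain = math.sqrt(clean_power / (noise_power * 10.0 ** (target_snr_db / 10.0)))
    scaled_noise = [gain * n for n in noise]
    noisy = [x + n for x, n in zip(clean, scaled_noise)]
    return noisy, scaled_noise


def measure_snr_db(signal, noise):
    """Return the SNR in dB between a signal and a noise sequence."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(signal_power / noise_power)


# Synthetic stand-ins for speech and background noise (assumed, 1 s at 16 kHz).
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * 220.0 * t / 16000.0) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for target in (0, 10, 20):  # arbitrary example levels
    noisy, scaled = mix_at_snr(clean, noise, target)
    print(target, round(measure_snr_db(clean, scaled), 6))
```

Scaling the noise rather than the speech leaves the level of the clean signal unchanged across conditions, which keeps the different degradation levels comparable.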
Pages: 3663-3667
Page count: 5