Investigating Speech Enhancement and Perceptual Quality for Speech Emotion Recognition

Cited by: 10
Authors
Avila, Anderson R. [1, 2]
Alam, Jahangir [2]
O'Shaughnessy, Douglas [1]
Falk, Tiago H. [1]
Affiliations
[1] Univ Quebec, INRS EMT, Ste Foy, PQ, Canada
[2] CRIM, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
speech recognition; human-computer interaction; computational paralinguistics;
DOI
10.21437/Interspeech.2018-2350
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, the performance of two enhancement algorithms is investigated in terms of perceptual quality as well as with respect to their impact on speech emotion recognition (SER). The SER system adopted is based on the benchmark system provided for the AVEC 2016 Challenge. The three objective measures adopted are the speech-to-reverberation modulation energy ratio (SRMR), the perceptual evaluation of speech quality (PESQ), and the perceptual objective listening quality assessment (POLQA). Evaluations are conducted on speech files from the RECOLA dataset, which provides spontaneous interactions in French among 27 subjects. Clean speech files are corrupted with different levels of background noise and reverberation. Results show that applying enhancement prior to the SER task can improve SER performance in the more degraded scenarios. We also show that quality measures can be important assets as indicators of how well an enhancement algorithm serves SER, with SRMR and POLQA providing the most reliable results.
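The abstract states that clean speech is corrupted with background noise and reverberation at different levels, but does not say how. The following is a minimal sketch of one common way to do this, not the authors' procedure: noise is scaled to a target signal-to-noise ratio (SNR) and reverberation is simulated by convolving with a room impulse response. The function names, the synthetic sine "speech" signal, the white noise, and the exponentially decaying impulse response are all illustrative assumptions.

```python
import numpy as np


def add_noise_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the clean-to-noise power ratio is `snr_db` dB."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat or trim the noise so it covers the whole clean signal.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power required to reach the requested SNR.
    target_noise_power = clean_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return clean + scale * noise


def add_reverb(clean, rir):
    """Convolve `clean` with a room impulse response and trim to the input length."""
    clean = np.asarray(clean, dtype=float)
    return np.convolve(clean, rir)[: len(clean)]


def measured_snr_db(clean, noisy):
    """SNR in dB, treating `noisy - clean` as the noise component."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Illustrative stand-ins (assumptions): a 1 s sine tone instead of speech,
# white noise instead of recorded background noise, and a synthetic
# exponentially decaying impulse response instead of a measured room response.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2.0 * np.pi * 220.0 * t)
noise = rng.standard_normal(fs // 2)

noisy_10db = add_noise_at_snr(clean, noise, snr_db=10.0)

rir_len = fs // 4
rir = rng.standard_normal(rir_len) * np.exp(-np.arange(rir_len) / (0.05 * fs))
rir[0] = 1.0  # direct path
reverberant = add_reverb(clean, rir)
```

Sweeping `snr_db` over a range of values, or varying the decay of the impulse response, would give a set of conditions running from mild to severe degradation, similar in spirit to the graded corruption levels the abstract describes.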
Pages: 3663-3667
Page count: 5
Related papers
50 in total
  • [1] Selective Acoustic Feature Enhancement for Speech Emotion Recognition With Noisy Speech
    Leem, Seong-Gyun
    Fulford, Daniel
    Onnela, Jukka-Pekka
    Gard, David
    Busso, Carlos
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 917 - 929
  • [2] Using Speech Enhancement Preprocessing for Speech Emotion Recognition in Realistic Noisy Conditions
    Zhou, Hengshun
    Du, Jun
    Tu, Yan-Hui
    Lee, Chin-Hui
    [J]. INTERSPEECH 2020, 2020, : 4098 - 4102
  • [3] Voice Quality Features for Speech Emotion Recognition
    Idris, Inshirah
    Salam, Md Sah Hj
    [J]. JOURNAL OF INFORMATION ASSURANCE AND SECURITY, 2015, 10 (04): : 183 - 191
  • [4] Perceptual speech modeling for noisy speech recognition
    Wu, CH
    Chiu, YH
    Lim, H
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 385 - 388
  • [5] Speech Emotion Recognition
    Lalitha, S.
    Madhavan, Abhishek
    Bhushan, Bharath
    Saketh, Srinivas
    [J]. 2014 INTERNATIONAL CONFERENCE ON ADVANCES IN ELECTRONICS, COMPUTERS AND COMMUNICATIONS (ICAECC), 2014,
  • [6] Towards Robust Speech Emotion Recognition using Deep Residual Networks for Speech Enhancement
    Triantafyllopoulos, Andreas
    Keren, Gil
    Wagner, Johannes
    Steiner, Ingmar
    Schuller, Bjorn W.
    [J]. INTERSPEECH 2019, 2019, : 1691 - 1695
  • [7] Investigating Graph-based Features for Speech Emotion Recognition
    Pentari, Anastasia
    Kafentzis, George
    Tsiknakis, Manolis
    [J]. 2022 IEEE-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL AND HEALTH INFORMATICS (BHI) JOINTLY ORGANISED WITH THE IEEE-EMBS INTERNATIONAL CONFERENCE ON WEARABLE AND IMPLANTABLE BODY SENSOR NETWORKS (BSN'22), 2022,
  • [8] Convolutional Neural Network with Spectrogram and Perceptual Features for Speech Emotion Recognition
    Zhang, Linjuan
    Wang, Longbiao
    Dang, Jianwu
    Guo, Lili
    Guan, Haotian
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2018), PT IV, 2018, 11304 : 62 - 71
  • [9] Bayesian Separation With Sparsity Promotion in Perceptual Wavelet Domain for Speech Enhancement and Hybrid Speech Recognition
    Shao, Yu
    Chang, Chip-Hong
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2011, 41 (02): : 284 - 293
  • [10] English speech emotion recognition method based on speech recognition
    Liu, Man
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2022, 25 (2) : 391 - 398