Reverb and Noise as Real-World Effects in Speech Recognition Models: A Study and a Proposal of a Feature Set

Authors
Cesarini, Valerio [1 ]
Costantini, Giovanni [1 ]
Affiliation
[1] Univ Roma Tor Vergata, Dept Elect Engn, I-00133 Rome, Italy
Source
APPLIED SCIENCES-BASEL | 2024 / Volume 14 / Issue 23
Keywords
speaker recognition; data augmentation; noise; reverb; MFCC; RASTA; speaker verification; SVM
DOI
10.3390/app142311446
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Reverberation and background noise are common and unavoidable real-world phenomena that hinder automatic speaker recognition systems, particularly because these systems are typically trained on noise-free data. Most models rely on fixed audio feature sets. To evaluate the dependency of features on reverberation and noise, this study proposes augmenting the commonly used mel-frequency cepstral coefficients (MFCCs) with relative spectral (RASTA) features. The performance of these features was assessed using noisy data generated by applying reverberation and pink noise to the DEMoS dataset, which includes 56 speakers. Verification models were trained on clean data using MFCCs, RASTA features, or their combination as inputs. The models were then validated on augmented data with progressively increasing noise and reverberation levels. The results indicate that MFCCs struggle to identify the main speaker, while the RASTA method has difficulty with the opposite class. The hybrid feature set, derived from their combination, demonstrates the best overall performance as a compromise between the two. Although the MFCC method is the standard and performs well on clean training data, it shows a significant tendency to misclassify the main speaker in real-world scenarios, which is a critical limitation for modern user-centric verification applications. The hybrid feature set therefore proves effective as a balanced solution, optimizing both sensitivity and specificity.
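The augmentation step described in the abstract (adding pink noise and reverberation to clean speech at progressively harsher levels) can be illustrated with a minimal NumPy sketch. The abstract does not specify the authors' actual pipeline, noise levels, or impulse responses, so everything here is an assumption for illustration only: the function names `pink_noise`, `add_noise_at_snr`, and `add_reverb` are hypothetical, the pink noise is approximated by spectral shaping of white noise, and the reverb uses a synthetic exponentially decaying noise burst rather than a measured room response.

```python
import numpy as np

def pink_noise(n, rng):
    """Approximate pink (1/f power) noise by shaping white noise in the frequency domain."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]            # avoid division by zero at DC
    spectrum /= np.sqrt(freqs)     # 1/sqrt(f) amplitude gives 1/f power
    noise = np.fft.irfft(spectrum, n=n)
    return noise / np.std(noise)

def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mixture has the requested signal-to-noise ratio in dB."""
    sig_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(sig_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + scale * noise

def add_reverb(signal, sample_rate, rt60, rng):
    """Convolve with a synthetic impulse response (assumed, not a measured room)."""
    n_ir = max(1, int(rt60 * sample_rate))
    t = np.arange(n_ir) / sample_rate
    # Amplitude envelope falls by 60 dB at t = rt60 (3 * ln(10) ~= 6.908).
    ir = rng.standard_normal(n_ir) * np.exp(-6.908 * t / rt60)
    ir[0] = 1.0                    # direct path
    wet = np.convolve(signal, ir)[: len(signal)]
    # Rescale so the reverberated signal keeps the RMS level of the input.
    return wet * np.sqrt(np.mean(signal ** 2) / np.mean(wet ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sr = 16000
    clean = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)  # stand-in for a speech clip
    for snr_db in (20, 10, 0):     # progressively harsher conditions (assumed levels)
        degraded = add_noise_at_snr(add_reverb(clean, sr, 0.3, rng),
                                    pink_noise(len(clean), rng), snr_db)
        print(snr_db, degraded.shape)
```

In an evaluation like the one the abstract outlines, features such as MFCCs or RASTA-filtered coefficients would be extracted from the degraded signals and passed to a model trained only on clean data; that feature-extraction step is outside the scope of this sketch.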
Pages: 23