Voicing detection based on adaptive aperiodicity thresholding for speech enhancement in non-stationary noise

Cited by: 4
Authors
Cabanas-Molero, Pablo [1]
Martinez-Munoz, Damian [1]
Vera-Candeas, Pedro [1]
Ruiz-Reyes, Nicolas [1]
Jose Rodriguez-Serrano, Francisco [1]
Affiliations
[1] Univ Jaen, Polytech Sch, Dept Telecommun Engn, Jaen 23700, Spain
Keywords
hearing aids; speech enhancement; signal-to-noise ratios; voicing classifier; speech sentences database; fluctuating noise; signal-adaptive decision; nonstationary noise; adaptive aperiodicity thresholding; voicing detection; FUNDAMENTAL-FREQUENCY ESTIMATION; SPECTRAL SUBTRACTION; ENVIRONMENTS; ESTIMATOR
DOI
10.1049/iet-spr.2012.0224
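Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809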
Abstract
In this study, the authors present a novel voicing detection algorithm which employs the well-known aperiodicity measure to detect voiced speech in signals contaminated with non-stationary noise. The method computes a signal-adaptive decision threshold which takes into account the current noise level, enabling voicing detection by direct comparison with the extracted aperiodicity. This adaptive threshold is updated at each frame by making a simple estimate of the current noise power, and thus is adapted to fluctuating noise conditions. Once the aperiodicity is computed, the method only requires a small number of operations, and enables its implementation in challenging devices (such as hearing aids) if an efficient approximation of the difference function is employed to extract the aperiodicity. Evaluation over a database of speech sentences degraded by several types of noise reveals that the proposed voicing classifier is robust against different noises and signal-to-noise ratios. In addition, to evaluate the applicability of the method for speech enhancement, a simple F0-based speech enhancement algorithm integrating the proposed classifier is implemented. The system is shown to achieve competitive results, in terms of objective measures, when compared with other well-known speech enhancement approaches.
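The idea described in the abstract, comparing a per-frame aperiodicity value against a threshold that follows the current noise power, can be illustrated with a minimal Python sketch. Everything specific here is an assumption and not taken from the paper: the aperiodicity is computed as the minimum of a YIN-style cumulative-mean-normalized difference function, the threshold rule (estimated noise-to-frame power ratio plus a fixed margin, capped) is hypothetical, and the names `aperiodicity`, `detect_voicing`, `margin`, `cap`, and `alpha` are invented for illustration. The first frame is assumed to contain noise only.

```python
import numpy as np


def aperiodicity(frame, tau_min, tau_max):
    """Minimum of a YIN-style cumulative-mean-normalized difference function.

    Assumption: this is one common way to define aperiodicity; the paper's
    exact definition and its efficient approximation are not reproduced here.
    """
    frame = np.asarray(frame, dtype=float)
    w = len(frame) - tau_max  # integration window length
    d = np.zeros(tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff = frame[:w] - frame[tau:tau + w]
        d[tau] = np.dot(diff, diff)  # squared-difference function d(tau)
    taus = np.arange(1, tau_max + 1)
    # d(tau) divided by the running mean of d(1..tau)
    cmnd = d[1:] * taus / np.maximum(np.cumsum(d[1:]), 1e-12)
    return float(np.min(cmnd[tau_min - 1:]))


def detect_voicing(frames, tau_min, tau_max, margin=0.1, cap=0.5, alpha=0.9):
    """Hypothetical adaptive rule: voiced if aperiodicity < threshold.

    For a periodic signal in uncorrelated noise, the normalized difference at
    the true period is roughly noise_power / frame_power, so the threshold is
    set to that ratio plus a margin (an illustrative choice, not the paper's).
    """
    noise_power = None
    decisions = []
    for frame in frames:
        frame = np.asarray(frame, dtype=float)
        power = float(np.mean(frame ** 2))
        if noise_power is None:
            noise_power = power  # assumption: first frame is noise only
        ratio = noise_power / max(power, 1e-12)
        threshold = min(ratio + margin, cap)
        voiced = aperiodicity(frame, tau_min, tau_max) < threshold
        if not voiced:
            # simple recursive noise-power update on unvoiced frames
            noise_power = alpha * noise_power + (1 - alpha) * power
        decisions.append(voiced)
    return decisions
```

With a sampling rate of 8 kHz, for example, `tau_min=20` and `tau_max=100` would cover fundamental frequencies from 80 Hz to 400 Hz. The per-frame cost after the difference function is a handful of arithmetic operations, which matches the abstract's remark that the decision stage itself is cheap.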
Pages: 119-130
Number of pages: 12
Related papers
50 in total
  • [41] Non-stationary content-adaptive projector resolution enhancement
    Hu, Xiaodan
    Naiel, Mohamed A.
    Azimifar, Zohreh
    Ben Daya, Ibrahim
    Lamm, Mark
    Fieguth, Paul
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2021, 97
  • [42] Adaptive Gaussian Filter Based on ICEEMDAN Applying in Non-Gaussian Non-stationary Noise
    Zhang, Yusen
    Xu, Zixin
    Yang, Ling
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2024, 43 (7) : 4272 - 4297
  • [43] Weighted Noise Subtraction and Adaptive Soft-Thresholding Approach to Speech Enhancement
    Das, Somlal
    Hamid, Md. Ekramul
    Hirose, Keikichi
    Molla, Md. Khademul Islam
    2011 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2011, : 2413 - 2416
  • [44] Underwater Non-stationary Acoustic Signal Detection Based on the STHOC Noise Suppression
    Shi, Bo
    Cao, Tianyu
    Ge, Qiqi
    Wang, Zitao
    Guo, Wenbo
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024,
  • [45] Modelling non-stationary noise with spectral factorisation in automatic speech recognition
    Hurmalainen, Antti
    Gemmeke, Jort F.
    Virtanen, Tuomas
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (03): : 763 - 779
  • [46] Markovian Segmentation of Non-stationary Data Corrupted by Non-stationary Noise
    Habbouchi, Ahmed
    Boudaren, Mohamed El Yazid
    Senouci, Mustapha Reda
    Aissani, Amar
    ADVANCES IN COMPUTING SYSTEMS AND APPLICATIONS, 2022, 513 : 27 - 37
  • [47] Blind Adaptive Mask to Improve Intelligibility of Non-Stationary Noisy Speech
    Farias, F.
    Coelho, R.
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1170 - 1174
  • [48] Towards non-stationary model-based noise adaptation for large vocabulary speech recognition
    Kristjansson, T
    Frey, B
    Deng, L
    Acero, A
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 337 - 340
  • [49] Mask Estimation in Non-stationary Noise Environments for Missing Feature Based Robust Speech Recognition
    Badiezadegan, Shirin
    Rose, Richard C.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2062 - 2065
  • [50] Statistical and Neural Network Based Speech Activity Detection in Non-Stationary Acoustic Environments
    Heitkaemper, Jens
    Schmalenstroeer, Joerg
    Haeb-Umbach, Reinhold
    INTERSPEECH 2020, 2020, : 2597 - 2601