Voicing detection based on adaptive aperiodicity thresholding for speech enhancement in non-stationary noise

Cited by: 4

Authors:

Cabanas-Molero, Pablo ^{[1]}

Martinez-Munoz, Damian ^{[1]}

Vera-Candeas, Pedro ^{[1]}

Ruiz-Reyes, Nicolas ^{[1]}

Jose Rodriguez-Serrano, Francisco ^{[1]}

Affiliations:

[1] Univ Jaen, Polytech Sch, Dept Telecommun Engn, Jaen 23700, Spain

Source:

IET SIGNAL PROCESSING | 2014 / Vol. 8 / Issue 02

Keywords:

hearing aids; speech enhancement; signal-to-noise ratios; voicing classifier; speech sentences database; fluctuating noise; signal-adaptive decision; nonstationary noise; adaptive aperiodicity thresholding; voicing detection; FUNDAMENTAL-FREQUENCY ESTIMATION; SPECTRAL SUBTRACTION; ENVIRONMENTS; ESTIMATOR;

DOI:

10.1049/iet-spr.2012.0224

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In this study, the authors present a novel voicing detection algorithm which employs the well-known aperiodicity measure to detect voiced speech in signals contaminated with non-stationary noise. The method computes a signal-adaptive decision threshold which takes into account the current noise level, enabling voicing detection by direct comparison with the extracted aperiodicity. This adaptive threshold is updated at each frame by making a simple estimate of the current noise power, and thus is adapted to fluctuating noise conditions. Once the aperiodicity is computed, the method only requires a small number of operations, and enables its implementation in challenging devices (such as hearing aids) if an efficient approximation of the difference function is employed to extract the aperiodicity. Evaluation over a database of speech sentences degraded by several types of noise reveals that the proposed voicing classifier is robust against different noises and signal-to-noise ratios. In addition, to evaluate the applicability of the method for speech enhancement, a simple F0-based speech enhancement algorithm integrating the proposed classifier is implemented. The system is shown to achieve competitive results, in terms of objective measures, when compared with other well-known speech enhancement approaches.
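
The idea in the abstract (compare a per-frame aperiodicity value against a threshold that follows the noise level) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the aperiodicity is the minimum of the YIN-style cumulative mean normalized difference function, and the threshold rule in `adaptive_threshold` (including the `base` and `gain` values) is a hypothetical placeholder, since the abstract does not give the formula. The noise power is passed in by the caller because the abstract does not specify the estimator.

```python
import numpy as np


def aperiodicity(frame, tau_min, tau_max):
    """Minimum of the cumulative mean normalized difference function
    (YIN-style) over lags tau_min..tau_max. Assumed definition."""
    frame = np.asarray(frame, dtype=float)
    w = len(frame) - tau_max  # integration window shared by all lags
    d = np.empty(tau_max)
    for tau in range(1, tau_max + 1):
        diff = frame[:w] - frame[tau:tau + w]
        d[tau - 1] = np.dot(diff, diff)  # difference function d(tau)
    taus = np.arange(1, tau_max + 1)
    # d'(tau) = d(tau) / ((1 / tau) * sum_{j=1..tau} d(j))
    cmnd = d * taus / np.maximum(np.cumsum(d), 1e-12)
    return float(np.min(cmnd[tau_min - 1:]))


def adaptive_threshold(noise_power, frame_power, base=0.1, gain=0.4):
    """Hypothetical rule: raise the threshold as the estimated noise
    share of the frame power grows. Not the formula from the paper."""
    ratio = min(1.0, noise_power / max(frame_power, 1e-12))
    return base + gain * ratio


def is_voiced(frame, noise_power, tau_min=20, tau_max=100):
    """Declare a frame voiced when its aperiodicity is below the
    noise-dependent threshold. Silent frames are treated as unvoiced."""
    frame = np.asarray(frame, dtype=float)
    frame_power = float(np.mean(frame ** 2))
    if frame_power < 1e-12:
        return False
    threshold = adaptive_threshold(noise_power, frame_power)
    return aperiodicity(frame, tau_min, tau_max) < threshold
```

With a sampling rate of 8 kHz, the default lags `tau_min=20` and `tau_max=100` correspond to a search range of roughly 80 to 400 Hz. A caller would supply `noise_power` from its own per-frame noise tracker.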


Pages: 119-130

Page count: 12
