Speech detection in non-stationary noise based on the 1/f process

Cited by: 0
Authors
Fan Wang
Fang Zheng
Wenhu Wu
Affiliation
[1] Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University
Keywords
speech detection; 1/f process; wavelet; robust speech recognition
DOI
Not available
Abstract
This paper proposes a robust active speech detection method, based on the 1/f process, for signals in non-stationary noisy environments. The Gaussian 1/f process, a fractal-based mathematical model for statistically self-similar random processes, is used to model both the speech and the background noise. An optimal Bayesian two-class classifier discriminates between them using their 1/f wavelet coefficients, which have Karhunen-Loeve-type properties. Multiple templates are trained for the speech signal, and the background-noise parameters are adapted dynamically at runtime, so that the model tracks variation in both the speech and the noise. In the experiments, a 10-minute speech recording with different types of noise, at levels ranging from 20 dB to 5 dB, is tested with the new detection method. Detection accuracy exceeds 90% when the average SNR is about 10 dB.
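The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Haar wavelet, treats the detail coefficients as independent zero-mean Gaussians whose variance at scale m (m = 1 is the finest) is sigma2 * 2**(gamma * m), and scores a frame against several speech templates and one noise model. The function names, the parameter values, and the use of a random walk as a stand-in for a speech-like 1/f signal are all hypothetical choices made for illustration. Runtime adaptation of the noise parameters is not shown.

```python
import math
import random


def haar_details(frame, levels):
    """Return Haar detail coefficients per scale, finest scale first."""
    approx = list(frame)
    details = []
    for _ in range(levels):
        half = len(approx) // 2
        pairs = [(approx[2 * i], approx[2 * i + 1]) for i in range(half)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def log_likelihood(details, sigma2, gamma):
    """Gaussian log-likelihood under the assumed 1/f variance law."""
    total = 0.0
    for m, coeffs in enumerate(details, start=1):
        var = sigma2 * 2.0 ** (gamma * m)  # assumed variance at scale m
        for c in coeffs:
            total -= 0.5 * (math.log(2 * math.pi * var) + c * c / var)
    return total


def is_speech(frame, speech_templates, noise_params, levels=4,
              log_prior_ratio=0.0):
    """Two-class Bayes decision: best speech template vs. noise model."""
    details = haar_details(frame, levels)
    speech_ll = max(log_likelihood(details, s2, g)
                    for s2, g in speech_templates)
    noise_ll = log_likelihood(details, *noise_params)
    return speech_ll - noise_ll + log_prior_ratio > 0


# Hypothetical (sigma2, gamma) parameters; real values would be trained.
SPEECH_TEMPLATES = [(0.2, 2.0), (0.5, 1.5)]
NOISE_PARAMS = (1.0, 0.0)  # gamma = 0 corresponds to white noise

random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(256)]
walk = []
level = 0.0
for step in white:  # random walk: a crude self-similar test signal
    level += step
    walk.append(level)

print(is_speech(white, SPEECH_TEMPLATES, NOISE_PARAMS))
print(is_speech(walk, SPEECH_TEMPLATES, NOISE_PARAMS))
```

Taking the maximum over templates is one simple way to combine multiple speech models. A weighted mixture of the template likelihoods would be an equally plausible reading of the abstract.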
Pages: 83-89
Page count: 6