Signal to Noise Ratio Estimation Based on An Optimal Design of Subband Voice Activity Detection

Authors
Morita, Shota [1]
Lu, Xugang [2]
Unoki, Masashi [1]
Affiliations
[1] Japan Adv Inst Sci & Technol, Sch Informat Sci, Tokyo, Japan
[2] Natl Inst Informat & Commun Technol, Universal Commun Res Inst, Tokyo, Japan
Keywords
Signal to noise ratio; voice activity detection; subband processing; decision of threshold;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimates of the signal-to-noise ratio (SNR) of speech play an important role in noise reduction and in predicting speech intelligibility with the speech transmission index (STI). SNR estimation must use voice activity detection (VAD), explicitly or implicitly, to separate speech from non-speech sections. Most studies fix the VAD decision threshold used for speech/non-speech classification during SNR estimation. We argue that fixing the decision threshold for all testing conditions is not optimal for controlling the false acceptance and miss detection rates of speech. In this paper we propose an SNR estimation method built on a speech/non-speech detection algorithm that optimizes the trade-off between the false speech acceptance rate and the miss detection rate on a receiver operating characteristic (ROC) curve. Rather than fixing the VAD decision threshold for all SNR conditions, we estimate the optimal decision threshold from an ROC curve for each SNR condition. Thresholds are optimized on subband signals using a large training data set that covers various SNR conditions and noise types. After speech and non-speech are detected, the SNR is estimated by summing the speech and noise powers over all subbands. We evaluated the proposed method on the AURORA2J and NOISEX-92 data corpora. The experimental results showed that the proposed method was more accurate than the classical method of estimating SNR. The proposed approach could be used in robust VAD and STI estimation.
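The two steps described in the abstract, choosing a decision threshold from an ROC curve and then combining subband powers into one SNR figure, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not state the ROC operating-point criterion, so the sketch maximizes the true positive rate minus the false positive rate (Youden's J); it uses raw frame power as the detection score; it reads the aggregation of subband powers as a plain sum; and the names `pick_threshold` and `estimate_snr_db` are hypothetical. Selecting a separate threshold per SNR condition is omitted.

```python
import math


def pick_threshold(scores, labels):
    """Pick the threshold that maximizes TPR - FPR on the ROC curve.

    scores: per-frame detection scores (higher means more speech-like).
    labels: ground-truth flags, True for speech frames.
    Assumption: Youden's J is used as the operating-point criterion.
    """
    n_speech = sum(1 for flag in labels if flag)
    n_noise = len(labels) - n_speech
    if n_speech == 0 or n_noise == 0:
        raise ValueError("need both speech and non-speech training frames")
    best_thr, best_j = None, -math.inf
    for thr in sorted(set(scores)):
        # A frame is accepted as speech when its score reaches the threshold.
        tp = sum(1 for s, flag in zip(scores, labels) if flag and s >= thr)
        fp = sum(1 for s, flag in zip(scores, labels) if not flag and s >= thr)
        j = tp / n_speech - fp / n_noise
        if j > best_j:
            best_thr, best_j = thr, j
    return best_thr


def estimate_snr_db(band_powers, thresholds):
    """Estimate a global SNR in dB from per-band frame powers.

    band_powers[b][t] is the power of frame t in subband b.
    thresholds[b] is the decision threshold for subband b.
    """
    speech_total = 0.0
    noise_total = 0.0
    for powers, thr in zip(band_powers, thresholds):
        speech = [p for p in powers if p >= thr]
        noise = [p for p in powers if p < thr]
        if not speech or not noise:
            continue  # this band cannot be split into speech and noise
        noise_power = sum(noise) / len(noise)
        # Speech frames contain speech plus noise, so subtract the noise level.
        speech_power = max(sum(speech) / len(speech) - noise_power, 0.0)
        speech_total += speech_power
        noise_total += noise_power
    if speech_total <= 0.0 or noise_total <= 0.0:
        raise ValueError("could not separate speech and noise power")
    return 10.0 * math.log10(speech_total / noise_total)


# Toy data: two subbands, four frames each, the last two frames are speech.
labels = [False, False, True, True]
bands = [[1.0, 1.0, 11.0, 11.0], [2.0, 2.0, 22.0, 22.0]]
thresholds = [pick_threshold(band, labels) for band in bands]
snr_db = estimate_snr_db(bands, thresholds)
```

In this toy case the selected thresholds separate the two speech frames from the two noise frames in each band. A real system would train the thresholds on labelled noisy speech and would likely use a smoother detection score than raw power.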
Pages: 560 / +
Page count: 2