Signal to Noise Ratio Estimation Based on An Optimal Design of Subband Voice Activity Detection

Cited by: 0
Authors
Morita, Shota [1 ]
Lu, Xugang [2 ]
Unoki, Masashi [1 ]
Affiliations
[1] Japan Adv Inst Sci & Technol, Sch Informat Sci, Tokyo, Japan
[2] Natl Inst Informat & Commun Technol, Universal Commun Res Inst, Tokyo, Japan
Keywords
Signal to noise ratio; voice activity detection; subband processing; decision of threshold;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimating the signal-to-noise ratio (SNR) of speech plays an important role in noise reduction and in predicting speech intelligibility with the speech transmission index (STI). SNR estimation must use voice activity detection (VAD), explicitly or implicitly, to separate speech from non-speech sections. Most studies fix the VAD decision threshold used for speech/non-speech classification during SNR estimation. We argue that fixing the decision threshold for all testing conditions is not optimal for controlling the false acceptance and miss detection rates of speech. In this paper we propose an SNR estimation method that uses a speech/non-speech detection algorithm based on optimizing the trade-off between the false speech acceptance and miss detection rates on a receiver operating characteristic (ROC) curve. Rather than fixing the VAD decision threshold for all SNR conditions, we estimate the optimal decision threshold from an ROC curve for each SNR condition. Thresholds are optimized on subband signals using a large training data set covering various SNR conditions and noise types. After speech and non-speech are detected, SNR is estimated by combining the subband powers of speech and noise from all subbands. We applied the proposed method to the AURORA2J and NOISEX-92 data corpora. The experimental results demonstrated that the proposed method was more accurate than the classical method of estimating SNR. The proposed approach could be used in robust VAD and STI estimates.
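The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cost used to pick the ROC operating point (false acceptance rate plus miss rate), the use of raw subband power as the detection score, and the interpretation of the subband aggregation as a plain sum are all assumptions made for illustration.

```python
import numpy as np

def roc_optimal_threshold(scores, labels):
    """Pick the threshold that minimizes false-acceptance rate + miss rate.

    Assumption: the ROC operating point is chosen by an equal-weight sum of
    the two error rates; the abstract does not state the actual criterion.
    `labels` must contain both speech (1) and non-speech (0) frames.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_thr, best_cost = None, np.inf
    for thr in np.unique(scores):          # each observed score is a candidate
        detected = scores >= thr
        far = np.mean(detected[~labels])   # non-speech frames accepted as speech
        miss = np.mean(~detected[labels])  # speech frames that were missed
        cost = far + miss
        if cost < best_cost:
            best_thr, best_cost = thr, cost
    return best_thr

def estimate_snr_db(subband_power, thresholds, eps=1e-12):
    """Estimate a global SNR in dB from per-subband speech/non-speech decisions.

    subband_power: array of shape (n_bands, n_frames) with frame powers.
    thresholds:    one decision threshold per subband (hypothetical input,
                   e.g. obtained from roc_optimal_threshold on training data).
    """
    subband_power = np.asarray(subband_power, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    speech_mask = subband_power >= thresholds[:, None]
    speech_total, noise_total = 0.0, 0.0
    for band, mask in zip(subband_power, speech_mask):
        # Noise power: mean power of frames classified as non-speech.
        noise = band[~mask].mean() if (~mask).any() else 0.0
        # Noisy-speech power: mean power of frames classified as speech.
        noisy = band[mask].mean() if mask.any() else noise
        # Assumption: speech and noise powers add, so subtract the noise floor.
        speech_total += max(noisy - noise, 0.0)
        noise_total += noise
    return 10.0 * np.log10((speech_total + eps) / (noise_total + eps))
```

In this sketch one threshold would be selected per subband and per training SNR condition by calling `roc_optimal_threshold` on labeled training frames, and `estimate_snr_db` would then be applied to the subband powers of a test signal.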
Pages: 560 / +
Page count: 2
Related papers
50 in total
  • [31] Voice Activity Detection Based on Augmented Statistical Noise Suppression
    Obuchi, Yasunari
    Takeda, Ryu
    Kanda, Naoyuki
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [32] Parameter estimation of the homodyned K distribution based on signal to noise ratio
    Martin-Fernandez, Marcos
    Cardenes, Ruben
    Alberola-Lopez, Carlos
    2007 IEEE ULTRASONICS SYMPOSIUM PROCEEDINGS, VOLS 1-6, 2007, : 158 - 161
  • [33] Primary Signal to Noise Ratio Estimation Based on AIC for UWB Systems
    Fujii, Masahiro
    Watanabe, Yu
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2013, E96A (01) : 264 - 273
  • [34] Voice activity detection based on noise classification and dictionary selection
    Xie, Yining
    Huang, Jinjie
    Zhao, Jing
    He, Yongjun
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2016, 44 (12) : 121 - 126
  • [35] The estimation of the signal-to-noise ratio of a nanoparticle
    Fannin, PC
    Raikher, YL
    JOURNAL OF PHYSICS D-APPLIED PHYSICS, 2001, 34 (11) : 1612 - 1616
  • [36] On Noise Robust Voice Activity Detection
    Dekens, Tomas
    Verhelst, Werner
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2660 - 2663
  • [37] Voice activity detection in nonstationary noise
    Tanyer, SG
    Özer, H
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (04) : 478 - 482
  • [38] A Computation Efficient Voice Activity Detector for Low Signal-to-Noise Ratio in Hearing Aids
    Liu, Fangqi
    Demosthenous, Andreas
    2021 IEEE INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2021, : 524 - 528
  • [39] Noise Robust Front-end Processing with Voice Activity Detection based on Periodic to Aperiodic Component Ratio
    Ishizuka, Kentaro
    Nakatani, Tomohiro
    Fujimoto, Masakiyo
    Miyazaki, Noboru
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 757 - 760
  • [40] Noise estimation using negentropy based voice-activity detector
    Prasad, R
    Saruwatari, H
    Shikano, K
    2004 47TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, CONFERENCE PROCEEDINGS, 2004, : 149 - 152