A Hierarchical Framework Approach for Voice Activity Detection and Speech Enhancement

Cited by: 6
Authors
Zhang, Yan [1, 2]
Tang, Zhen-min [1]
Li, Yan-ping [3]
Luo, Yang [2]
Affiliations
[1] Nanjing Univ Sci & Technol NUST, Coll Comp Sci & Technol, Nanjing 210094, Jiangsu, Peoples R China
[2] Jinling Inst Technol, Coll Informat Technol, Nanjing 211169, Jiangsu, Peoples R China
[3] Nanjing Univ Posts & Telecommun, Coll Telecommun & Informat Engn, Nanjing 210046, Jiangsu, Peoples R China
DOI
10.1155/2014/723643
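Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract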
Accurate and effective voice activity detection (VAD) is a fundamental step for robust speech or speaker recognition. In this study, we propose a hierarchical framework for VAD and speech enhancement. In the speech enhancement block, a modified Wiener filter (MWF) is used for noise reduction. In the feature selection and voting block, several features, chosen for their reliability and discriminative power, are combined in a voting paradigm. The proposed approach is evaluated against other VAD techniques on two well-known databases, TIMIT and NOISEX-92. Experimental results show that the proposed method performs well under a variety of noisy conditions.
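The abstract does not say which features or voting rule the paper uses, so the following is only a minimal sketch of the general idea of feature voting for VAD, not the authors' method. It assumes three illustrative per-frame features (log energy, spectral flatness, normalized spectral entropy), a 16 kHz signal framed at 25 ms with a 10 ms hop, a noise-only lead-in of a few frames, and hand-picked threshold margins. The function names (frame_signal, frame_features, vote_vad) and all parameters are hypothetical, and the speech enhancement (MWF) stage is omitted.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def frame_features(x, frame_len=400, hop=160):
    """Return (log_energy, flatness, entropy), one value per frame."""
    frames = frame_signal(x, frame_len, hop)
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    power += 1e-12  # avoid log(0) on silent frames

    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-12)
    # Geometric over arithmetic mean of the power spectrum:
    # close to 0 for peaky (tonal) spectra, larger for noise-like ones.
    flatness = np.exp(np.mean(np.log(power), axis=1)) / np.mean(power, axis=1)
    # Spectral entropy normalized to [0, 1]; lower when power is concentrated.
    p = power / np.sum(power, axis=1, keepdims=True)
    entropy = -np.sum(p * np.log(p), axis=1) / np.log(power.shape[1])
    return log_energy, flatness, entropy


def vote_vad(x, noise_frames=10, min_votes=2, frame_len=400, hop=160):
    """Mark a frame as speech when at least `min_votes` features agree.

    Assumption: the first `noise_frames` frames contain noise only and
    serve as the reference for every threshold. The margins below are
    illustrative values, not taken from the paper.
    """
    log_energy, flatness, entropy = frame_features(x, frame_len, hop)
    e0 = log_energy[:noise_frames].mean()
    f0 = flatness[:noise_frames].mean()
    h0 = entropy[:noise_frames].mean()
    votes = (
        (log_energy > e0 + 1.0).astype(int)   # clearly louder than noise
        + (flatness < 0.5 * f0).astype(int)   # clearly peakier than noise
        + (entropy < h0 - 0.2).astype(int)    # clearly more concentrated
    )
    return votes >= min_votes


# Synthetic check: 1 s of white noise, then 1 s of a harmonic tone in noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
x = 0.01 * rng.standard_normal(2 * fs)
x[fs:] += 0.5 * sum(np.sin(2 * np.pi * f * t) for f in (200, 400, 600))
decisions = vote_vad(x)
```

Requiring agreement between features means a single unreliable feature cannot flip a decision on its own, which is the usual motivation for a voting scheme. On real recordings, the thresholds would need tuning and the noise reference would need updating over time.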
Pages: 8