A Universal VAD Based on Jointly Trained Deep Neural Networks

Cited by: 0
Authors
Wang, Qing [1 ]
Du, Jun [1 ]
Bao, Xiao [1 ]
Wang, Zi-Rui [1 ]
Dai, Li-Rong [1 ]
Lee, Chin-Hui [2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
voice activity detection; deep neural network; feature mapping; joint training; VOICE ACTIVITY DETECTION; SPEECH RECOGNITION; NOISE;
DOI
Not available
CLC Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
In this paper, we propose a joint training approach to voice activity detection (VAD) to address the performance degradation caused by unseen noise conditions. Two key techniques are integrated into this deep neural network (DNN) based VAD framework. First, a regression DNN is trained to map noisy speech features to clean ones, similar to DNN-based speech enhancement. Second, the VAD component, which discriminates speech from noise backgrounds, is also a DNN, trained on a large amount of diversified noisy data synthesized with a wide range of additive noise types. By stacking the classification DNN on top of the enhancement DNN, the integrated network can be jointly trained to perform VAD. The feature-mapping DNN serves as a noise normalization module that explicitly generates "clean" features which are easier for the subsequent classification DNN to recognize correctly. Our experimental results demonstrate that the proposed noise-universal DNN-based VAD algorithm generalizes well to unseen noises, and that the jointly trained DNNs consistently and significantly outperform the conventional classification-based DNN across all noise types and signal-to-noise ratios tested.
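The pipeline described in the abstract can be sketched as follows. This is not the authors' implementation: for brevity, the two stacked modules are single linear layers rather than deep networks, the data is synthetic, and all names (`W_enh`, `w_cls`, etc.) are illustrative. The sketch keeps the two-stage structure: the enhancement module is first trained to map noisy features toward clean ones, and then the classification loss is back-propagated through both modules jointly.

```python
# Minimal joint-training sketch (illustrative, not the paper's code):
# stack a feature-mapping "enhancement" module under a frame classifier
# and back-propagate the classification loss through both modules.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: "clean" frame features; labels mark "speech" frames;
# noisy features stand in for an unseen additive-noise condition.
n, d = 400, 8
clean = rng.normal(size=(n, d))
labels = (clean[:, 0] > 0).astype(float)
noisy = clean + 0.5 * rng.normal(size=(n, d))

W_enh = rng.normal(scale=0.1, size=(d, d))  # enhancement (feature mapping)
w_cls = rng.normal(scale=0.1, size=d)       # classifier on mapped features

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))

# Stage 1: pre-train the mapper alone with an MSE regression loss,
# analogous to training the enhancement DNN on (noisy, clean) pairs.
for _ in range(300):
    mapped = noisy @ W_enh
    W_enh -= 0.1 * (noisy.T @ (mapped - clean) / n)

# Stage 2: joint training. The cross-entropy gradient flows through BOTH
# modules, so the mapper learns features the classifier can separate.
lr = 0.05
for _ in range(500):
    mapped = noisy @ W_enh                  # "cleaned" features
    p = sigmoid(mapped @ w_cls)             # frame-level speech posterior
    err = p - labels                        # dLoss/dlogit for BCE
    grad_w_cls = mapped.T @ err / n
    grad_W_enh = noisy.T @ np.outer(err, w_cls) / n
    w_cls -= lr * grad_w_cls
    W_enh -= lr * grad_W_enh

acc = np.mean((sigmoid((noisy @ W_enh) @ w_cls) > 0.5) == (labels > 0.5))
print(f"training accuracy after joint training: {acc:.2f}")
```

The key point mirrored from the abstract is the second loop: the same gradient updates both `W_enh` and `w_cls`, so the mapping is optimized for the downstream VAD decision rather than for reconstruction alone.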
Pages: 2282 - 2286
Page count: 5
Related Papers
50 records in total
  • [1] A NOVEL PITCH EXTRACTION BASED ON JOINTLY TRAINED DEEP BLSTM RECURRENT NEURAL NETWORKS WITH BOTTLENECK FEATURES
    Liu, Bin
    Tao, Jianhua
    Zhang, Dawei
    Zheng, Yibin
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 336 - 340
  • [2] Jointly Trained Sequential Labeling and Classification by Sparse Attention Neural Networks
    Ma, Mingbo
    Zhao, Kai
    Huang, Liang
    Xiang, Bing
    Zhou, Bowen
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3334 - 3338
  • [3] A Novel Unified Framework for Speech Enhancement and Bandwidth Extension Based on Jointly Trained Neural Networks
    Liu, Bin
    Tao, Jianhua
    Zheng, Yibin
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 11 - 15
  • [4] Neural networks trained by weight permutation are universal approximators
    Cai, Yongqiang
    Chen, Gaohang
    Qiao, Zhonghua
    NEURAL NETWORKS, 2025, 187
  • [5] Parameter inference with deep jointly informed neural networks
    Humbird, Kelli D.
    Peterson, J. Luc
    McClarren, Ryan G.
    STATISTICAL ANALYSIS AND DATA MINING, 2019, 12 (06) : 496 - 504
  • [6] Deep physical neural networks trained with backpropagation
    Wright, Logan G.
    Onodera, Tatsuhiro
    Stein, Martin M.
    Wang, Tianyu
    Schachter, Darren T.
    Hu, Zoey
    McMahon, Peter L.
    NATURE, 2022, 601 (7894) : 549 - 555
  • [7] Ensemble of jointly trained deep neural network-based acoustic models for reverberant speech recognition
    Lee, Moa
    Lee, Jeehye
    Chang, Joon-Hyuk
    DIGITAL SIGNAL PROCESSING, 2019, 85 : 1 - 9
  • [8] Universal Rules for Fooling Deep Neural Networks based Text Classification
    Li, Di
    Vargas, Danilo Vasconcellos
    Kouichi, Sakurai
    2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 2221 - 2228
  • [9] Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Dai, Wenrui
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 14 - 17