A Universal VAD Based on Jointly Trained Deep Neural Networks

Cited by: 0
Authors
Wang, Qing [1 ]
Du, Jun [1 ]
Bao, Xiao [1 ]
Wang, Zi-Rui [1 ]
Dai, Li-Rong [1 ]
Lee, Chin-Hui [2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
voice activity detection; deep neural network; feature mapping; joint training; speech recognition; noise
DOI: not available
Chinese Library Classification: O42 [Acoustics]
Discipline classification codes: 070206; 082403
Abstract
In this paper, we propose a joint training approach to voice activity detection (VAD) to address the performance degradation caused by unseen noise conditions. Two key techniques are integrated into this deep neural network (DNN) based VAD framework. First, a regression DNN is trained to map noisy speech features to clean ones, similar to DNN-based speech enhancement. Second, the VAD component, which discriminates speech from noise backgrounds, is also a DNN, trained on a large amount of diversified noisy data synthesized with a wide range of additive noise types. By stacking the classification DNN on top of the enhancement DNN, the integrated DNN can be jointly trained to perform VAD. The feature-mapping DNN serves as a noise normalization module that explicitly generates "clean" features, which are easier for the subsequent classification DNN to recognize correctly. Our experimental results demonstrate that the proposed noise-universal DNN-based VAD algorithm generalizes well to unseen noises, and that the jointly trained DNNs consistently and significantly outperform the conventional classification-based DNN for all noise types and signal-to-noise ratios tested.
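The abstract describes a classification DNN stacked on top of a feature-mapping (enhancement) DNN, with the whole stack trained jointly for VAD. Below is a minimal PyTorch-style sketch of that stacking and one joint update step; the layer sizes, the 257-dimensional log-power-spectrum features, the auxiliary MSE term, and its weight alpha are illustrative assumptions rather than the paper's reported configuration.

# Minimal sketch of the stacked "enhancement DNN + classification DNN" joint
# training idea described in the abstract. Layer sizes, the 257-dim feature
# assumption, and the auxiliary loss weight are illustrative, not the paper's
# exact configuration.
import torch
import torch.nn as nn

FEAT_DIM = 257  # assumed per-frame feature dimension (e.g., log-power spectrum)

class EnhancementDNN(nn.Module):
    """Regression DNN: maps noisy features to estimates of clean features."""
    def __init__(self, dim=FEAT_DIM, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )
    def forward(self, x):
        return self.net(x)

class ClassificationDNN(nn.Module):
    """Classification DNN: frame-level speech / non-speech decision."""
    def __init__(self, dim=FEAT_DIM, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # logits for [non-speech, speech]
        )
    def forward(self, x):
        return self.net(x)

class JointVAD(nn.Module):
    """Classification DNN stacked on top of the enhancement DNN."""
    def __init__(self):
        super().__init__()
        self.enhancer = EnhancementDNN()
        self.classifier = ClassificationDNN()
    def forward(self, noisy):
        enhanced = self.enhancer(noisy)     # "clean" feature estimate
        logits = self.classifier(enhanced)  # VAD decision from enhanced features
        return logits, enhanced

def joint_training_step(model, optimizer, noisy, clean, labels, alpha=0.1):
    """One joint update: VAD cross-entropy plus an (assumed) auxiliary
    feature-mapping MSE term weighted by alpha."""
    logits, enhanced = model(noisy)
    loss = nn.functional.cross_entropy(logits, labels) \
           + alpha * nn.functional.mse_loss(enhanced, clean)
    optimizer.zero_grad()
    loss.backward()                         # gradients flow through both DNNs
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = JointVAD()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    noisy = torch.randn(32, FEAT_DIM)       # toy batch of noisy frames
    clean = torch.randn(32, FEAT_DIM)       # corresponding clean frames
    labels = torch.randint(0, 2, (32,))     # frame-level speech/non-speech labels
    print(joint_training_step(model, opt, noisy, clean, labels))

In practice each DNN would likely be pretrained separately (the enhancer on noisy-to-clean feature pairs, the classifier on enhanced features) before the joint fine-tuning shown above; whether an auxiliary enhancement loss is kept during joint training is an assumption here, not something stated in the abstract.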
Pages: 2282-2286
Page count: 5