A PHONEME-BASED PRE-TRAINING APPROACH FOR DEEP NEURAL NETWORK WITH APPLICATION TO SPEECH ENHANCEMENT

Cited by: 0

Authors:

Chazan, Shlomo E. [1]

Gannot, Sharon [1]

Goldberger, Jacob [1]

Affiliations:

[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel

Source:

2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC) | 2016

Keywords:

neural network; phoneme; deep learning; NOISE; RECOGNITION; BINARY;

DOI:

Not available

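Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes: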

0808 ; 0809 ;

Abstract:

In this study, we present a new phoneme-based deep neural network (DNN) framework for single-microphone speech enhancement. While most speech enhancement algorithms overlook the phoneme structure of the speech signal, our proposed framework comprises a set of phoneme-specific DNNs (pDNNs), one for each phoneme, together with an additional phoneme-classification DNN (cDNN). The cDNN determines the posterior probability that a specific phoneme was uttered. Concurrently, each pDNN estimates a phoneme-specific speech presence probability (pSPP). The speech presence probability (SPP) is then calculated as a weighted average of the pSPPs, with the weights given by the posterior phoneme probabilities. A soft spectral attenuation, based on the SPP, is then applied to enhance the noisy speech signal. We further propose a compound training procedure, in which each pDNN is first pre-trained using the phoneme labeling and the cDNN is trained to classify phonemes. Since these labels are unavailable in the test phase, the entire network is then trained using the noisy utterance, with the cDNN providing phoneme classification. A series of experiments with different noise types verifies the applicability of the new algorithm to the task of speech enhancement. Moreover, the proposed scheme outperforms other schemes that either do not consider the phoneme structure or use a simpler training methodology.
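
The combination step described in the abstract can be illustrated with a minimal sketch. It assumes the cDNN posteriors and the per-bin pSPP outputs are already available as plain lists; the function names, the `floor` parameter, and the linear gain rule are hypothetical choices made for illustration only, since the abstract does not specify the exact attenuation formula.

```python
from typing import List, Sequence


def combine_spp(phoneme_posteriors: Sequence[float],
                pspp: Sequence[Sequence[float]]) -> List[float]:
    """Posterior-weighted average of phoneme-specific SPPs, per frequency bin.

    phoneme_posteriors[j]: cDNN posterior for phoneme j (assumed non-negative).
    pspp[j][k]: SPP estimated by the j-th pDNN for frequency bin k.
    """
    total = sum(phoneme_posteriors)  # normalize in case weights do not sum to 1
    n_bins = len(pspp[0])
    return [
        sum(w * row[k] for w, row in zip(phoneme_posteriors, pspp)) / total
        for k in range(n_bins)
    ]


def soft_attenuate(noisy_mag: Sequence[float],
                   spp: Sequence[float],
                   floor: float = 0.1) -> List[float]:
    """Hypothetical soft gain: interpolate between `floor` and 1 using the SPP.

    This linear rule is an assumption, not the rule used in the paper.
    """
    return [m * (floor + (1.0 - floor) * p) for m, p in zip(noisy_mag, spp)]


# Two phonemes, two frequency bins.
posteriors = [0.5, 0.5]
pspp = [[1.0, 0.0],
        [0.0, 0.0]]
spp = combine_spp(posteriors, pspp)
enhanced = soft_attenuate([2.0, 2.0], spp, floor=0.1)
```

In this toy case the first bin receives an SPP of 0.5 because only one of the two equally likely phonemes reports speech there, so it is attenuated less than the second bin, where no phoneme model detects speech.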


Pages: 5
