A PHONEME-BASED PRE-TRAINING APPROACH FOR DEEP NEURAL NETWORK WITH APPLICATION TO SPEECH ENHANCEMENT

Citations: 0
Authors
Chazan, Shlomo E. [1]
Gannot, Sharon [1]
Goldberger, Jacob [1]
Affiliations
[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel
Keywords
neural network; phoneme; deep learning; NOISE; RECOGNITION; BINARY
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
In this study, we present a new phoneme-based deep neural network (DNN) framework for single-microphone speech enhancement. While most speech enhancement algorithms overlook the phoneme structure of the speech signal, our proposed framework comprises a set of phoneme-specific DNNs (pDNNs), one for each phoneme, together with an additional phoneme-classification DNN (cDNN). The cDNN is responsible for determining the posterior probability that a specific phoneme was uttered. Concurrently, each of the pDNNs estimates a phoneme-specific speech presence probability (pSPP). The speech presence probability (SPP) is then calculated as a weighted average of the pSPPs, with the weights determined by the posterior phoneme probabilities. A soft spectral attenuation, based on the SPP, is then applied to enhance the noisy speech signal. We further propose a compound training procedure, where each pDNN is first pre-trained using the phoneme labeling and the cDNN is trained to classify phonemes. Since these labels are unavailable in the test phase, the entire network is then trained using the noisy utterance, with the cDNN providing phoneme classification. A series of experiments with different noise types verifies the applicability of the new algorithm to the task of speech enhancement. Moreover, the proposed scheme outperforms other schemes that either do not consider the phoneme structure or use a simpler training methodology.
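The combination step described in the abstract can be illustrated with a minimal sketch. It assumes the cDNN posteriors and the per-phoneme pSPP vectors for a single frame are already available as plain lists. The function names `combine_spp` and `soft_attenuate`, the `floor` parameter, and the linear gain rule are hypothetical choices made for illustration; the abstract states only that a soft attenuation based on the SPP is applied, not its exact form.

```python
from typing import List, Sequence


def combine_spp(posteriors: Sequence[float],
                pspp: Sequence[Sequence[float]]) -> List[float]:
    """Posterior-weighted average of phoneme-specific SPPs for one frame.

    posteriors[i] : cDNN posterior of phoneme i (non-negative, sums to 1)
    pspp[i][k]    : pSPP estimated by pDNN i for frequency bin k
    """
    if len(posteriors) != len(pspp):
        raise ValueError("need exactly one pSPP vector per phoneme")
    num_bins = len(pspp[0])
    spp = [0.0] * num_bins
    for weight, bins in zip(posteriors, pspp):
        if len(bins) != num_bins:
            raise ValueError("all pSPP vectors must have the same length")
        for k in range(num_bins):
            spp[k] += weight * bins[k]
    return spp


def soft_attenuate(noisy_mag: Sequence[float],
                   spp: Sequence[float],
                   floor: float = 0.1) -> List[float]:
    """Assumed soft gain (not taken from the paper): each bin is scaled by
    a gain that moves linearly from `floor` (SPP = 0) to 1 (SPP = 1)."""
    return [m * (floor + (1.0 - floor) * p) for m, p in zip(noisy_mag, spp)]


# Toy frame: three phonemes, two frequency bins (made-up numbers).
posteriors = [0.7, 0.2, 0.1]
pspp = [[0.9, 0.1],
        [0.5, 0.5],
        [0.0, 1.0]]
spp = combine_spp(posteriors, pspp)
enhanced = soft_attenuate([2.0, 2.0], spp, floor=0.1)
```

Because the weights are posteriors that sum to one, the combined SPP stays within the range spanned by the individual pSPPs in each bin, so a confident phoneme classification lets the matching pDNN dominate the estimate.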
Pages: 5