A PHONEME-BASED PRE-TRAINING APPROACH FOR DEEP NEURAL NETWORK WITH APPLICATION TO SPEECH ENHANCEMENT

Authors
Chazan, Shlomo E. [1]
Gannot, Sharon [1]
Goldberger, Jacob [1]
Affiliation
[1] Bar Ilan Univ, Fac Engn, IL-5290002 Ramat Gan, Israel
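Keywords
neural network; phoneme; deep learning; NOISE; RECOGNITION; BINARY
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract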
In this study, we present a new phoneme-based deep neural network (DNN) framework for single-microphone speech enhancement. While most speech enhancement algorithms overlook the phoneme structure of the speech signal, our proposed framework comprises a set of phoneme-specific DNNs (pDNNs), one for each phoneme, together with an additional phoneme-classification DNN (cDNN). The cDNN determines the posterior probability that a specific phoneme was uttered. Concurrently, each pDNN estimates a phoneme-specific speech presence probability (pSPP). The speech presence probability (SPP) is then calculated as a weighted average of the pSPPs, with the weights given by the posterior phoneme probabilities. A soft spectral attenuation, based on the SPP, is then applied to enhance the noisy speech signal. We further propose a compound training procedure, in which each pDNN is first pre-trained using the phoneme labels and the cDNN is trained to classify phonemes. Since these labels are unavailable in the test phase, the entire network is then trained using the noisy utterances, with the cDNN providing the phoneme classification. A series of experiments with different noise types verifies the applicability of the new algorithm to the task of speech enhancement. Moreover, the proposed scheme outperforms other schemes that either do not consider the phoneme structure or use a simpler training methodology.
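The combination step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: random arrays stand in for the cDNN and pDNN outputs, the sizes (39 phonemes, 129 frequency bins) are hypothetical, and the gain rule with a `floor` parameter is an assumed form of soft attenuation, since the abstract does not specify the exact rule.

```python
import numpy as np


def combine_spp(phoneme_posteriors, phoneme_spps):
    """Weighted average of phoneme-specific SPPs.

    phoneme_posteriors: (frames, phonemes), each row sums to 1 (cDNN output).
    phoneme_spps:       (frames, phonemes, bins), values in [0, 1] (pDNN outputs).
    Returns the SPP with shape (frames, bins).
    """
    return np.einsum("tp,tpf->tf", phoneme_posteriors, phoneme_spps)


def soft_attenuation(noisy_magnitude, spp, floor=0.1):
    """Assumed gain rule: interpolate between `floor` (noise only) and 1 (speech)."""
    gain = floor + (1.0 - floor) * spp
    return gain * noisy_magnitude


# Hypothetical sizes and random stand-ins for the network outputs.
rng = np.random.default_rng(0)
frames, phonemes, bins = 4, 39, 129
logits = rng.normal(size=(frames, phonemes))
posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
pspp = rng.uniform(size=(frames, phonemes, bins))
noisy = np.abs(rng.normal(size=(frames, bins)))

spp = combine_spp(posteriors, pspp)
enhanced = soft_attenuation(noisy, spp)
```

When the classifier is certain about one phoneme (a one-hot posterior), the combined SPP reduces to that phoneme's pSPP; softer posteriors blend the estimates of several pDNNs.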
Pages: 5