A Maximum Likelihood Approach to Deep Neural Network Based Speech Dereverberation

Cited by: 0
Authors
Wang, Xin [1]
Du, Jun [1]
Wang, Yannan [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
SPEAKER IDENTIFICATION; ALGORITHM;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Recently, deep neural network (DNN) based speech dereverberation has become popular, with the standard minimum mean squared error (MMSE) criterion used to learn the parameters. In this study, a probabilistic learning framework for estimating the DNN parameters for single-channel speech dereverberation is proposed. First, statistical analysis shows that the prediction error vector at the DNN output closely follows a unimodal density for each log-power spectral component. Accordingly, we present a maximum likelihood (ML) approach to DNN parameter learning by characterizing the prediction error vector as a multivariate Gaussian density with a zero mean vector and an unknown covariance matrix. Our experiments demonstrate that the proposed ML-based DNN learning achieves better generalization capability than MMSE-based DNN learning, and that all objective measures of speech quality and intelligibility are consistently improved.
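The contrast between the MMSE criterion and the Gaussian ML criterion described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' training procedure: the function names (`mmse_loss`, `ml_covariance`, `gaussian_nll`) are hypothetical, the covariance is estimated in closed form from a batch of prediction errors, and the small `eps` regularizer is added here purely for numerical stability.

```python
import numpy as np


def mmse_loss(pred, target):
    """Mean (over frames) of the squared prediction error norm."""
    err = pred - target
    return float(np.mean(np.sum(err ** 2, axis=1)))


def ml_covariance(err, eps=1e-6):
    """Closed-form ML covariance estimate for zero-mean errors.

    Assumption: errors are rows of an (n_frames, n_dims) array and the
    mean is fixed at zero, so the estimate is E^T E / n. The eps * I
    term is an illustrative regularizer, not part of the source.
    """
    n, d = err.shape
    return err.T @ err / n + eps * np.eye(d)


def gaussian_nll(err, cov):
    """Average negative log-likelihood of errors under N(0, cov)."""
    n, d = err.shape
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    # Mahalanobis term err_i^T cov^{-1} err_i for every frame i.
    solved = np.linalg.solve(cov, err.T)      # shape (d, n)
    maha = np.sum(err.T * solved, axis=0)     # shape (n,)
    return float(0.5 * np.mean(maha) + 0.5 * logdet
                 + 0.5 * d * np.log(2.0 * np.pi))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(200, 4))        # stand-in for clean features
    pred = target + 0.3 * rng.normal(size=(200, 4))
    err = pred - target
    print("MMSE loss:", mmse_loss(pred, target))
    print("NLL, identity covariance:", gaussian_nll(err, np.eye(4)))
    print("NLL, ML covariance:", gaussian_nll(err, ml_covariance(err)))
```

With the covariance fixed to the identity matrix, the negative log-likelihood equals half the MMSE loss plus a constant, so the two criteria coincide up to scaling. Letting the covariance be estimated from the errors is what distinguishes the ML formulation in this sketch.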
Pages: 155-158
Page count: 4
Related papers
50 in total
  • [1] A Maximum Likelihood Approach to Masking-based Speech Enhancement Using Deep Neural Network
    Wang, Qing
    Du, Jun
    Chai, Li
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 295 - 299
  • [2] A context aware-based deep neural network approach for simultaneous speech denoising and dereverberation
    Sidheswar Routray
    Qirong Mao
    [J]. Neural Computing and Applications, 2022, 34 : 9831 - 9845
  • [3] A context aware-based deep neural network approach for simultaneous speech denoising and dereverberation
    Routray, Sidheswar
    Mao, Qirong
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (12): : 9831 - 9845
  • [4] Neural-Network Supervised Maximum Likelihood-based on-line Dereverberation
    Mosayyebpour, Saeed
    Nesta, Francesco
    [J]. 2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1552 - 1556
  • [5] A Maximum Likelihood Approach to Deep Neural Network Based Nonlinear Spectral Mapping for Single-Channel Speech Separation
    Wang, Yannan
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1178 - 1182
  • [6] MAXIMUM-LIKELIHOOD-BASED CEPSTRAL INVERSE FILTERING FOR BLIND SPEECH DEREVERBERATION
    Kumar, Kshitiz
    Stern, Richard M.
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4282 - 4285
  • [7] A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks
    Wu, Bo
    Li, Kehuang
    Yang, Minglei
    Lee, Chin-Hui
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (01) : 102 - 111
  • [8] A Late Reverberation Power Spectral Density Aware Approach to Speech Dereverberation Based on Deep Neural Networks
    Qi, Yuanlei
    Yang, Feiran
    Yang, Jun
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1700 - 1703
  • [9] Optimization of Dereverberation Parameters based on Likelihood of Speech Recognizer
    Gomez, Randy
    Kawahara, Tatsuya
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1259 - 1262
  • [10] Speech Dereverberation Based on Maximum-Likelihood Estimation With Time-Varying Gaussian Source Model
    Nakatani, Tomohiro
    Juang, Biing-Hwang
    Yoshioka, Takuya
    Kinoshita, Keisuke
    Delcroix, Marc
    Miyoshi, Masato
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (08): : 1512 - 1527