Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-target Learning for Noisy Speech Recognition

Cited by: 15
Authors
Mimura, Masato [1]
Sakai, Shinsuke [1]
Kawahara, Tatsuya [1]
Affiliation
[1] Kyoto Univ, Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
Keywords
Speech Recognition; Speech Enhancement; Deep Neural Network (DNN); Denoising Autoencoder (DAE); DEEP NEURAL-NETWORKS; ADAPTATION
DOI
10.21437/Interspeech.2016-388
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Denoising autoencoders (DAEs) have been investigated for enhancing noisy speech before it is fed to the back-end deep neural network (DNN) acoustic model. However, the DAE output may not match the input that the back-end DNN expects, and the training objective functions of the two networks are inconsistent. This paper proposes a joint optimization method for the front-end DAE and the back-end DNN based on a multi-target learning scheme. In the first step, the front-end DAE is trained with an additional target of minimizing the errors propagated from the back-end DNN. Then, the unified network of DAE and DNN is fine-tuned for the phone state classification target, with an extra speech enhancement target imposed on the DAE part. The proposed method was evaluated on the CHiME3 ASR task and shown to improve on both the baseline DNN and the simple coupling of DAE with DNN. The method is also effective as a post-filter of a beamformer.
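The two-target objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the loss formula, so the weighted-sum form, the `weight` parameter, and the function names `mse`, `cross_entropy`, and `multi_target_loss` are all assumptions made for illustration. The sketch only shows how an enhancement error (DAE output versus clean features) and a classification error (back-end DNN output versus a phone state label) might be combined into one scalar for a single frame.

```python
import math


def mse(enhanced, clean):
    """Mean squared error between enhanced and clean feature vectors."""
    return sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one frame, computed in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def multi_target_loss(enhanced, clean, logits, label, weight):
    """Hypothetical combined objective (assumed form, not taken from the paper).

    weight close to 1.0 emphasises speech enhancement (as when training the DAE);
    weight close to 0.0 emphasises phone state classification (as when
    fine-tuning the unified network).
    """
    return weight * mse(enhanced, clean) + (1.0 - weight) * cross_entropy(logits, label)


# Toy frame: perfect enhancement, uninformative two-class logits.
loss = multi_target_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], 0, 0.5)
```

Under this assumed form, the two training steps in the abstract would differ mainly in which term is the primary target and which is the auxiliary one, controlled here by `weight`.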
Pages: 3803-3807
Page count: 5