Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-target Learning for Noisy Speech Recognition

Cited by: 15
Authors
Mimura, Masato [1]
Sakai, Shinsuke [1]
Kawahara, Tatsuya [1]
Affiliation
[1] Kyoto Univ, Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
Keywords
Speech Recognition; Speech Enhancement; Deep Neural Network (DNN); Denoising Autoencoder (DAE); Deep Neural Networks; Adaptation
DOI
10.21437/Interspeech.2016-388
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Denoising autoencoders (DAEs) have been investigated for enhancing noisy speech before feeding it to the back-end deep neural network (DNN) acoustic model. However, there may be a mismatch between the DAE output and the input expected by the back-end DNN, as well as an inconsistency between the training objective functions of the two networks. In this paper, a joint optimization method for the front-end DAE and the back-end DNN is proposed, based on a multi-target learning scheme. In the first step, the front-end DAE is trained with an additional target of minimizing the errors propagated by the back-end DNN. Then, the unified network of DAE and DNN is fine-tuned for the phone state classification target, with an extra target of input speech enhancement imposed on the DAE part. The proposed method has been evaluated on the CHiME3 ASR task and shown to improve on both the baseline DNN and the simple coupling of DAE with DNN. The method is also effective as a post-filter of a beamformer.
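The abstract does not state the exact loss formulation, so the following is only a minimal sketch of one common way to combine two training targets: a weighted sum of an enhancement error (mean squared error between the DAE output and clean features) and a classification error (cross-entropy on the phone state label). The function names, the interpolation weight `alpha`, and the toy values are all assumptions made for illustration and are not taken from the paper.

```python
import math


def mse(enhanced, clean):
    """Mean squared error between an enhanced frame and its clean reference."""
    return sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)


def cross_entropy(logits, label):
    """Negative log-probability of the correct class under a softmax."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def multi_target_loss(enhanced, clean, logits, label, alpha=0.5):
    """Weighted sum of the two targets; alpha is a hypothetical weight.

    alpha = 1.0 uses only the enhancement target,
    alpha = 0.0 uses only the classification target.
    """
    enhancement = mse(enhanced, clean)
    classification = cross_entropy(logits, label)
    return alpha * enhancement + (1.0 - alpha) * classification


# Toy single-frame example (values are illustrative only).
enhanced = [0.9, 0.1, 0.4]   # DAE output features
clean = [1.0, 0.0, 0.5]      # clean reference features
logits = [2.0, 0.5, -1.0]    # back-end DNN outputs before softmax
print(multi_target_loss(enhanced, clean, logits, label=0, alpha=0.3))
```

Under this reading, the two steps in the abstract would differ in which network's parameters are updated and which target is primary, while the combined objective keeps the same weighted-sum shape.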
Pages: 3803-3807
Number of pages: 5