SPEAKER ADAPTIVE TRAINING FOR DEEP NEURAL NETWORKS EMBEDDING LINEAR TRANSFORMATION NETWORKS

Cited by: 0
Authors
Ochiai, Tsubasa [1, 2]
Matsuda, Shigeki [2 ]
Watanabe, Hideyuki [1 ]
Lu, Xugang [1 ]
Hori, Chiori [1 ]
Katagiri, Shigeru [2 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Kyoto, Japan
[2] Doshisha Univ, Grad Sch Engn, Kyoto, Japan
Keywords
Speaker Adaptive Training; Deep Neural Network; Linear Transformation Network
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recently, a novel speaker adaptation method was proposed that applies the Speaker Adaptive Training (SAT) concept to a speech recognizer consisting of a Deep Neural Network (DNN) and a Hidden Markov Model (HMM), and its utility was demonstrated. This method implements the SAT scheme by allocating one Speaker Dependent (SD) module per training speaker to one of the intermediate layers of the front-end DNN. It then jointly optimizes the SD modules and the rest of the network, which is shared by all speakers. In this paper, we propose an improved version of this SAT-based adaptation scheme for a DNN-HMM recognizer. Our new training scheme adopts a Linear Transformation Network (LTN) as the SD module. Using an LTN leads to more appropriate regularization in both the SAT and adaptation stages, because it replaces the empirically selected network anchorage used for regularization in the preceding SAT-DNN-HMM with a SAT-optimized anchorage. We evaluate the effectiveness of our proposed method on TED Talks corpus data. Our experimental results show that a speaker-adapted recognizer using our method achieves a significant word error rate reduction of 9.2 points from a baseline speaker-independent (SI) DNN recognizer and also consistently outperforms speaker-adapted recognizers derived from the preceding SAT-based DNN-HMM.
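The idea of an SD module realized as a linear transformation inserted into an intermediate layer can be illustrated with a small NumPy sketch. This is not the authors' implementation. The two-layer network, the names `LTN`, `forward`, and `anchor_penalty`, the identity initialization, and the squared-distance (L2) penalty toward an anchor are all assumptions made for illustration; the abstract does not specify the layer sizes, the penalty form, or how the SAT-optimized anchorage is obtained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(z):
    # Subtract the row-wise maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


class LTN:
    """Hypothetical speaker-dependent linear transformation: h -> A h + b."""

    def __init__(self, dim):
        # Assumption: start from the identity map, which leaves activations unchanged.
        self.A = np.eye(dim)
        self.b = np.zeros(dim)

    def __call__(self, h):
        return h @ self.A.T + self.b


def forward(x, shared, ltn):
    """Toy DNN: shared hidden layer, per-speaker LTN, shared output layer."""
    W1, b1, W2, b2 = shared
    h = sigmoid(x @ W1.T + b1)   # shared hidden layer
    h = ltn(h)                   # speaker-dependent module
    return softmax(h @ W2.T + b2)


def anchor_penalty(ltn, anchor, lam):
    """Assumed L2 regularizer pulling a speaker's LTN toward an anchor LTN."""
    diff_A = np.sum((ltn.A - anchor.A) ** 2)
    diff_b = np.sum((ltn.b - anchor.b) ** 2)
    return 0.5 * lam * (diff_A + diff_b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shared = (rng.normal(size=(4, 3)), np.zeros(4),
              rng.normal(size=(2, 4)), np.zeros(2))
    speakers = {"spk_a": LTN(4), "spk_b": LTN(4)}  # one SD module per speaker
    anchor = LTN(4)                                # stand-in for the anchorage
    x = rng.normal(size=(5, 3))
    posteriors = forward(x, shared, speakers["spk_a"])
    print(posteriors.shape, anchor_penalty(speakers["spk_a"], anchor, 0.1))
```

In a full SAT setup, the training loss for each speaker would combine a frame-level classification loss with a penalty of this kind, and gradients would update both that speaker's LTN and the shared parameters. At adaptation time only a new speaker's LTN would be estimated while the shared parameters stay fixed.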
Pages: 4605-4609
Page count: 5