Comparison of Regularization Constraints in Deep Neural Network based Speaker Adaptation

Authors
Shen, Peng [1]
Lu, Xugang [1]
Kawai, Hisashi [1]
Affiliations
[1] Natl Inst Informat & Commun Technol, Tokyo, Japan
Keywords
Elastic net regularization; Lasso constrained regularization; Speaker adaptation; Deep neural networks;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Adaptation of deep neural network (DNN) acoustic models has been shown to significantly improve automatic speech recognition (ASR) performance, but improving the generalization ability of the adapted model remains a challenging problem. In this study, we investigated algorithms that improve model generalization within a parameter regularization framework. Although several regularization algorithms have been proposed, no study has examined the effects of using different regularization constraints in adaptation (e.g., smoothness or sparsity in the parameter space). We investigated regularization constraints in an lp regularization framework, which includes l1 and l2 regularization and several constrained forms of them. We carried out the investigation on a lecture speech recognition task. Our investigation showed that most of the regularization constraints improved performance, but through different parameter updating mechanisms. The regularization constraint that makes the adaptation select only a few model parameters for updating proved the most effective. In addition, combining different regularization constraints yielded further improvements.
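The abstract does not give the exact objective, so the following is only a minimal sketch of how l1, l2, and elastic net constraints differ in effect. It assumes the penalty is applied to the offset between the adapted and the speaker-independent parameters; the function names, the `alpha` mixing weight, and the example values are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: delta[i] = adapted_weight[i] - speaker_independent_weight[i].

def l1_penalty(delta):
    """Sum of absolute offsets (sparsity-inducing, Lasso-style)."""
    return sum(abs(d) for d in delta)

def l2_penalty(delta):
    """Sum of squared offsets (smoothness-inducing, ridge-style)."""
    return sum(d * d for d in delta)

def elastic_net_penalty(delta, alpha=0.5):
    """Convex mix of the two; alpha is a hypothetical mixing weight."""
    return alpha * l1_penalty(delta) + (1.0 - alpha) * l2_penalty(delta)

def soft_threshold(delta, lam):
    """Proximal step for lam * l1_penalty: offsets smaller than lam become 0,
    so only a few parameters end up being updated."""
    out = []
    for d in delta:
        mag = abs(d) - lam
        out.append((mag if d > 0 else -mag) if mag > 0 else 0.0)
    return out

def l2_shrink(delta, lam):
    """Proximal step for lam * l2_penalty: every offset is scaled down,
    but none is set exactly to 0."""
    return [d / (1.0 + 2.0 * lam) for d in delta]

if __name__ == "__main__":
    delta = [0.9, -0.05, 0.02, -0.6, 0.01]  # made-up offsets
    sparse = soft_threshold(delta, 0.1)
    dense = l2_shrink(delta, 0.1)
    print("l1 step updates", sum(1 for v in sparse if v != 0.0), "of", len(delta))
    print("l2 step updates", sum(1 for v in dense if v != 0.0), "of", len(delta))
```

Under these assumptions, the l1 step leaves only the largest offsets non-zero, which corresponds to the "only a few model parameters" behavior the abstract describes, while the l2 step shrinks every offset without removing any.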
Pages: 5