How to train a discriminative front end with stochastic gradient descent and maximum mutual information

Cited by: 0
Authors:
Droppo, J [1 ]
Mahajan, M [1 ]
Gunawardana, A [1 ]
Acero, A [1 ]
Institution:
[1] Microsoft Res, Speech Technol Grp, Redmond, WA USA
Keywords:
DOI: not available
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
This paper presents a general discriminative training method for the front end of an automatic speech recognition system. The SPLICE parameters of the front end are trained using stochastic gradient descent (SGD) of a maximum mutual information (MMI) objective function. SPLICE is chosen for its ability to approximate both linear and non-linear transformations of the feature space. SGD is chosen for its simplicity of implementation. Results are presented on both the Aurora 2 small vocabulary task and the WSJ Nov-92 medium vocabulary task. It is shown that the discriminative front end is able to consistently increase system accuracy across different front end configurations and tasks.
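The recipe the abstract describes — a piecewise feature transform whose parameters are trained by stochastic gradient descent on a maximum-mutual-information criterion — can be illustrated with a toy numpy sketch. This is not the paper's implementation: the SPLICE transform is reduced to per-region bias offsets selected by fixed Gaussian region posteriors, the speech recognizer is replaced by a fixed two-class Gaussian classifier, and all data, dimensions, and hyperparameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two classes in 2-D; the "noisy" features are the clean ones
# plus a systematic distortion the front end must learn to undo.
# All names and values here are illustrative, not taken from the paper.
n, d, n_classes, n_regions = 200, 2, 2, 4
mu = np.array([[0.0, 0.0], [3.0, 3.0]])        # fixed back-end class means
labels = rng.integers(0, n_classes, size=n)
clean = mu[labels] + rng.normal(scale=0.5, size=(n, d))
noisy = clean + np.array([1.5, 1.5])           # systematic distortion

# SPLICE-style piecewise transform: y = x + sum_k p(k|x) b_k, with region
# posteriors p(k|x) from fixed unit-variance Gaussians over feature space.
centers = rng.normal(scale=2.0, size=(n_regions, d))
b = np.zeros((n_regions, d))                   # trainable offsets

def region_posteriors(x):
    logits = -0.5 * ((x[:, None, :] - centers[None]) ** 2).sum(-1)
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def transform(x, b):
    return x + region_posteriors(x) @ b

def accuracy(x, y_true, b):
    y = transform(x, b)
    pred = ((y[:, None, :] - mu[None]) ** 2).sum(-1).argmin(axis=1)
    return float((pred == y_true).mean())

def mmi_loss_and_grad(x, y_true, b):
    """Negative log posterior of the correct class (an MMI-style criterion
    for a unit-variance Gaussian back end) and its exact gradient w.r.t. b.
    The region posteriors depend only on x, so they are constants here."""
    p = region_posteriors(x)
    y = x + p @ b
    logits = -0.5 * ((y[:, None, :] - mu[None]) ** 2).sum(-1)
    logits -= logits.max(axis=1, keepdims=True)
    q = np.exp(logits)
    q /= q.sum(axis=1, keepdims=True)
    loss = -np.mean(np.log(q[np.arange(len(x)), y_true] + 1e-12))
    delta = q.copy()
    delta[np.arange(len(x)), y_true] -= 1.0     # dloss/dlogits
    # dlogits_c/dy = -(y - mu_c)  =>  dloss/dy = -sum_c delta_c (y - mu_c)
    dy = -(delta[:, :, None] * (y[:, None, :] - mu[None])).sum(axis=1)
    return loss, p.T @ dy / len(x)              # chain rule through p @ b

acc_before = accuracy(noisy, labels, b)

# Stochastic gradient descent: minibatch updates of the front-end offsets.
for step in range(1000):
    idx = rng.choice(n, size=32, replace=False)
    _, grad = mmi_loss_and_grad(noisy[idx], labels[idx], b)
    b -= 0.5 * grad

acc_after = accuracy(noisy, labels, b)
print(f"accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

The key property mirrored from the paper is that the transform is discriminative: the offsets are driven by the classifier's posterior error signal rather than by reconstructing clean features, so training improves classification accuracy directly.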
Pages: 41-46
Page count: 6
Related papers (30 in total):
  • [1] ON STOCHASTIC GRADIENT DESCENT AND QUADRATIC MUTUAL INFORMATION FOR IMAGE REGISTRATION
    Singh, Abhishek
    Ahuja, Narendra
    [J]. 2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 1326 - 1330
  • [2] Stochastic gradient identification of Wiener system with maximum mutual information criterion
    Chen, B.
    Zhu, Y.
    Hu, J.
    Principe, J. C.
    [J]. IET SIGNAL PROCESSING, 2011, 5 (06) : 589 - 597
  • [3] Train faster, generalize better: Stability of stochastic gradient descent
    Hardt, Moritz
    Recht, Benjamin
    Singer, Yoram
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [4] DISCRIMINATIVE FEATURE TRANSFORMS USING DIFFERENCED MAXIMUM MUTUAL INFORMATION
    Delcroix, Marc
    Ogawa, Atsunori
    Watanabe, Shinji
    Nakatani, Tomohiro
    Nakamura, Atsushi
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4753 - 4756
  • [5] Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks
    Vasudevan, Shrihari
    [J]. ENTROPY, 2020, 22 (05)
  • [6] MUTUAL-INFORMATION-PRIVATE ONLINE GRADIENT DESCENT ALGORITHM
    Zhang, Ruochi
    Venkitasubramaniam, Parv
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 2077 - 2081
  • [7] Discriminative training of GMM based on Maximum Mutual Information for language identification
    Qu Dan
    Wang Bingxi
    Yan Honggang
    Dai Guannan
    [J]. WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 1576 - +
  • [8] On Maximum a Posteriori Estimation with Plug & Play Priors and Stochastic Gradient Descent
    Laumont, Remi
    De Bortoli, Valentin
    Almansa, Andres
    Delon, Julie
    Durmus, Alain
    Pereyra, Marcelo
    [J]. JOURNAL OF MATHEMATICAL IMAGING AND VISION, 2023, 65 (1) : 140 - 163
  • [9] Gradient of the Mutual Information in Stochastic Systems: A Functional Approach
    Sedighizad, Mahboobeh
    Seyfe, Babak
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (10) : 1521 - 1525