Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks

Cited by: 37
Authors
Sainath, Tara N. [1]
Kingsbury, Brian [1]
Soltau, Hagen [1]
Ramabhadran, Bhuvana [2]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10567 USA
[2] IBM Res, Multilingual Analyt, Yorktown Hts, NY 10598 USA
Keywords
Speech recognition; deep neural networks; parallel optimization techniques
DOI
10.1109/TASL.2013.2284378
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
While Deep Neural Networks (DNNs) have achieved tremendous success on large vocabulary continuous speech recognition (LVCSR) tasks, training these networks is slow. To date, the most common approach to training DNNs is stochastic gradient descent, run serially on one machine. Serial training, coupled with the large number of training parameters (i.e., 10-50 million) and speech data set sizes (i.e., 20-100 million training points), makes DNN training very slow for LVCSR tasks. In this work, we explore a variety of optimization techniques to improve DNN training speed. These include parallelization of the gradient computation during cross-entropy and sequence training, as well as reducing the number of parameters in the network using a low-rank matrix factorization. Applying the proposed optimization techniques, we show that DNN training can be sped up by a factor of 3 on a 50-hour English Broadcast News (BN) task with no loss in accuracy. Furthermore, using the proposed techniques, we are able to train DNNs on a 300-hour Switchboard (SWB) task and a 400-hour English BN task, showing relative improvements of 9-30% over a state-of-the-art GMM/HMM system while the DNN has fewer parameters than the GMM/HMM system.
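The parameter reduction mentioned in the abstract can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes a single dense weight matrix of shape n_in x n_out is replaced by two factors of shapes n_in x rank and rank x n_out. The layer sizes and the rank are hypothetical values chosen for illustration, and the helper names are likewise made up here.

```python
def dense_params(n_in, n_out):
    """Weights in a full n_in x n_out matrix (biases ignored)."""
    return n_in * n_out


def low_rank_params(n_in, n_out, rank):
    """Weights in the two factors A (n_in x rank) and B (rank x n_out)."""
    return n_in * rank + rank * n_out


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical sizes (assumptions, not taken from the paper).
n_in, n_out, rank = 1024, 6000, 128
full = dense_params(n_in, n_out)
factored = low_rank_params(n_in, n_out, rank)
saving = 1.0 - factored / full
print(f"full: {full}, factored: {factored}, saving: {saving:.1%}")

# Tiny check that applying the factors in sequence, (x A) B,
# gives the same output as applying their product, x (A B).
x = [[1.0, 2.0, 3.0]]             # one input row, 3 features
A = [[1.0], [0.0], [2.0]]         # 3 x 1 factor
B = [[1.0, -1.0, 0.5, 2.0]]       # 1 x 4 factor
two_step = matmul(matmul(x, A), B)
one_step = matmul(x, matmul(A, B))
print(two_step == one_step)
```

Under these assumed sizes the factored layer stores fewer weights whenever rank < n_in * n_out / (n_in + n_out), which is roughly 874 here; whether a given rank preserves accuracy is an empirical question that this sketch does not address.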
Pages: 2267-2276
Page count: 10
Related papers
50 in total
  • [1] Deep Convolutional Neural Networks for Large-scale Speech Tasks
    Sainath, Tara N.
    Kingsbury, Brian
    Saon, George
    Soltau, Hagen
    Mohamed, Abdel-rahman
    Dahl, George
    Ramabhadran, Bhuvana
    NEURAL NETWORKS, 2015, 64 : 39 - 48
  • [2] Training Maxout Neural Networks for Speech Recognition Tasks
    Prudnikov, Aleksey
    Korenevsky, Maxim
    TEXT, SPEECH, AND DIALOGUE, 2016, 9924 : 443 - 451
  • [3] A COMPARISON OF TWO OPTIMIZATION TECHNIQUES FOR SEQUENCE DISCRIMINATIVE TRAINING OF DEEP NEURAL NETWORKS
    Saon, George
    Soltau, Hagen
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [4] Noisy training for deep neural networks in speech recognition
    Shi Yin
    Chao Liu
    Zhiyong Zhang
    Yiye Lin
    Dong Wang
    Javier Tejedor
    Thomas Fang Zheng
    Yinguo Li
    EURASIP Journal on Audio, Speech, and Music Processing, 2015
  • [5] Noisy training for deep neural networks in speech recognition
    Yin, Shi
    Liu, Chao
    Zhang, Zhiyong
    Lin, Yiye
    Wang, Dong
    Tejedor, Javier
    Zheng, Thomas Fang
    Li, Yinguo
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2015, : 1 - 14
  • [6] FAST TRAINING OF DEEP NEURAL NETWORKS FOR SPEECH RECOGNITION
    Gong, Guojing
    Kingsbury, Brian
    Yang, Chih-Chieh
    Liu, Tianyi
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6884 - 6888
  • [7] An introduction to distributed training of deep neural networks for segmentation tasks with large seismic data sets
    Birnie, Claire
    Jarraya, Haithem
    Hansteen, Fredrik
    GEOPHYSICS, 2021, 86 (06) : KS151 - KS160
  • [8] An Optimization Strategy for Deep Neural Networks Training
    Wu, Tingting
    Zeng, Peng
    Song, Chunhe
    2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 596 - 603
  • [9] An Analysis of Instance Selection for Neural Networks to Improve Training Speed
    Sun, Xunhu
    Chan, Philip K.
    2014 13TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2014, : 288 - 293
  • [10] Parallel nonlinear optimization techniques for training neural networks
    Phua, PKH
    Ming, DH
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2003, 14 (06): : 1460 - 1468