Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks

Cited by: 37
Authors
Sainath, Tara N. [1 ]
Kingsbury, Brian [1 ]
Soltau, Hagen [1 ]
Ramabhadran, Bhuvana [2 ]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10567 USA
[2] IBM Res, Multilingual Analyt, Yorktown Hts, NY 10598 USA
Keywords
Speech recognition; deep neural networks; parallel optimization techniques
DOI
10.1109/TASL.2013.2284378
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
While Deep Neural Networks (DNNs) have achieved tremendous success on large vocabulary continuous speech recognition (LVCSR) tasks, training these networks is slow. Even today, the most common approach to training DNNs is stochastic gradient descent, run serially on a single machine. Serial training, coupled with the large number of parameters (10-50 million) and the size of speech data sets (20-100 million training points), makes DNN training very slow for LVCSR tasks. In this work, we explore a variety of optimization techniques to improve DNN training speed. These include parallelizing the gradient computation during cross-entropy and sequence training, as well as reducing the number of parameters in the network through a low-rank matrix factorization. Applying the proposed optimization techniques, we show that DNN training can be sped up by a factor of 3 on a 50-hour English Broadcast News (BN) task with no loss in accuracy. Furthermore, using the proposed techniques, we are able to train DNNs on a 300-hour Switchboard (SWB) task and a 400-hour English BN task, obtaining relative improvements of 9-30% over a state-of-the-art GMM/HMM system while using fewer parameters than the GMM/HMM system.
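Of the techniques named in the abstract, the low-rank factorization is easy to sketch concretely: the final hidden-to-output weight matrix, which dominates the parameter count when the output layer covers thousands of context-dependent states, is replaced by the product of two much smaller matrices through a rank-r linear bottleneck. Below is a minimal, hypothetical PyTorch sketch of that idea; the layer sizes and rank are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    # Minimal sketch of a low-rank output layer (illustrative sizes, not
    # the paper's exact configuration). The full hidden-to-output matrix
    # W (hidden_dim x num_targets) is replaced by two smaller matrices
    # via a rank-r bottleneck, which pays off whenever
    # rank * (hidden_dim + num_targets) < hidden_dim * num_targets.
    hidden_dim, num_targets, rank = 1024, 5000, 128

    full_softmax = nn.Linear(hidden_dim, num_targets)   # ~5.1M weights
    low_rank_softmax = nn.Sequential(
        nn.Linear(hidden_dim, rank, bias=False),        # 1024 * 128 weights
        nn.Linear(rank, num_targets),                   # 128 * 5000 weights (+ biases)
    )

    x = torch.randn(8, hidden_dim)       # a batch of hidden-layer activations
    logits = low_rank_softmax(x)         # same output shape as full_softmax(x)
    print(logits.shape)                  # torch.Size([8, 5000])

With these illustrative sizes, the factored layer holds roughly 0.77M weights against about 5.1M for the full layer, and since the same matrices appear in the backward pass, the per-minibatch gradient computation shrinks accordingly.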
Pages: 2267-2276 (10 pages)
Related papers (50 in total)
  • [21] Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
    Wu, Jibin
    Yilmaz, Emre
    Zhang, Malu
    Li, Haizhou
    Tan, Kay Chen
    FRONTIERS IN NEUROSCIENCE, 2020, 14
  • [22] Retrospective Loss: Looking Back to Improve Training of Deep Neural Networks
    Jandial, Surgan
    Chopra, Ayush
    Sarkar, Mausoom
    Gupta, Piyush
    Krishnamurthy, Balaji
    Balasubramanian, Vineeth
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 1123 - 1131
  • [23] SEQUENCE TRAINING OF MULTIPLE DEEP NEURAL NETWORKS FOR BETTER PERFORMANCE AND FASTER TRAINING SPEED
    Zhou, Pan
    Dai, Lirong
    Jiang, Hui
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [24] A Comparison of Deep Neural Network Training Methods for Large Vocabulary Speech Recognition
    Toth, Laszlo
    Grosz, Tamas
    TEXT, SPEECH, AND DIALOGUE, TSD 2013, 2013, 8082 : 36 - 43
  • [25] Speed Up Training of the Recurrent Neural Network Based on Constrained Optimization Techniques
    Chen, Ke
    Bao, Weiquan
    Chi, Huisheng
    Journal of Computer Science & Technology, 1996, (06) : 581 - 588
  • [27] The Representation of Speech in Deep Neural Networks
    Scharenborg, Odette
    van der Gouw, Nikki
    Larson, Martha
    Marchiori, Elena
    MULTIMEDIA MODELING, MMM 2019, PT II, 2019, 11296 : 194 - 205
  • [29] Advanced metaheuristic optimization techniques in applications of deep neural networks: a review
    Abd Elaziz, Mohamed
    Dahou, Abdelghani
    Abualigah, Laith
    Yu, Liyang
    Alshinwan, Mohammad
    Khasawneh, Ahmad M.
    Lu, Songfeng
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (21): 14079 - 14099
  • [30] Optimization Techniques for Conversion of Quantization Aware Trained Deep Neural Networks to Lightweight Spiking Neural Networks
    Lee, Kyungchul
    Choi, Sunghyun
    Lew, Dongwoo
    Park, Jongsun
    2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC), 2021