MULTIFRAME DEEP NEURAL NETWORKS FOR ACOUSTIC MODELING

Authors
Vanhoucke, Vincent [1]
Devin, Matthieu [1]
Heigold, Georg [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
Keywords
deep neural networks; acoustic modeling;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Deep neural networks have been shown to perform very well as acoustic models for automatic speech recognition. Compared to Gaussian mixtures, however, they tend to be computationally expensive, which makes them challenging to use in real-time applications. One key advantage of such neural networks is their ability to learn from very long observation windows of up to 400 ms. Given this long temporal context, it is natural to ask whether one can run neural networks at a lower frame rate than the typical one frame every 10 ms, and whether doing so might bring computational benefits. This paper describes a method of tying the neural network parameters over time that achieves performance comparable to the typical frame-synchronous model while reducing the computational cost of the neural network activations by up to 4x.
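The abstract does not spell out the architecture, so the following is only a minimal sketch of one possible reading of "tying the parameters over time": a shared hidden stack is evaluated once every K frames, and K separate output heads produce posteriors for K consecutive 10 ms frames. All names (`hidden_forward`, `multiframe_posteriors`, `output_heads`), the layer sizes, the ReLU activation, and the choice K = 4 are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def hidden_forward(x, weights):
    """Run the shared hidden stack on one stacked input window (ReLU is an assumption)."""
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)
    return h

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def multiframe_posteriors(windows, hidden_weights, output_heads):
    """Evaluate the hidden stack once every K frames and emit K posteriors each time.

    windows: array of shape (T, D), one stacked context window per 10 ms frame.
    output_heads: list of K matrices, one per predicted frame offset (hypothetical layout).
    Returns the posteriors, shape (T, S), and the number of hidden-stack evaluations.
    """
    k = len(output_heads)
    t_total = windows.shape[0]
    n_states = output_heads[0].shape[1]
    post = np.zeros((t_total, n_states))
    evals = 0
    for t in range(0, t_total, k):
        h = hidden_forward(windows[t], hidden_weights)  # shared across K frames
        evals += 1
        for j, head in enumerate(output_heads):
            if t + j < t_total:
                post[t + j] = softmax(h @ head)
    return post, evals

# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
T, D, H, S, K = 16, 40, 32, 10, 4
hidden_weights = [rng.standard_normal((D, H)) * 0.1,
                  rng.standard_normal((H, H)) * 0.1]
output_heads = [rng.standard_normal((H, S)) * 0.1 for _ in range(K)]
windows = rng.standard_normal((T, D))

post, evals = multiframe_posteriors(windows, hidden_weights, output_heads)
print(f"{evals} hidden-stack evaluations for {T} frames")
```

In this sketch the hidden stack runs T / K times instead of T times, while the output heads are still evaluated once per frame. That is one way to see why a saving would be bounded by K ("up to" 4x for K = 4) rather than equal to it.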
Pages: 7582 - 7585
Page count: 4