MULTILINGUAL ACOUSTIC MODELS USING DISTRIBUTED DEEP NEURAL NETWORKS

Cited by: 0
Authors
Heigold, G. [1]
Vanhoucke, V. [1]
Senior, A. [1]
Nguyen, P. [1]
Ranzato, M. [1]
Devin, M. [1]
Dean, J. [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
Keywords
Speech recognition; parameter sharing; deep neural networks; multilingual training; distributed neural networks
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Today's speech recognition technology is mature enough to be useful for many practical applications. In this context, it is of paramount importance to train accurate acoustic models for many languages within given resource constraints such as data, processing power, and time. Multilingual training has the potential to solve the data issue and close the performance gap between resource-rich and resource-scarce languages. Neural networks lend themselves naturally to parameter sharing across languages, and distributed implementations have made it feasible to train large networks. In this paper, we present experimental results for cross- and multilingual network training of eleven Romance languages on 10k hours of data in total. The average relative gains over the monolingual baselines are 4%/2% (data-scarce/data-rich languages) for crosslingual and 7%/2% for multilingual training. However, the additional gain from jointly training the languages on all data comes at an increased training time of roughly four weeks, compared to two weeks (monolingual) and one week (crosslingual).
Pages: 8619-8623
Page count: 5