MULTITASK LEARNING AND SYSTEM COMBINATION FOR AUTOMATIC SPEECH RECOGNITION

Authors
Siohan, Olivier [1]
Rybach, David [1]
Affiliations
[1] Google Inc, New York, NY 10011 USA
Keywords
system combination; multitask learning; children's speech; ROVER;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we investigate the performance of an ensemble of convolutional, long short-term memory, deep neural networks (CLDNNs) on a large-vocabulary speech recognition task. To reduce the computational cost of running multiple recognizers in parallel, we instead propose an early system combination approach, which requires constructing a static decoding network that encodes the multiple context-dependent state inventories of the distinct acoustic models. To further reduce the computational load, the hidden units of those models can be shared while the output layers are kept distinct, leading to a multitask training formulation. However, in contrast to traditional multitask training, our formulation uses all predicted outputs, leading to a multitask system combination strategy. We present results on a Voice Search task designed for children; the proposed approach outperforms our current production system.
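The shared-hidden-layer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes CLDNN acoustic models and a static decoding network, whereas this toy uses a single tanh layer as the shared trunk and does no decoding. All names (`SharedTrunkModel`, `multitask_loss`), layer sizes, and the choice of a summed cross-entropy loss are assumptions made for illustration. The point it shows is that each output head predicts over its own state inventory, and that the posteriors of every head are returned for later use rather than only those of a primary task.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedTrunkModel:
    """Toy model: one shared hidden layer, one output head per state inventory."""

    def __init__(self, n_in, n_hidden, head_sizes, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, n_hidden, n_in)
        self.heads = [init_layer(rng, size, n_hidden) for size in head_sizes]

    def forward(self, x):
        # The hidden activations are computed once and reused by every head.
        hidden = [math.tanh(h) for h in linear(*self.trunk, x)]
        # Return the posteriors of ALL heads, not just a primary one.
        return [softmax(linear(*head, hidden)) for head in self.heads]


def multitask_loss(posteriors, targets):
    """Sum of per-head cross-entropies (an assumed, unweighted combination)."""
    return sum(-math.log(p[t]) for p, t in zip(posteriors, targets))


# Two heads over hypothetical state inventories of size 5 and 7.
model = SharedTrunkModel(n_in=4, n_hidden=8, head_sizes=[5, 7])
posteriors = model.forward([0.2, -0.1, 0.4, 0.0])
loss = multitask_loss(posteriors, targets=[2, 6])
```

In this sketch the cost saving comes from evaluating the trunk once per input frame; only the comparatively small output layers are duplicated. How the per-head posteriors are then consumed by a decoder is outside its scope.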
Pages: 589-595
Page count: 7