MULTI-LINGUAL SPEECH RECOGNITION WITH LOW-RANK MULTI-TASK DEEP NEURAL NETWORKS

Cited by: 0
Authors
Mohan, Aanchan [1]
Rose, Richard [1]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
Keywords
Low-resource speech recognition; Multi-lingual speech recognition; Neural networks for speech recognition; Multi-task learning
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Multi-task learning (MTL) for deep neural network (DNN) multilingual acoustic models has been shown to be effective for learning parameters that are common or shared between multiple languages [1, 2]. In the MTL paradigm, the number of parameters in the output layer is large and scales with the number of languages used in training, so this output layer becomes a computational bottleneck. For mono-lingual DNNs, low-rank matrix factorization (LRMF) of weight matrices has yielded large computational savings [3, 4]. The LRMF proposed in this work for MTL has the original language-specific block matrices "share" a common matrix, resulting in low-rank language-specific block matrices. The impact of LRMF is presented in two scenarios: (a) improving performance in a target language when auxiliary languages are included during multi-lingual training; and (b) cross-language transfer to an unseen language with only 1 hour of transcribed training data. A 44% parameter reduction in the final layer provides a lower memory footprint and faster training times. An experimental study shows that the LRMF multi-lingual DNN provides competitive performance compared to a full-rank multi-lingual DNN in both scenarios.
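The shared-matrix idea in the abstract can be illustrated with a minimal NumPy sketch. It rests on one reading of the abstract that the record itself does not spell out: each language-specific output block W_l (hidden_dim x n_l) is replaced by the product of one shared matrix A (hidden_dim x rank) and a smaller language-specific block B_l (rank x n_l). All names and dimensions below are hypothetical and chosen only for illustration; they are not the paper's configuration and are not meant to reproduce its 44% figure.

```python
import numpy as np

def full_rank_params(hidden_dim, outputs_per_language):
    # One full (hidden_dim x n_l) block per language in the output layer.
    return hidden_dim * sum(outputs_per_language)

def shared_low_rank_params(hidden_dim, rank, outputs_per_language):
    # One shared (hidden_dim x rank) matrix plus a (rank x n_l) block per language.
    return hidden_dim * rank + rank * sum(outputs_per_language)

def shared_low_rank_logits(hidden, shared, blocks):
    # Project once through the shared matrix, then through each language block.
    projected = hidden @ shared
    return [projected @ block for block in blocks]

# Hypothetical sizes (assumptions, not taken from the paper).
hidden_dim, rank = 1024, 256
outputs_per_language = [1500, 1200, 900]

rng = np.random.default_rng(0)
shared = rng.standard_normal((hidden_dim, rank))
blocks = [rng.standard_normal((rank, n)) for n in outputs_per_language]
hidden = rng.standard_normal((8, hidden_dim))  # a batch of 8 hidden vectors

logits = shared_low_rank_logits(hidden, shared, blocks)
full = full_rank_params(hidden_dim, outputs_per_language)
low = shared_low_rank_params(hidden_dim, rank, outputs_per_language)
reduction = 1.0 - low / full

print([l.shape for l in logits])
print(f"full-rank: {full}, shared low-rank: {low}, reduction: {reduction:.1%}")
```

Under this reading, the cost of the shared matrix is paid once, so the saving grows as more languages are added, provided the rank is well below both the hidden dimension and the per-language output sizes.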
Pages: 4994-4998
Number of pages: 5