Multilingual Data Selection For Low Resource Speech Recognition

Cited by: 9
Authors
Thomas, Samuel [1]
Audhkhasi, Kartik [1]
Cui, Jia [1]
Kingsbury, Brian [1]
Ramabhadran, Bhuvana [1]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Multilingual features; acoustic models; deep neural networks; low resource speech recognition
DOI
10.21437/Interspeech.2016-598
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Feature representations extracted from deep neural network based multilingual frontends provide significant improvements to speech recognition systems in low resource settings. To effectively train these frontends, we introduce a data selection technique that discovers language groups from an available set of training languages. This data selection method reduces the required amount of training data and training time by approximately 40%, with minimal performance degradation. We present speech recognition results on 7 very limited language pack (VLLP) languages from the second option period of the IARPA Babel program using multilingual features trained on up to 10 languages. The proposed multilingual features provide up to 15% relative improvement over baseline acoustic features on the VLLP languages.
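As a rough illustration of what a "15% relative improvement" means, the sketch below computes a relative error reduction. It assumes the metric is word error rate (WER), which the abstract does not state, and the WER values are hypothetical placeholders, not results from the paper.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative error reduction as a fraction of the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (percent), chosen only to illustrate the arithmetic;
# they are NOT taken from the paper.
baseline_wer = 60.0      # assumed system using baseline acoustic features
multilingual_wer = 51.0  # assumed system using multilingual features

rel = relative_improvement(baseline_wer, multilingual_wer)
print(f"Relative improvement: {rel:.1%}")
```

Under these assumed numbers, an absolute drop of 9 WER points from a baseline of 60 corresponds to a 15% relative improvement.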
Pages: 3853-3857
Page count: 5
Related papers
  • [1] MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION
    Li, Xinjian
    Mortensen, David R.
    Metze, Florian
    Black, Alan W.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6958 - 6962
  • [2] ADVERSARIAL MULTILINGUAL TRAINING FOR LOW-RESOURCE SPEECH RECOGNITION
    Yi, Jiangyan
    Tao, Jianhua
    Wen, Zhengqi
    Bai, Ye
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4899 - 4903
  • [3] MULTILINGUAL REPRESENTATIONS FOR LOW RESOURCE SPEECH RECOGNITION AND KEYWORD SEARCH
    Cui, Jia
    Kingsbury, Brian
    Ramabhadran, Bhuvana
    Sethy, Abhinav
    Audhkhasi, Kartik
    Cui, Xiaodong
    Kislal, Ellen
    Mangu, Lidia
    Nussbaum-Thom, Markus
    Picheny, Michael
    Tueske, Zoltan
    Golik, Pavel
    Schlueter, Ralf
    Ney, Hermann
    Gales, Mark J. F.
    Knill, Kate M.
    Ragni, Anton
    Wang, Haipeng
    Woodland, Phil
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 259 - 266
  • [4] Adaptive Activation Network for Low Resource Multilingual Speech Recognition
    Luo, Jian
    Wang, Jianzong
    Cheng, Ning
    Zheng, Zhenpeng
    Xiao, Jing
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [5] Multilingual acoustic models for speech recognition in low-resource devices
    Garcia, Enrique Gil
    Mengusoglu, Erhan
    Janke, Eric
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 981 - +
  • [6] Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition
    Xiao, Yubei
    Gong, Ke
    Zhou, Pan
    Zheng, Guolin
    Liang, Xiaodan
    Lin, Liang
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14112 - 14120
  • [7] Web Data Selection Based on Word Embedding for Low-Resource Speech Recognition
    Xie, Chuandong
    Guo, Wu
    Hu, Guoping
    Liu, Junhua
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1340 - 1344
  • [8] Multilingual Meta-Transfer Learning for Low-Resource Speech Recognition
    Zhou, Rui
    Koshikawa, Takaki
    Ito, Akinori
    Nose, Takashi
    Chen, Chia-Ping
    [J]. IEEE Access, 2024, 12 : 158493 - 158504
  • [9] Articulatory Feature based Multilingual MLPs for Low-Resource Speech Recognition
    Qian, Yanmin
    Liu, Jia
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2601 - 2604
  • [10] Optimized data selection strategy based unsupervised acoustic modeling for low data resource speech recognition
    Qian, Yanmin
    Liu, Jia
    [J]. Qinghua Daxue Xuebao/Journal of Tsinghua University, 2013, 53 (07): : 1001 - 1004