Multilingual Data Selection For Low Resource Speech Recognition

Cited by: 9
Authors
Thomas, Samuel [1]
Audhkhasi, Kartik [1]
Cui, Jia [1]
Kingsbury, Brian [1]
Ramabhadran, Bhuvana [1]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Multilingual features; acoustic models; deep neural networks; low resource speech recognition
DOI
10.21437/Interspeech.2016-598
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Feature representations extracted from deep neural network based multilingual frontends provide significant improvements to speech recognition systems in low resource settings. To effectively train these frontends, we introduce a data selection technique that discovers language groups from an available set of training languages. This data selection method reduces the required amount of training data and training time by approximately 40%, with minimal performance degradation. We present speech recognition results on 7 very limited language pack (VLLP) languages from the second option period of the IARPA Babel program using multilingual features trained on up to 10 languages. The proposed multilingual features provide up to 15% relative improvement over baseline acoustic features on the VLLP languages.
Pages: 3853-3857
Number of pages: 5