SEMI-SUPERVISED TRAINING IN LOW-RESOURCE ASR AND KWS

Cited by: 0
Authors
Metze, Florian [1, 2]
Gandhe, Ankur [1, 2]
Miao, Yajie [1, 2]
Sheikh, Zaid [1, 2]
Wang, Yun [1, 2]
Xu, Di [1, 2]
Zhang, Hao [1, 2]
Kim, Jungsuk [3, 4]
Lane, Ian [3, 4]
Lee, Won Kyum [3, 4]
Stueker, Sebastian [5]
Mueller, Markus [5]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Language Technol Inst, Moffett Field, CA USA
[3] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
[4] Carnegie Mellon Univ, Dept Elect & Comp Engn, Moffett Field, CA USA
[5] Karlsruhe Inst Technol, Karlsruhe, Germany
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Funding
U.S. National Science Foundation
Keywords
spoken term detection; automatic speech recognition; low-resource LTs; semi-supervised training; RECOGNITION;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In particular for "low resource" Keyword Search (KWS) and Speech-to-Text (STT) tasks, more untranscribed test data may be available than training data. Several approaches have been proposed to make this data useful during system development, even when initial systems have Word Error Rates (WER) above 70%. In this paper, we present a set of experiments on low-resource languages in telephony speech quality in Assamese, Bengali, Lao, Haitian, Zulu, and Tamil, demonstrating the impact that such techniques can have, in particular learning robust bottle-neck features on the test data. In the case of Tamil, when significantly more test data than training data is available, we integrated semi-supervised training and speaker adaptation on the test data, and achieved significant additional improvements in STT and KWS.
Pages: 4699-4703
Page count: 5