IMPROVED NOISY ITERATIVE PSEUDO-LABELING FOR SEMI-SUPERVISED SPEECH RECOGNITION

Cited by: 1
Authors
Li, Tian [1]
Meng, Qingliang [1]
Sun, Yujian [1]
Affiliations
[1] Shumei AI Res Inst, Beijing, People's Republic of China
Keywords
pseudo-labeling; semi-supervised learning; end-to-end speech recognition; deep learning
DOI
10.1109/SLT54892.2023.10022417
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because annotation is expensive in automatic speech recognition (ASR), semi-supervised training has become an active topic in both research and industry. Many recent studies have shown that pseudo-labeling (PL), a fundamental sub-direction of semi-supervised learning, is effective in ASR. With iterative PL, however, the cost of data experiments is prohibitively high, which makes it hard to extend the approach to diverse ASR tasks. In this paper, we propose an empirical scoring method based on hypothesis distribution testing to guide iterative PL training, thereby lowering the cost of data experiments and boosting ASR performance. We also conducted extensive experiments to determine the necessity and limitations of model perturbation in the initial training and PL stages. On the LibriSpeech 100/860 task, our method improves the performance of the 12+6 transformer-based CTC+S2S architecture from 4.8%/10.1% to 3.9%/9.6% on test-clean and test-other.
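The reported gains are easier to compare as relative reductions. The following is a small illustrative calculation, not part of the paper; it assumes the quoted percentages are word error rates (the abstract does not name the metric), and the helper `relative_reduction` is a hypothetical name introduced here.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction, in percent, when going from baseline to improved."""
    return (baseline - improved) / baseline * 100.0


# Figures (in percent) quoted in the abstract; assumed here to be error rates.
results = {
    "test-clean": (4.8, 3.9),
    "test-other": (10.1, 9.6),
}

for split, (before, after) in results.items():
    rel = relative_reduction(before, after)
    print(f"{split}: {before}% -> {after}% ({rel:.2f}% relative reduction)")
```

Under that assumption, the improvement is roughly 19% relative on test-clean and roughly 5% relative on test-other, so most of the gain is on the cleaner test set.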
Pages: 167-173
Page count: 7