IMPROVED NOISY ITERATIVE PSEUDO-LABELING FOR SEMI-SUPERVISED SPEECH RECOGNITION

Cited by: 1
Authors
Li, Tian [1]
Meng, Qingliang [1]
Sun, Yujian [1]
Affiliations
[1] Shumei AI Res Inst, Beijing, Peoples R China
Keywords
pseudo-labeling; semi-supervised learning; end-to-end speech recognition; deep learning
DOI
10.1109/SLT54892.2023.10022417
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because annotation is expensive in ASR, semi-supervised training has become an active topic in both research and industry. Many recent studies have established that pseudo-labeling (PL), a fundamental sub-direction of semi-supervised learning, is effective in ASR. However, when iterative PL is used, the cost of data experiments is prohibitively high, which makes it difficult to extend the approach to diverse ASR tasks. In this paper, we propose an empirical scoring method based on hypothesis distribution testing to guide iterative PL training, thereby lowering the cost of data experiments and improving ASR performance. We also conduct extensive experiments to determine the necessity and limitations of model perturbation in the initial training and PL stages. On the LibriSpeech 100/860 task, our method improves the performance of the 12+6 transformer-based CTC+S2S architecture from 4.8%/10.1% to 3.9%/9.6% on test-clean and test-other.
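To put the reported figures in perspective, the short sketch below computes the relative change for each test set. It assumes the percentages are error rates (lower is better), which the abstract implies by describing the decrease as an improvement; the helper name `relative_reduction` and the `results` dictionary are illustrative and do not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fraction by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Figures quoted in the abstract, in percent.
# Assumption: these are error rates, so a lower value is better.
results = {
    "test-clean": (4.8, 3.9),
    "test-other": (10.1, 9.6),
}

for split, (before, after) in results.items():
    rel = relative_reduction(before, after)
    print(f"{split}: {before}% -> {after}% ({rel:.1%} relative reduction)")
```

Under that assumption, the gain is roughly 19% relative on test-clean and roughly 5% relative on test-other, so most of the benefit appears on the cleaner test set.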
Pages: 167-173
Page count: 7