IMPROVED NOISY ITERATIVE PSEUDO-LABELING FOR SEMI-SUPERVISED SPEECH RECOGNITION

Cited by: 1
Authors
Li, Tian [1]
Meng, Qingliang [1]
Sun, Yujian [1]
Affiliations
[1] Shumei AI Res Inst, Beijing, Peoples R China
Keywords
pseudo-labeling; semi-supervised learning; end-to-end speech recognition; deep learning;
DOI
10.1109/SLT54892.2023.10022417
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Because annotation is costly in ASR, semi-supervised training has been an active topic in both research and industry. Many recent studies have shown that pseudo-labeling (PL), a fundamental sub-direction of semi-supervised learning, is effective in ASR. However, iterative PL makes data experiments prohibitively expensive, which makes it difficult to extend to diverse ASR tasks. In this paper, we propose an empirical scoring method based on hypothesis distribution testing to guide iterative PL training, thereby lowering the cost of data experiments and boosting ASR performance. We also conducted extensive experiments to determine the necessity and limitations of model perturbation in the initial training and PL stages. On the Librispeech 100/860 task, our method improves the performance of the 12+6 transformer-based CTC+S2S architecture from 4.8%/10.1% to 3.9%/9.6% on test-clean/test-other, respectively.
Pages: 167-173
Number of pages: 7
Related papers
50 in total
  • [31] Pseudo-Labeling Optimization Based Ensemble Semi-Supervised Soft Sensor in the Process Industry
    Li, Youwei
    Jin, Huaiping
    Dong, Shoulong
    Yang, Biao
    Chen, Xiangguang
    SENSORS, 2021, 21 (24)
  • [32] Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
    Zhang, Jiacheng
    Li, Jiaming
    Lin, Xiangru
    Zhang, Wei
    Tang, Xiao
    Hang, Junyu
    Ding, Errui
    Wang, Jingdong
    Li, Guanbin
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 16923 - 16932
  • [33] FaxMatch: Multi-Curriculum Pseudo-Labeling for semi-supervised medical image classification
    Peng, Zhen
    Zhang, Dezhi
    Tian, Shengwei
    Wu, Weidong
    Yu, Long
    Zhou, Shaofeng
    Huang, Shanhang
    MEDICAL PHYSICS, 2023, 50 (05) : 3210 - 3222
  • [34] Uncertainty-Inspired Credible Pseudo-Labeling in Semi-Supervised Medical Image Segmentation
    Zheng, Zhiyu
    Lv, Liang
    Ni, Bo
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XIV, 2025, 15044 : 90 - 104
  • [35] Refined Semi-Supervised Modulation Classification: Integrating Consistency Regularization and Pseudo-Labeling Techniques
    Ma, Min
    Liu, Shanrong
    Wang, Shufei
    Shi, Shengnan
    FUTURE INTERNET, 2024, 16 (02)
  • [36] Spatial pseudo-labeling for semi-supervised facies classification (vol 195, 107834, 2020)
    Asghar, Saleem
    Choi, Junhwan
    Yoon, Daeung
    Byun, Joongmoo
    JOURNAL OF PETROLEUM SCIENCE AND ENGINEERING, 2021, 198
  • [37] JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification
    Zou, Henry Peng
    Caragea, Cornelia
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 7290 - 7301
  • [38] Semi-supervised Two-Stage Abdominal Organ and Tumor Segmentation Model with Pseudo-labeling
    Mao, Li
    FAST, LOW-RESOURCE, AND ACCURATE ORGAN AND PAN-CANCER SEGMENTATION IN ABDOMEN CT, FLARE 2023, 2024, 14544 : 63 - 75
  • [39] P-PseudoLabel: Enhanced Pseudo-Labeling Framework With Network Pruning in Semi-Supervised Learning
    Ham, Gyeongdo
    Cho, Yucheol
    Lee, Jae-Hyeok
    Kim, Daeshik
    IEEE ACCESS, 2022, 10 : 115652 - 115662