RCT: Random Consistency Training for Semi-Supervised Sound Event Detection

Cited by: 1
Authors
Shao, Nian [1, 2]
Loweimi, Erfan [3]
Li, Xiaofei [1, 2]
Affiliations
[1] Westlake Univ, Hangzhou, Peoples R China
[2] Westlake Inst Adv Study, Hangzhou, Peoples R China
[3] Univ Edinburgh, CSTR, Edinburgh, Midlothian, Scotland
Source
Keywords
semi-supervised learning; sound event detection; data augmentation; consistency regularization; hard mixup;
DOI
10.21437/Interspeech.2022-10037
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Sound event detection (SED), a core module of acoustic environmental analysis, suffers from data deficiency. Integrating semi-supervised learning (SSL) largely mitigates this problem. This paper studies several core modules of SSL and introduces a random consistency training (RCT) strategy. First, a hard mixup data augmentation is proposed to account for the additive property of sounds. Second, a random augmentation scheme is applied to stochastically combine different types of data augmentation methods with high flexibility. Third, a self-consistency loss is proposed and fused with the teacher-student model to stabilize training. In terms of performance, the proposed modules outperform their respective competitors, and as a whole the proposed SED strategies achieve 44.0% and 67.1% on the PSDS1 and PSDS2 metrics proposed by the DCASE challenge, notably outperforming other widely used alternatives.
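The abstract gives no formulas, so the following is only a minimal sketch of how the first two ideas might look, under stated assumptions. It assumes that "hard" mixup means summing two clips sample by sample with no interpolation weight and marking a class as active if it is active in either clip, and that random augmentation means drawing a few operations from a pool and applying them in sequence. The names `hard_mixup` and `random_augment`, and the toy operations in the usage lines, are hypothetical and are not taken from the paper.

```python
import random


def hard_mixup(x_a, y_a, x_b, y_b):
    """Mix two clips by plain addition and merge their multi-hot labels.

    Assumption (not stated in the abstract): no mixing weight is used,
    and a class is active in the mix if it is active in either clip.
    """
    if len(x_a) != len(x_b) or len(y_a) != len(y_b):
        raise ValueError("clips and label vectors must have matching lengths")
    mixed = [a + b for a, b in zip(x_a, x_b)]      # sounds add up
    labels = [max(a, b) for a, b in zip(y_a, y_b)]  # union of active classes
    return mixed, labels


def random_augment(x, ops, n_ops, rng):
    """Apply n_ops distinct operations drawn at random from the pool ops.

    Assumption: each operation maps a clip (list of samples) to a clip.
    """
    for op in rng.sample(ops, n_ops):
        x = op(x)
    return x


# Toy usage with integer "samples" and three event classes.
mixed, labels = hard_mixup([1, -2, 3], [1, 0, 0], [2, 2, -3], [0, 0, 1])

# Hypothetical augmentation pool: a gain change and a time reversal.
pool = [lambda clip: [2 * s for s in clip], lambda clip: clip[::-1]]
augmented = random_augment([1, 2, 3], pool, n_ops=2, rng=random.Random(0))
```

Taking the element-wise maximum of the labels keeps them multi-hot, which matches the additive view of overlapping sound events; a weighted interpolation of labels would be the usual alternative.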
Pages: 1541-1545
Number of pages: 5