Comparative Assessment of Data Augmentation for Semi-Supervised Polyphonic Sound Event Detection

Cited by: 5
Authors
Delphin-Poulat, Lionel [1 ]
Nicol, Rozenn [1 ]
Plapous, Cyril [1 ]
Peron, Katell [1 ]
Affiliations
[1] IAM, Orange Labs, Cesson Sevigne, France
Keywords
Convolutional neural networks
DOI
10.23919/fruct49677.2020.9211023
CLC classification
TP3 [computing technology; computer technology]
Subject classification code
0812
Abstract
In the context of audio ambient intelligence systems in Smart Buildings, polyphonic Sound Event Detection aims at detecting, localizing and classifying any sound event recorded in a room. Today, most models are based on Deep Learning and require large databases for training. We propose a CRNN system that exploits unlabeled data through semi-supervised learning based on the "Mean Teacher" method, combined with data augmentation to overcome the limited size of the training dataset and to further improve performance. This model was submitted to the DCASE 2019 challenge and was ranked second out of the 58 submitted systems. In the present study, several conventional data augmentation solutions are compared: time or frequency shifting, and background noise addition. It is shown that data augmentation with time shifting and noise addition, combined with class-dependent median filtering, improves performance by 9%, leading to an event-based F1-score of 43.2% on the DCASE 2019 validation set. However, these tools rely on a coarse modelling (i.e. random variation of the data) of the intra-class variability observed in real life. Injecting acoustic knowledge into the design of augmentation methods seems to be a promising way forward, leading us to propose strategies of physics-inspired modelling for future work.
Pages: 46 - 53
Page count: 8
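
The record above gives no implementation details, so the following minimal Python sketch only illustrates the kind of conventional augmentations the abstract compares (time shifting, frequency shifting, background noise addition), together with class-dependent median filtering as post-processing. The log-mel input assumption, the function names and all parameter values are illustrative and not taken from the paper.

import numpy as np
from scipy.ndimage import median_filter

# Augmentations applied to a log-mel spectrogram of shape (n_frames, n_mels).
# Shift ranges and the SNR value below are illustrative assumptions.

def time_shift(spec, max_shift=16):
    """Circularly shift the spectrogram along the time axis."""
    shift = np.random.randint(-max_shift, max_shift + 1)
    return np.roll(spec, shift, axis=0)

def frequency_shift(spec, max_shift=4):
    """Shift along the mel axis, filling the vacated bins with the floor value."""
    shift = np.random.randint(-max_shift, max_shift + 1)
    shifted = np.roll(spec, shift, axis=1)
    if shift > 0:
        shifted[:, :shift] = spec.min()
    elif shift < 0:
        shifted[:, shift:] = spec.min()
    return shifted

def add_background_noise(spec, snr_db=20.0):
    """Add Gaussian noise at a rough target SNR, computed in the feature domain."""
    signal_power = np.mean(spec ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=spec.shape)
    return spec + noise

def class_dependent_median_filter(frame_probs, win_lengths):
    """Smooth per-class frame probabilities with a class-specific median window,
    e.g. longer windows for sustained events than for short impulsive ones."""
    smoothed = np.empty_like(frame_probs)
    for c, win in enumerate(win_lengths):
        smoothed[:, c] = median_filter(frame_probs[:, c], size=win)
    return smoothed

For instance, an augmented copy of a clip could be produced as add_background_noise(time_shift(spec)) before being fed to the student and teacher branches of a mean-teacher model; the choice of per-class window lengths in class_dependent_median_filter is likewise an assumption, since the paper's actual values are not given in this record.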
相关论文
共 50 条
  • [1] COUPLE LEARNING FOR SEMI-SUPERVISED SOUND EVENT DETECTION
    Tao, Rui
    Yan, Long
    Ouchi, Kazushige
    Wang, Xiangdong
    [J]. INTERSPEECH 2022, 2022, : 2398 - 2402
  • [2] Polyphonic Sound Event Detection Based on Residual Convolutional Recurrent Neural Network With Semi-Supervised Loss Function
    Kim, Nam Kyun
    Kim, Hong Kook
    [J]. IEEE ACCESS, 2021, 9 : 7564 - 7575
  • [3] Semi-Supervised NMF-CNN for Sound Event Detection
    Chan, Teck Kai
    Chin, Cheng Siong
    Li, Ye
    [J]. IEEE ACCESS, 2021, 9 : 130529 - 130542
  • [4] On Local Temporal Embedding for Semi-Supervised Sound Event Detection
    Gao, Lijian
    Mao, Qirong
    Dong, Ming
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1687 - 1698
  • [5] SPARSE SELF-ATTENTION FOR SEMI-SUPERVISED SOUND EVENT DETECTION
    Guan, Yadong
    Xue, Jiabin
    Zheng, Guibin
    Han, Jiqing
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 821 - 825
  • [6] Regression-based Sound Event Detection with Semi-supervised Learning
    Liu, Chia-Chuan
    Chen, Chia-Ping
    Lu, Chung-Li
    Chan, Bo-cheng
    Cheng, Yu-Han
    Chuang, Hsiang-Feng
    Chen, Wei-Yu
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2336 - 2342
  • [7] RCT: Random Consistency Training for Semi-Supervised Sound Event Detection
    Shao, Nian
    Loweimi, Erfan
    Li, Xiaofei
    [J]. INTERSPEECH 2022, 2022, : 1541 - 1545
  • [8] Semi-Supervised Learning with Data Augmentation for Tabular Data
    Fang, Junpeng
    Tang, Caizhi
    Cui, Qing
    Zhu, Feng
    Li, Longfei
    Zhou, Jun
    Zhu, Wei
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 3928 - 3932
  • [9] An Effective Perturbation based Semi-Supervised Learning Method for Sound Event Detection
    Zheng, Xu
    Song, Yan
    Yan, Jie
    Dai, Li-Rong
    McLoughlin, Ian
    Liu, Lin
    [J]. INTERSPEECH 2020, 2020, : 841 - 845
  • [10] GUIDED LEARNING FOR WEAKLY-LABELED SEMI-SUPERVISED SOUND EVENT DETECTION
    Lin, Liwei
    Wang, Xiangdong
    Liu, Hong
    Qian, Yueliang
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 626 - 630