Augmented Strategy For Polyphonic Sound Event Detection

Cited by: 0
Authors
Wang, Bolun [1]
Fu, Zhong-Hua [1,2]
Wu, Hao [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
[2] Xian IFLYTEK Hyper Brain Informat Technol Co Ltd, Xian, Peoples R China
Keywords
Sound event detection; Data augmentation; Model fusion; ACOUSTIC SCENES; CLASSIFICATION;
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer software];
Subject classification code
081202; 0835;
Abstract
Sound event detection is important for many applications, such as audio content retrieval, intelligent monitoring, and scene-based interaction. Traditional studies on this topic focus mainly on identifying a single sound event class. In real applications, however, several sound events usually occur concurrently and with different durations. This leads to a new detection task: polyphonic sound event classification together with estimation of event time boundaries. In this paper, we propose an augmented strategy for this task, which must cope with a large amount of unbalanced and weakly labelled training data. Specifically, the strategy includes data augmentation that enriches the training set to counter data imbalance, a new loss function that combines cross entropy and F-score, and model fusion that integrates the strengths of different classifiers. The strategy is validated on the DCASE2019 dataset, where both event-based and segment-based detection improve significantly over the baseline system.
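The abstract names a loss that combines cross entropy and F-score, plus model fusion, but this record does not give their exact form. The following is a minimal sketch of one common way to realize both ideas, not the authors' formulation. It assumes frame-level multi-label targets, a differentiable "soft" F1 computed from probabilities, a hypothetical mixing weight `alpha`, and fusion by plain averaging of class probabilities. All function names are illustrative.

```python
import numpy as np


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy over all frames and classes."""
    p = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))


def soft_f1(y_true, y_prob, eps=1e-7):
    """F1 computed from probabilities instead of thresholded decisions."""
    tp = np.sum(y_true * y_prob)            # soft true positives
    fp = np.sum((1.0 - y_true) * y_prob)    # soft false positives
    fn = np.sum(y_true * (1.0 - y_prob))    # soft false negatives
    return float(2.0 * tp / (2.0 * tp + fp + fn + eps))


def combined_loss(y_true, y_prob, alpha=0.5):
    """Weighted sum of cross entropy and (1 - soft F1).

    alpha is an assumed mixing weight, not a value taken from the paper.
    """
    ce = binary_cross_entropy(y_true, y_prob)
    return alpha * ce + (1.0 - alpha) * (1.0 - soft_f1(y_true, y_prob))


def fuse_by_averaging(prob_list):
    """Assumed fusion rule: average the class probabilities of several models."""
    return np.mean(np.stack(prob_list, axis=0), axis=0)


if __name__ == "__main__":
    # Two frames, two event classes (multi-label, so rows need not sum to 1).
    y_true = np.array([[1.0, 0.0], [1.0, 1.0]])
    model_a = np.array([[0.9, 0.2], [0.7, 0.6]])
    model_b = np.array([[0.8, 0.1], [0.9, 0.4]])
    fused = fuse_by_averaging([model_a, model_b])
    print("loss (model A):", combined_loss(y_true, model_a))
    print("loss (fused):  ", combined_loss(y_true, fused))
```

The F-score term rewards overlap with the positive class directly, which is one reason such a term is often paired with cross entropy when active frames are rare. Whether the paper weights or defines the terms this way cannot be determined from the abstract alone.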
Pages: 1496-1500
Page count: 5
Related papers
50 in total
  • [1] Metrics for Polyphonic Sound Event Detection
    Mesaros, Annamaria
    Heittola, Toni
    Virtanen, Tuomas
    APPLIED SCIENCES-BASEL, 2016, 6 (06):
  • [2] Event Specific Attention for Polyphonic Sound Event Detection
    Sundar, Harshavardhan
    Sun, Ming
    Wang, Chao
    INTERSPEECH 2021, 2021, : 566 - 570
  • [3] A Comprehensive Review of Polyphonic Sound Event Detection
    Chan, T. K.
    Chin, Cheng Siong
    IEEE ACCESS, 2020, 8 : 103339 - 103373
  • [4] A Capsule based Approach for Polyphonic Sound Event Detection
    Liu, Yaming
    Tang, Jian
    Song, Yan
    Dai, Lirong
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1853 - 1857
  • [5] A survey of Deep Learning for Polyphonic Sound event detection
    Dang, An
    Vu, Toan H.
    Wang, Jia-Ching
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON ORANGE TECHNOLOGIES (ICOT), 2017, : 75 - 78
  • [6] USING SEQUENTIAL INFORMATION IN POLYPHONIC SOUND EVENT DETECTION
    Huang, Guangpu
    Heittola, Toni
    Virtanen, Tuomas
    2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 291 - 295
  • [7] PEER COLLABORATIVE LEARNING FOR POLYPHONIC SOUND EVENT DETECTION
    Endo, Hayato
    Nishizaki, Hiromitsu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 826 - 830
  • [8] SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
    Nguyen, Thi Ngoc Tho
    Watcharasupat, Karn N.
    Nguyen, Ngoc Khanh
    Jones, Douglas L.
    Gan, Woon-Seng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1749 - 1762
  • [9] POLYPHONIC SOUND EVENT AND SOUND ACTIVITY DETECTION: A MULTI-TASK APPROACH
    Pankajakshan, Arjun
    Bear, Helen L.
    Benetos, Emmanouil
    2019 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2019, : 323 - 327
  • [10] Duration-Controlled LSTM for Polyphonic Sound Event Detection
    Hayashi, Tomoki
    Watanabe, Shinji
    Toda, Tomoki
    Hori, Takaaki
    Le Roux, Jonathan
    Takeda, Kazuya
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2059 - 2070