Frame-wise dynamic threshold based polyphonic acoustic event detection

Cited by: 8
Authors
Xia, Xianjun [1]
Togneri, Roberto [1]
Sohel, Ferdous [2]
Huang, David [1]
Affiliations
[1] Univ Western Australia, Sch Elect Elect & Comp Engn, Nedlands, WA, Australia
[2] Murdoch Univ, Sch Engn & Informat Technol, Murdoch, WA, Australia
Keywords
acoustic event detection; multi-label classification; dynamic threshold; neural networks
DOI
10.21437/Interspeech.2017-746
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Acoustic event detection, the determination of the acoustic event type and the localisation of the event, has been widely applied in real-world applications. Many works adopt multi-label classification techniques to perform polyphonic acoustic event detection, using a global threshold to detect the active acoustic events. However, the global threshold has to be set manually and is highly dependent on the database being tested. To deal with this, we replace the fixed threshold method with a frame-wise dynamic threshold approach in this paper. Two novel approaches, namely contour-based and regressor-based dynamic threshold approaches, are proposed in this work. Experimental results on the popular TUT Acoustic Scenes 2016 database of polyphonic events demonstrate the superior performance of the proposed approaches.
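The contrast the abstract draws between a single global threshold and a threshold chosen per frame can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not describe how the contour-based or regressor-based thresholds are computed, so the per-frame rule used here (a fraction of the frame's largest posterior, with a lower floor) is a hypothetical stand-in, and the function names, parameters, and posterior values are all invented for the example.

```python
# Illustrative only: the frame-wise rule below is a hypothetical stand-in,
# NOT the contour-based or regressor-based method proposed in the paper.

def detect_global(posteriors, threshold=0.5):
    """Mark an event active when its posterior exceeds one fixed threshold."""
    return [[p > threshold for p in frame] for frame in posteriors]


def detect_framewise(posteriors, ratio=0.6, floor=0.15):
    """Mark an event active when its posterior exceeds a per-frame threshold.

    Assumed rule: threshold_t = max(floor, ratio * largest posterior in frame t).
    The floor keeps frames with uniformly low posteriors from firing.
    """
    decisions = []
    for frame in posteriors:
        threshold_t = max(floor, ratio * max(frame))
        decisions.append([p > threshold_t for p in frame])
    return decisions


# Rows are frames, columns are event classes (made-up multi-label outputs).
posteriors = [
    [0.90, 0.70, 0.10],  # two confident, overlapping events
    [0.40, 0.35, 0.05],  # the same two events, scored lower
    [0.10, 0.05, 0.02],  # no event
]

global_out = detect_global(posteriors)
framewise_out = detect_framewise(posteriors)
```

In this toy input the fixed threshold of 0.5 misses both events in the second frame, while the per-frame rule adapts to the lower overall score level. How a threshold should actually be derived for each frame is exactly what the paper's two approaches address.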
Pages: 474-478
Number of pages: 5
Related papers
50 in total
  • [21] A Capsule based Approach for Polyphonic Sound Event Detection
    Liu, Yaming
    Tang, Jian
    Song, Yan
    Dai, Lirong
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1853 - 1857
  • [22] General Frame-Wise Steganalysis of Compressed Speech Based on Dual-Domain Representation and Intra-Frame Correlation Leaching
    Li, Songbin
    Wang, Jingang
    Liu, Peng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2025 - 2035
  • [23] Frame-wise Online Unsupervised Adaptation of DNN-HMM Acoustic Model from Perspective of Robust Adaptive Filtering
    Takeda, Ryu
    Komatani, Kazunori
    INTERSPEECH 2020, 2020, : 1291 - 1295
  • [24] Obscured Wildfire Flame Detection by Spatio-temporal Analysis of Smoke Patterns Using Frame-wise Transformers
    Meleti, Uma
    Razi, Abolfazl
    Afghah, Fatemeh
    2024 20TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING IN SMART SYSTEMS AND THE INTERNET OF THINGS, DCOSS-IOT 2024, 2024, : 58 - 65
  • [25] Robust polyphonic sound event detection by using multi frame size denoising autoencoder
    Zhou, Jianchao
    Chen, Xiaoou
    Yang, Deshun
    2018 IEEE 20TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2018,
  • [26] Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection
    Fujimoto, Masakiyo
    Watanabe, Shinji
    Nakatani, Tomohiro
    SPEECH COMMUNICATION, 2012, 54 (02) : 229 - 244
  • [27] Frame-wise detection of relocated I-frames in double compressed H.264 videos based on convolutional neural network
    He, Peisong
    Jiang, Xinghao
    Sun, Tanfeng
    Wang, Shilin
    Li, Bin
    Dong, Yi
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2017, 48 : 149 - 158
  • [28] Frame-Wise Detection of Double HEVC Compression by Learning Deep Spatio-Temporal Representations in Compression Domain
    He, Peisong
    Li, Haoliang
    Wang, Hongxia
    Wang, Shiqi
    Jiang, Xinghao
    Zhang, Ruimei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 3179 - 3192
  • [29] No Need For Frame-wise Alignment Of CT And Dynamic Rb-82 PET Data For Myocardial Blood Flow Quantification
    van Dijk, J. D.
    Lau, A.
    Jager, P. L.
    Ottervanger, J.
    Slump, C. H.
    van Dalen, A.
    EUROPEAN JOURNAL OF NUCLEAR MEDICINE AND MOLECULAR IMAGING, 2018, 45 : S91 - S91
  • [30] IMPULSIVE TIMING DETECTION BASED ON MULTI-FRAME PHASE VOTING FOR ACOUSTIC EVENT DETECTION
    Mishima, Sakiko
    Kondo, Reishi
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 956 - 960