Frame-wise dynamic threshold based polyphonic acoustic event detection

Cited by: 8
Authors
Xia, Xianjun [1]
Togneri, Roberto [1]
Sohel, Ferdous [2]
Huang, David [1]
Affiliations
[1] Univ Western Australia, Sch Elect Elect & Comp Engn, Nedlands, WA, Australia
[2] Murdoch Univ, Sch Engn & Informat Technol, Murdoch, WA, Australia
Keywords
acoustic event detection; multi-label classification; dynamic threshold; neural networks
DOI
10.21437/Interspeech.2017-746
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Acoustic event detection, the determination of the acoustic event type and the localisation of the event, is widely used in real-world applications. Many works adopt multi-label classification techniques to perform polyphonic acoustic event detection, using a global threshold to detect the active acoustic events. However, the global threshold has to be set manually and is highly dependent on the database being tested. To deal with this, we replace the fixed threshold method with a frame-wise dynamic threshold approach in this paper. Two novel approaches, namely contour-based and regressor-based dynamic thresholds, are proposed in this work. Experimental results on the popular TUT Acoustic Scenes 2016 database of polyphonic events demonstrate the superior performance of the proposed approaches.
Pages: 474-478
Number of pages: 5
Related papers
50 in total
  • [31] An Investigation of Fundamental Frequency Pattern Prediction for Japanese Electrolaryngeal Speech Enhancement Based on Frame-Wise Phoneme Representations
    Eshghi, Mohammad
    Toda, Tomoki
    IEEE ACCESS, 2024, 12 : 50137 - 50153
  • [32] Bagged Tree Based Frame-Wise Beforehand Prediction Approach for HEVC Intra-Coding Unit Partitioning
    Li, Yixiao
    Li, Lixiang
    Fang, Yuan
    Peng, Haipeng
    Yang, Yixian
    ELECTRONICS, 2020, 9 (09) : 1 - 28
  • [33] ALL FOR ONE: FRAME-WISE RANK LOSS FOR IMPROVING VIDEO-BASED PERSON RE-IDENTIFICATION
    Navaneet, K. L.
    Todi, Vasudha
    Babu, R. Venkatesh
    Chakraborty, Anirban
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019 : 2472 - 2476
  • [34] Automatic Chord Estimation Based on a Frame-wise Convolutional Recurrent Neural Network with Non-Aligned Annotations
    Wu, Yiming
    Carsault, Tristan
    Yoshii, Kazuyoshi
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [35] Multi Frame Size Feature Extraction for Acoustic Event Detection
    Peng, Liqun
    Yang, Deshun
    Chen, Xiaoou
    2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
  • [36] Fully Convolutional Dense Net based polyphonic sound event detection
    Zhe, He
    Ying, Li
    2018 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, BIG DATA AND BLOCKCHAIN (ICCBB 2018), 2018 : 191 - 196
  • [37] Iterative Order Recursive Least Square Estimation for Exploiting Frame-Wise Sparsity in Compressive Sensing-Based MTC
    Abebe, Ameha T.
    Kang, Chung G.
    IEEE COMMUNICATIONS LETTERS, 2016, 20 (05) : 1018 - 1021
  • [38] Robust tracking based on H-CNN with low-resource sampling and scaling by frame-wise motion localization
    Peng Zhang
    Tao Zhuo
    Hanqiao Huang
    Kangli Chen
    Bo Zhang
    Mohan Kankanhalli
    Multimedia Tools and Applications, 2018, 77 : 18781 - 18800
  • [39] Robust tracking based on H-CNN with low-resource sampling and scaling by frame-wise motion localization
    Zhang, Peng
    Zhuo, Tao
    Huang, Hanqiao
    Chen, Kangli
    Zhang, Bo
    Kankanhalli, Mohan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (14) : 18781 - 18800
  • [40] CONFIDENCE BASED ACOUSTIC EVENT DETECTION
    Xia, Xianjun
    Togneri, Roberto
    Sohel, Ferdous
    Huang, David
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 306 - 310