Frame-wise dynamic threshold based polyphonic acoustic event detection

Cited by: 8
Authors
Xia, Xianjun [1]
Togneri, Roberto [1]
Sohel, Ferdous [2]
Huang, David [1]
Affiliations
[1] Univ Western Australia, Sch Elect Elect & Comp Engn, Nedlands, WA, Australia
[2] Murdoch Univ, Sch Engn & Informat Technol, Murdoch, WA, Australia
Keywords
acoustic event detection; multi-label classification; dynamic threshold; neural networks
DOI
10.21437/Interspeech.2017-746
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Acoustic event detection, the determination of the acoustic event type and the localisation of the event, is widely used in real-world applications. Many works adopt multi-label classification techniques to perform polyphonic acoustic event detection, using a global threshold to detect the active acoustic events. However, the global threshold has to be set manually and is highly dependent on the database being tested. To address this, we replace the fixed threshold with a frame-wise dynamic threshold. Two novel approaches are proposed: contour-based and regressor-based dynamic thresholds. Experimental results on the popular TUT Acoustic Scenes 2016 database of polyphonic events demonstrate the superior performance of the proposed approaches.
Pages: 474-478
Page count: 5