Frame-wise dynamic threshold based polyphonic acoustic event detection

Cited by: 8

Authors:

Xia, Xianjun [1]

Togneri, Roberto [1]

Sohel, Ferdous [2]

Huang, David [1]

Affiliations:

[1] Univ Western Australia, Sch Elect Elect & Comp Engn, Nedlands, WA, Australia

[2] Murdoch Univ, Sch Engn & Informat Technol, Murdoch, WA, Australia

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

acoustic event detection; multi-label classification; dynamic threshold; neural networks

DOI:

10.21437/Interspeech.2017-746

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Acoustic event detection, the determination of the acoustic event type and the localisation of the event, is widely used in real-world applications. Many works perform polyphonic acoustic event detection with multi-label classification techniques, applying a global threshold to detect the active acoustic events. However, the global threshold has to be set manually and is highly dependent on the database being tested. To deal with this, we replace the fixed threshold with a frame-wise dynamic threshold in this paper. Two novel approaches, namely contour-based and regressor-based dynamic thresholds, are proposed in this work. Experimental results on the popular TUT Acoustic Scenes 2016 database of polyphonic events demonstrate the superior performance of the proposed approaches.
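To make the contrast in the abstract concrete, the following is a minimal sketch of a single global threshold versus a threshold recomputed for every frame. It is not the paper's contour or regressor method, whose details this record does not give. The function names, the `ratio` parameter, the rule "threshold = ratio times the largest posterior in the frame", and the example posteriors are all assumptions chosen purely for illustration.

```python
from typing import List

Frames = List[List[float]]  # posteriors[t][k]: score of event class k in frame t


def global_threshold(posteriors: Frames, threshold: float = 0.5) -> List[List[int]]:
    """Mark a class active when its posterior meets one fixed, manually set threshold."""
    return [[int(p >= threshold) for p in frame] for frame in posteriors]


def framewise_threshold(posteriors: Frames, ratio: float = 0.6) -> List[List[int]]:
    """Hypothetical frame-wise rule (an assumption, not the paper's method):
    each frame gets its own threshold, ratio * (largest posterior in that frame)."""
    decisions = []
    for frame in posteriors:
        frame_threshold = ratio * max(frame)
        decisions.append([int(p >= frame_threshold) for p in frame])
    return decisions


# Made-up posteriors for three frames and three event classes.
posteriors = [
    [0.90, 0.70, 0.10],  # loud frame: two events clearly present
    [0.40, 0.35, 0.05],  # quieter frame: same two events, lower scores
    [0.20, 0.02, 0.01],  # faint frame: one weak event
]

print(global_threshold(posteriors))     # [[1, 1, 0], [0, 0, 0], [0, 0, 0]]
print(framewise_threshold(posteriors))  # [[1, 1, 0], [1, 1, 0], [1, 0, 0]]
```

In this toy input the fixed threshold misses the events in the quieter frames, while the per-frame threshold adapts to the overall score level. The illustrative rule also has an obvious weakness: it always marks at least one class active, even in a silent frame, which is one reason a learned or contour-derived threshold may be preferable in practice.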


Pages: 474-478

Page count: 5

