Instance-level loss based multiple-instance learning framework for acoustic scene classification

Cited by: 3
Authors
Choi, Won-Gook [1]
Chang, Joon-Hyuk [1]
Yang, Jae-Mo [2]
Moon, Han-Gil [2]
Affiliations
[1] Hanyang Univ, Dept Elect Engn, Seoul, South Korea
[2] Samsung Elect, Pyeongtaek 16677, Gyeonggi Do, South Korea
Keywords
Acoustic scene classification; Multiple-instance learning; Weakly supervised learning
DOI
10.1016/j.apacoust.2023.109757
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
An acoustic scene is inferred by detecting properties that combine diverse sounds and acoustic environments. This study aims to discover these properties effectively using multiple-instance learning (MIL). MIL, a weakly supervised learning approach, extracts an instance vector from each audio chunk of an audio clip and uses these unlabeled instances to infer the scene corresponding to the input data. However, many studies have pointed out an underestimation problem in MIL. In this study, we propose an enhanced MIL framework better suited to acoustic scene classification (ASC) systems, defining instance-level labels and an instance-level loss to extract and cluster instances effectively. Furthermore, we design a lightweight convolutional neural network named FUSE, comprising frequency-sided depthwise, temporal-sided depthwise, and pointwise convolutional filters. Experimental results show that the confidence and proportion of positive instances increase significantly compared to vanilla MIL, overcoming the underestimation problem and raising classification accuracy above that of supervised learning. The proposed system achieved a performance of 81.1%, 72.3%, and 58.3% on the TAU urban acoustic scenes 2019, 2020 mobile, and 2022 mobile datasets, respectively, with 139 K parameters. In particular, it achieves the highest performance among systems with fewer than 1 M parameters on the TAU urban acoustic scenes 2019 dataset.
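
To make the abstract's terminology concrete, the following is a minimal sketch of how a bag-level (clip) loss and an instance-level (chunk) loss might be computed side by side. It is not the authors' method: the mean pooling, the rule that every instance inherits the clip label, and all function names (`mil_losses`, `positive_ratio`) are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the given class index."""
    return -math.log(max(probs[label], 1e-12))


def mil_losses(instance_logits, bag_label):
    """Return (bag_probs, bag_loss, instance_loss) for one audio clip.

    instance_logits: one list of class logits per audio chunk (instance).
    bag_label: integer scene label of the whole clip (the bag).
    """
    instance_probs = [softmax(l) for l in instance_logits]
    n = len(instance_probs)
    num_classes = len(instance_probs[0])
    # Assumption: the bag prediction is the mean of instance probabilities.
    bag_probs = [sum(p[c] for p in instance_probs) / n for c in range(num_classes)]
    bag_loss = cross_entropy(bag_probs, bag_label)
    # Assumption: each instance inherits the clip label as its own label.
    instance_loss = sum(cross_entropy(p, bag_label) for p in instance_probs) / n
    return bag_probs, bag_loss, instance_loss


def positive_ratio(instance_logits, bag_label):
    """Fraction of instances whose top-scoring class equals the clip label."""
    hits = sum(1 for l in instance_logits if l.index(max(l)) == bag_label)
    return hits / len(instance_logits)


# Three chunks, three scene classes; the clip is labeled as class 0.
logits = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [3.0, 1.0, 0.0]]
bag_probs, bag_loss, inst_loss = mil_losses(logits, bag_label=0)
ratio = positive_ratio(logits, bag_label=0)
```

In this toy clip the second chunk votes for a different class, so the proportion of positive instances is two thirds. Under these assumptions the instance-level loss is never smaller than the bag-level loss (by Jensen's inequality), so it penalizes chunks that a bag-level loss alone would let through.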
Pages: 13