Instance-level loss based multiple-instance learning framework for acoustic scene classification

Cited by: 3
Authors
Choi, Won-Gook [1]
Chang, Joon-Hyuk [1]
Yang, Jae-Mo [2]
Moon, Han-Gil [2]
Affiliations
[1] Hanyang Univ, Dept Elect Engn, Seoul, South Korea
[2] Samsung Elect, Pyeongtaek 16677, Gyeonggi Do, South Korea
Keywords
Acoustic scene classification; Multiple-instance learning; Weakly supervised learning
DOI
10.1016/j.apacoust.2023.109757
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
An acoustic scene is inferred by detecting properties that combine diverse sounds and acoustic environments. This study aims to discover these properties effectively using multiple-instance learning (MIL). MIL, a weakly supervised learning approach, extracts an instance vector from each audio chunk that makes up an audio clip and uses these unlabeled instances to infer the scene corresponding to the input data. However, many studies have pointed out an underestimation problem in MIL. In this study, we propose an enhanced MIL framework better suited to acoustic scene classification (ASC) systems, defining instance-level labels and an instance-level loss so that instances are extracted and clustered effectively. Furthermore, we design a lightweight convolutional neural network named FUSE comprising frequency-sided depthwise, temporal-sided depthwise, and pointwise convolutional filters. Experimental results show that the confidence and proportion of positive instances increase significantly compared with vanilla MIL, overcoming the underestimation problem and improving the classification accuracy beyond that of supervised learning. With 139 K parameters, the proposed system achieved a performance of 81.1%, 72.3%, and 58.3% on the TAU urban acoustic scenes 2019, 2020 mobile, and 2022 mobile datasets, respectively. In particular, it achieves the highest performance among systems with fewer than 1 M parameters on the TAU urban acoustic scenes 2019 dataset.
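The MIL idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the mean pooling, the rule that every instance inherits the clip label, and all function names and logit values below are assumptions chosen for illustration. The sketch shows how a clip can be classified correctly at the bag level while only a small proportion of its instances are positive, and how an instance-level loss penalizes that situation more strongly than a bag-level loss.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one instance's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bag_probs(instance_logits):
    """Mean-pool per-instance class probabilities into a clip-level prediction.
    Mean pooling is an assumption; other MIL pooling operators exist."""
    probs = [softmax(l) for l in instance_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]

def bag_loss(instance_logits, label):
    """Cross-entropy on the pooled (bag-level) prediction only."""
    return -math.log(bag_probs(instance_logits)[label])

def instance_loss(instance_logits, label):
    """Mean cross-entropy per instance.
    Assumption: each instance inherits the clip label as its own label."""
    losses = [-math.log(softmax(l)[label]) for l in instance_logits]
    return sum(losses) / len(losses)

def positive_proportion(instance_logits, label):
    """Fraction of instances whose top-scoring class equals the clip label."""
    hits = sum(
        1 for l in instance_logits
        if max(range(len(l)), key=l.__getitem__) == label
    )
    return hits / len(instance_logits)

# Hypothetical clip: 4 audio chunks (instances), 3 scene classes, true scene = 0.
clip_logits = [
    [4.0, 0.0, 0.0],  # one very confident positive instance
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.6],
]
label = 0

pooled = bag_probs(clip_logits)
predicted = max(range(len(pooled)), key=pooled.__getitem__)
print("bag prediction:", predicted)
print("positive proportion:", positive_proportion(clip_logits, label))
print("bag loss:", round(bag_loss(clip_logits, label), 3))
print("instance loss:", round(instance_loss(clip_logits, label), 3))
```

In this toy clip a single confident instance carries the bag prediction, so a loss defined only on the pooled output gives the other instances little pressure to become positive. The per-instance loss is never smaller than the bag loss under mean pooling (by Jensen's inequality), which is one way to see why an instance-level term can push up the proportion of positive instances.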
Pages: 13