ACOUSTIC SCENE CLASSIFICATION WITH MISMATCHED RECORDING DEVICES USING MIXTURE OF EXPERTS LAYER

Cited by: 11
Authors
Nguyen, Truc [1]
Pernkopf, Franz [1]
Affiliations
[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, Inffeldgasse 16c, A-8010 Graz, Austria
Source
2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) | 2019
Funding
Austrian Science Fund;
Keywords
Acoustic scene classification; convolutional neural network; mixture of experts layer; mixture of softmaxes;
DOI
10.1109/ICME.2019.00287
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Recently, mismatches in acoustic conditions between the development and evaluation data, such as a temporal recording gap or different recording devices, have been considered in Acoustic Scene Classification (ASC). This brings ASC closer to real-world conditions. In this paper, we address ASC with mismatched recording devices, which was introduced as task 1B of the DCASE 2018 challenge. We propose a flexible and robust model in which a mixture of experts (MoE) layer replaces the fully connected dense layer, so that each expert can adapt to a specific domain of the data. Furthermore, we evaluate different Convolutional Neural Network (CNN) models, as well as the number of experts in the MoE dense layer, using log-mel features. In addition, we apply mixup data augmentation to enhance the robustness of our models. In experiments, the classification accuracy is 66.1% using 15 experts in the MoE dense layer with approximately 2M parameters. This outperforms the best model of task 1B of the DCASE 2018 challenge by 2.5% (absolute); that model uses an ensemble selection of 12 individual models with approximately 12M parameters.
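The abstract names two techniques, an MoE layer in place of the dense output layer and mixup augmentation, without giving their form. The sketch below is one plausible reading, not the authors' implementation: it assumes a softmax gate that mixes per-expert softmax outputs (suggested by the keyword "mixture of softmaxes") and the standard mixup rule with a Beta-distributed mixing weight. The feature size, the random initialisation, and all function names are hypothetical; only the 15 experts come from the abstract, and the 10 classes correspond to the DCASE 2018 Task 1 scene labels.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Affine map; `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def moe_softmax_layer(x, gate, experts):
    """Assumed MoE output layer: gate-weighted sum of expert softmaxes."""
    gate_w, gate_b = gate
    gates = softmax(dense(x, gate_w, gate_b))  # one mixing weight per expert
    n_classes = len(experts[0][1])
    mixed = [0.0] * n_classes
    for g, (w, b) in zip(gates, experts):
        expert_probs = softmax(dense(x, w, b))
        for c, p in enumerate(expert_probs):
            mixed[c] += g * p
    return mixed


def mixup(x_a, y_a, x_b, y_b, alpha, rng):
    """Standard mixup: convex combination of two examples and their labels."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def random_dense(n_out, n_in, rng):
    """Hypothetical initialiser: Gaussian weights, zero bias."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
N_FEATURES, N_EXPERTS, N_CLASSES = 8, 15, 10  # feature size is illustrative

gate = random_dense(N_EXPERTS, N_FEATURES, rng)
experts = [random_dense(N_CLASSES, N_FEATURES, rng) for _ in range(N_EXPERTS)]

# Stand-in for a CNN embedding of one log-mel segment.
x = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
probs = moe_softmax_layer(x, gate, experts)

# Mix two one-hot labelled examples.
y_a = [1.0] + [0.0] * (N_CLASSES - 1)
y_b = [0.0] * (N_CLASSES - 1) + [1.0]
x_b = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
x_mix, y_mix = mixup(x, y_a, x_b, y_b, alpha=0.2, rng=rng)
```

Because the gate weights and each expert's softmax are both probability distributions, the mixed output is itself a valid distribution over the scene classes, so it can be trained with the usual cross-entropy loss against the soft labels that mixup produces.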
Pages: 1666 - 1671
Number of pages: 6
Related papers
50 in total
  • [1] Spectrum Correction: Acoustic Scene Classification with Mismatched Recording Devices
    Kosmider, Michal
    INTERSPEECH 2020, 2020, : 4641 - 4645
  • [2] ACOUSTIC SCENE CLASSIFICATION FOR MISMATCHED RECORDING DEVICES USING HEATED-UP SOFTMAX AND SPECTRUM CORRECTION
    Nguyen, Truc
    Pernkopf, Franz
    Kosmider, Michal
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 126 - 130
  • [3] Wider or Deeper Neural Network Architecture for Acoustic Scene Classification with Mismatched Recording Devices
    Lam Pham
    Khoa Tran
    Dat Ngo
    Hieu Tang
    Son Phan
    Schindler, Alexander
    PROCEEDINGS OF THE 4TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA IN ASIA, MMASIA 2022, 2022,
  • [4] Acoustic Scene Classification with Mismatched Devices Using CliqueNets and Mixup Data Augmentation
    Nguyen, Truc
    Pernkopf, Franz
    INTERSPEECH 2019, 2019, : 2330 - 2334
  • [5] Improving Multimodal Movie Scene Segmentation Using Mixture of Acoustic Experts
    Lin, Meng-Han
    Li, Jeng-Lin
    Lee, Chi-Chun
    2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 6 - 10
  • [6] Adversarial Domain Adaptation with Paired Examples for Acoustic Scene Classification on Different Recording Devices
    Kacprzak, Stanislaw
    Kowalczyk, Konrad
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1030 - 1034
  • [7] Domain Adaptation Neural Network for Acoustic Scene Classification in Mismatched Conditions
    Wang, Rui
    Wang, Mou
    Zhang, Xiao-Lei
    Rahardja, Susanto
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1501 - 1505
  • [8] Mixture of experts classification using a hierarchical mixture model
    Titsias, MK
    Likas, A
    NEURAL COMPUTATION, 2002, 14 (09) : 2221 - 2244
  • [9] Acoustic Scene Classification Using Spectrograms
    Felipe, Gustavo Zanoni
    da Costa, Yandre Maldonado e Gomes
    Helal, Lucas Georges
    2017 36TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC), 2017,
  • [10] Feature Alignment for Robust Acoustic Scene Classification Across Devices
    Zhao, Jingqiao
    Kong, Qiuqiang
    Song, Xiaoning
    Feng, Zhenhua
    Wu, Xiaojun
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 578 - 582