On Learning Disentangled Representation for Acoustic Event Detection

Cited by: 3
Authors
Gao, Lijian [1]
Mao, Qirong [1]
Dong, Ming [2]
Jing, Yu [2]
Chinnam, Ratna [3]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
[3] Wayne State Univ, Dept Ind & Syst Engn, Detroit, MI 48202 USA
Source
PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19) | 2019
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
acoustic event detection; disentangled latent representation; supervised variational autoencoder
DOI
10.1145/3343031.3351086
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Polyphonic Acoustic Event Detection (AED) is challenging because sounds from different events are mixed, and features extracted from the mixture do not match well with features computed from the same sounds in isolation, which leads to suboptimal AED performance. In this paper, we propose a supervised beta-VAE model for AED, which adds a novel event-specific disentangling loss to the objective function of disentangled learning. By incorporating either latent factor blocks or latent attention in disentangling, supervised beta-VAE learns a set of discriminative features for each event. Extensive experiments on benchmark datasets show that our approach outperforms the current state of the art (the top-1 performers in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 AED challenge). Supervised beta-VAE performs well on challenging AED tasks with a large variety of events and imbalanced data.
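The abstract describes the objective only in words, and this record does not give the paper's actual loss. As a rough illustration of the general idea (a beta-VAE objective with an added supervised, per-event term), the sketch below combines a reconstruction error, a beta-weighted KL term, and a binary cross-entropy over per-event logits. The function names, the squared-error reconstruction term, the cross-entropy form of the supervised term, and the weights `beta` and `lam` are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)

def supervised_beta_vae_loss(x, x_recon, mu, logvar,
                             event_logits, event_labels,
                             beta=4.0, lam=1.0):
    # Hypothetical combined objective (an assumption, not the paper's loss):
    #   reconstruction + beta * KL + lam * per-event classification loss.
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    kl = kl_diag_gaussian(mu, logvar)
    # Numerically stable binary cross-entropy on logits, one logit per event.
    bce = (np.maximum(event_logits, 0.0)
           - event_logits * event_labels
           + np.log1p(np.exp(-np.abs(event_logits))))
    supervised = np.sum(bce, axis=-1)
    return float(np.mean(recon + beta * kl + lam * supervised))

# Tiny example: 2 frames, 4 features, 6 latent dims, 3 event classes.
x = np.ones((2, 4))
mu = np.zeros((2, 6))
logvar = np.zeros((2, 6))
logits = np.zeros((2, 3))
labels = np.array([[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]])  # polyphonic: several events may be active
loss = supervised_beta_vae_loss(x, x, mu, logvar, logits, labels)
```

In this toy setting the reconstruction and KL terms vanish, so the loss reduces to the supervised term alone. In a polyphonic setting the labels are multi-hot, which is why the sketch uses an independent sigmoid-style loss per event rather than a softmax.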
Pages: 2006 - 2014
Page count: 9
Related papers
50 in total
  • [1] Reproducibility Companion Paper: On Learning Disentangled Representation for Acoustic Event Detection
    Gao, Lijian
    Mao, Qirong
    Chen, Jingjing
    Dong, Ming
    Chinnam, Ratna
    Sassatelli, Lucile
    Rondon, Miguel Romero
    Sharma, Ujjwal
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 3638 - 3641
  • [2] Rare Event Detection using Disentangled Representation Learning
    Hamaguchi, Ryuhei
    Sakurada, Ken
    Nakamura, Ryosuke
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9319 - 9327
  • [3] Disentangled Representation Learning
    Wang, Xin
    Chen, Hong
    Tang, Si'ao
    Wu, Zihao
    Zhu, Wenwu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 9677 - 9696
  • [4] DISENTANGLED GRAPH REPRESENTATION WITH CONTRASTIVE LEARNING FOR RUMOR DETECTION
    Liu, Haoyu
    Xue, Yuanhai
    Yu, Xiaoming
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6470 - 6474
  • [5] Federated disentangled representation learning for unsupervised brain anomaly detection
    Cosmin I. Bercea
    Benedikt Wiestler
    Daniel Rueckert
    Shadi Albarqouni
    Nature Machine Intelligence, 2022, 4 : 685 - 695
  • [6] Federated disentangled representation learning for unsupervised brain anomaly detection
    Bercea, Cosmin I.
    Wiestler, Benedikt
    Rueckert, Daniel
    Albarqouni, Shadi
    NATURE MACHINE INTELLIGENCE, 2022, 4 (08) : 685 - +
  • [7] A Review of Disentangled Representation Learning
    Wen Z.-D.
    Wang J.-R.
    Wang X.-X.
    Pan Q.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (02) : 351 - 374
  • [8] Disentangled Representation Learning for Multimedia
    Wang, Xin
    Chen, Hong
    Zhu, Wenwu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 9702 - 9704
  • [9] Disentangled Representation Learning for Recommendation
    Wang, Xin
    Chen, Hong
    Zhou, Yuwei
    Ma, Jianxin
    Zhu, Wenwu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) : 408 - 424
  • [10] Temporally Disentangled Representation Learning
    Yao, Weiran
    Chen, Guangyi
    Zhang, Kun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,