On Learning Disentangled Representation for Acoustic Event Detection

Cited by: 3

Authors:

Gao, Lijian [1]

Mao, Qirong [1]

Dong, Ming [2]

Jing, Yu [2]

Chinnam, Ratna [3]

Affiliations:

[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Jiangsu, Peoples R China

[2] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA

[3] Wayne State Univ, Dept Ind & Syst Engn, Detroit, MI 48202 USA

Source:

PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19) | 2019

Funding:

National Science Foundation (USA); National Natural Science Foundation of China;

Keywords:

acoustic event detection; disentangled latent representation; supervised variational autoencoder;

DOI:

10.1145/3343031.3351086

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Polyphonic Acoustic Event Detection (AED) is a challenging task because the recorded sound is a mixture of signals from different events, and features extracted from the mixture do not match well with features calculated from isolated sounds, leading to suboptimal AED performance. In this paper, we propose a supervised beta-VAE model for AED, which adds a novel event-specific disentangling loss to the objective function of disentangled learning. By incorporating either latent factor blocks or latent attention in disentangling, supervised beta-VAE learns a set of discriminative features for each event. Extensive experiments on benchmark datasets show that our approach outperforms the current state of the art (the top-1 performers in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 AED challenge). Supervised beta-VAE is particularly successful in challenging AED tasks with a large variety of events and imbalanced data.
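
The abstract does not state the objective function itself. As a rough illustration only, the sketch below combines the generic beta-VAE objective (reconstruction error plus a beta-weighted KL term) with a per-event binary cross-entropy term. That supervised term is an assumed stand-in for the paper's event-specific disentangling loss, not the authors' formulation, and every name and default value here (`supervised_beta_vae_loss`, `beta`, `lam`) is hypothetical.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def squared_error(x, x_hat):
    """Sum of squared differences between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def binary_cross_entropy(labels, probs, eps=1e-7):
    """Multi-label BCE: one independent 0/1 target per event class."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def supervised_beta_vae_loss(x, x_hat, mu, log_var, labels, probs,
                             beta=4.0, lam=1.0):
    """Assumed form: reconstruction + beta * KL + lam * per-event BCE.

    beta and lam are hypothetical weights; the paper's actual loss and
    hyperparameters are not given in this record.
    """
    recon = squared_error(x, x_hat)
    kl = kl_to_standard_normal(mu, log_var)
    event = binary_cross_entropy(labels, probs)
    return recon + beta * kl + lam * event
```

In a polyphonic setting several entries of `labels` can be 1 at once, which is why this sketch uses an independent binary term per event rather than a single softmax over classes.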

Citation

Pages: 2006-2014

Page count: 9

