Semi-supervised classification-aware cross-modal deep adversarial data augmentation

Cited by: 8

Authors:

Wang, Shaoqiang [1]

Wu, Zhenzhen [2]

He, Gewen [3]

Wang, Shudong [1]

Sun, Hongwei [2]

Fan, Fangfang [4]

Affiliations:

[1] China Univ Petr, Sch Comp & Commun Engn, Qingdao 266000, Peoples R China

[2] Weifang Univ Sci & Technol, Shandong Prov Univ Lab Protected Hort, Weifang 262700, Peoples R China

[3] Florida State Univ, Dept Comp Sci, Tallahassee, FL 32306 USA

[4] Harvard Univ, Harvard Med Sch, Cambridge, MA 02215 USA

Source:

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2021 / Vol. 125

Keywords:

Adversarial network; Data augmentation; Density estimation; Graph representation; Semi-supervised learning

DOI:

10.1016/j.future.2021.05.029

Chinese Library Classification (CLC) number:

TP301 [Theory, methods]

Discipline classification code:

081202

Abstract:

Deep neural networks are usually data-starved in real-world applications, and manual annotation can be costly; audio emotion recognition is one example. In contrast, continued research in image-based facial expression recognition (IFER) has produced a rich supply of publicly available labeled IFER datasets. Using images to support audio emotion recognition with limited labeled data, based on the inherent correlations between the two modalities, is a meaningful and challenging task. This paper proposes a system that facilitates knowledge transfer from the labeled visual domain to the heterogeneous labeled audio domain by learning a joint distribution of examples in different modalities, so that the system can map an IFER example to a corresponding audio spectrogram. Next, our work reformulates audio emotion classification as the K+1-class discriminator of GAN-based semi-supervised learning. Good semi-supervised learning requires that the generator does not sample from a distribution that closely matches the true data distribution. Therefore, we require the generated examples to come from the low-density areas of the marginal distribution in the audio spectrogram modality. Concretely, the proposed model translates image samples to audio class-wise, in the form of spectrograms. To harness the decoded samples in sparsely populated areas and construct a tighter decision boundary, we give a solution that precisely estimates the density in feature space and incorporates low-density samples with an annealing scheme. Our method requires the network to discriminate low-density data points from high-density data points throughout classification, and we show that this technique effectively improves task performance. © 2021 Published by Elsevier B.V.
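For readers unfamiliar with the K+1-class discriminator mentioned in the abstract, the following is a minimal NumPy sketch of the commonly used GAN-based semi-supervised objective, in which the extra class K+1 stands for "generated". It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the three loss terms are summed with equal weight, and the density estimation and annealing scheme described in the abstract are not modeled.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def k_plus_one_discriminator_loss(labeled_logits, labels,
                                  unlabeled_logits, generated_logits):
    """Hypothetical helper: sum of the three discriminator loss terms.

    Every logits array has shape (batch, K + 1); column K is the extra
    "generated" class, columns 0..K-1 are the real emotion classes.
    """
    k = labeled_logits.shape[1] - 1

    # Labeled real examples: cross-entropy restricted to the K real classes.
    log_p_real = log_softmax(labeled_logits[:, :k])
    supervised = -log_p_real[np.arange(len(labels)), labels].mean()

    # Unlabeled real examples: maximize log(1 - p(generated | x)),
    # i.e. the log of the total probability of the K real classes.
    log_p_unlabeled = log_softmax(unlabeled_logits)
    log_real_mass = np.log(np.exp(log_p_unlabeled[:, :k]).sum(axis=1))
    unlabeled = -log_real_mass.mean()

    # Generated examples: maximize log p(generated | x).
    generated = -log_softmax(generated_logits)[:, k].mean()

    return float(supervised + unlabeled + generated)
```

With all-zero logits and K = 2, each of the three classes has probability 1/3, so the three terms are log 2, log(3/2) and log 3, and the total is log 9. In a setup like the one the abstract describes, the generated batch would hold the decoded spectrogram samples drawn from low-density regions.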


Pages: 194-205

Number of pages: 12

50 records in total

[41] Quantum semi-supervised generative adversarial network for enhanced data classification
Nakaji, Kouhei
Yamamoto, Naoki
SCIENTIFIC REPORTS, 2021, 11 (01)
[42] Quantum semi-supervised generative adversarial network for enhanced data classification
Kouhei Nakaji
Naoki Yamamoto
Scientific Reports, 11
[43] Modulation classification with data augmentation based on a semi-supervised generative model
Yin, Liyan
Xiang, Xin
Liang, Yuan
Liu, Kun
WIRELESS NETWORKS, 2024, 30 (06) : 5683 - 5696
[44] Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval
Kang, Peipei
Lin, Zehang
Yang, Zhenguo
Fang, Xiaozhao
Bronstein, Alexander M.
Li, Qing
Liu, Wenyin
APPLIED INTELLIGENCE, 2022, 52 (01) : 33 - 54
[45] Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval
Peipei Kang
Zehang Lin
Zhenguo Yang
Xiaozhao Fang
Alexander M. Bronstein
Qing Li
Wenyin Liu
Applied Intelligence, 2022, 52 : 33 - 54
[46] Semi-Supervised Encrypted Traffic Classification With Deep Convolutional Generative Adversarial Networks
Iliyasu, Auwal Sani
Deng, Huifang
IEEE ACCESS, 2020, 8 : 118 - 126
[47] S3ACH: Semi-Supervised Semantic Adaptive Cross-Modal Hashing
Yang, Liu
Zhang, Kaiting
Li, Yinan
Chen, Yunfei
Long, Jun
Yang, Zhan
NEURAL INFORMATION PROCESSING, ICONIP 2023, PT IV, 2024, 14450 : 252 - 269
[48] Heterogeneous Anomaly Detection for Software Systems via Semi-supervised Cross-modal Attention
Lee, Cheryl
Yang, Tianyi
Chen, Zhuangbin
Su, Yuxin
Yang, Yongqiang
Lyu, Michael R.
2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 1724 - 1736
[49] Semi-supervised cross-modal retrieval with graph-based semantic alignment network
Zhang, Lei
Chen, Leiting
Ou, Weihua
Zhou, Chuan
COMPUTERS & ELECTRICAL ENGINEERING, 2022, 102
[50] Clustering-Based Semi-Supervised Cross-Modal Retrieval Using Scene Graph
Kong, Yixue
Feng, Yong
Zhou, Mingliang
Xiong, Xiancai
Wang, Yongheng
Qiang, Baohua
JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2022, 31 (12) : 1299 - 1314
