Semi-supervised classification-aware cross-modal deep adversarial data augmentation

Cited by: 8
Authors
Wang, Shaoqiang [1]
Wu, Zhenzhen [2]
He, Gewen [3]
Wang, Shudong [1]
Sun, Hongwei [2]
Fan, Fangfang [4]
Affiliations
[1] China Univ Petr, Sch Comp & Commun Engn, Qingdao 266000, Peoples R China
[2] Weifang Univ Sci & Technol, Shandong Prov Univ Lab Protected Hort, Weifang 262700, Peoples R China
[3] Florida State Univ, Dept Comp Sci, Tallahassee, FL 32306 USA
[4] Harvard Univ, Harvard Med Sch, Cambridge, MA 02215 USA
Keywords
Adversarial network; Data augmentation; Density estimation; Graph representation; Semi-supervised learning
DOI
10.1016/j.future.2021.05.029
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Deep neural networks are usually data-starved in real-world applications, and manual annotation can be costly, for example when recognizing emotion from audio. In contrast, sustained research in image-based facial expression recognition (IFER) has produced a rich supply of publicly available labeled IFER datasets. Using images to support audio emotion recognition with limited labeled data, by exploiting the inherent correlation between the two modalities, is a meaningful and challenging task. This paper proposes a system that facilitates knowledge transfer from the labeled visual domain to the heterogeneous labeled audio domain by learning a joint distribution of examples in different modalities, so that the system can map an IFER example to a corresponding audio spectrogram. Next, our work reformulates audio emotion classification as the K+1 class discriminator of GAN-based semi-supervised learning. Good semi-supervised learning requires that the generator does NOT sample from a distribution that closely matches the true data distribution. Therefore, we require that the generated examples come from the low-density areas of the marginal distribution in the audio spectrogram modality. Concretely, the proposed model translates image samples to audio class-wise, in the form of spectrograms. To harness the decoded samples in sparsely distributed areas and construct a tighter decision boundary, we give a method to precisely estimate the density in feature space and incorporate low-density samples with an annealing scheme. Our method requires the network to discriminate low-density data points from high-density data points throughout classification, and we show that this technique effectively improves task performance. © 2021 Published by Elsevier B.V.
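The K+1 class discriminator mentioned in the abstract can be illustrated with the commonly used GAN-based semi-supervised objective, in which the classifier outputs K real classes plus one extra "generated" class. The sketch below is a minimal NumPy version of that generic formulation and is not taken from the paper; the function names (`logsumexp_rows`, `k_plus_1_loss`) and the equal weighting of the three terms are assumptions made for illustration.

```python
import numpy as np

def logsumexp_rows(x):
    """Numerically stable log-sum-exp over each row of a 2-D array."""
    m = x.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=1, keepdims=True)))[:, 0]

def k_plus_1_loss(logits_labeled, labels, logits_unlabeled, logits_generated):
    """Generic K+1 discriminator loss (hypothetical helper, equal term weights).

    Every logits array has K+1 columns; the last column is the 'generated' class.
    """
    fake = logits_labeled.shape[1] - 1

    # Supervised term: cross-entropy of labeled examples on their true class.
    log_p = logits_labeled - logsumexp_rows(logits_labeled)[:, None]
    supervised = -log_p[np.arange(len(labels)), labels].mean()

    # Unlabeled real examples: push probability mass onto the K real classes,
    # i.e. maximize log(1 - p(generated | x)).
    log_p_real = (logsumexp_rows(logits_unlabeled[:, :fake])
                  - logsumexp_rows(logits_unlabeled))
    unlabeled = -log_p_real.mean()

    # Generated examples: maximize log p(generated | x).
    log_p_fake = logits_generated[:, fake] - logsumexp_rows(logits_generated)
    generated = -log_p_fake.mean()

    return supervised + unlabeled + generated

# Usage with K = 3 emotion classes (4 logit columns) and uninformative logits.
zeros = np.zeros((2, 4))
loss = k_plus_1_loss(zeros, np.array([0, 2]), zeros, zeros)
```

With uniform logits the three terms are log 4, log(4/3), and log 4. The density estimation and annealing steps described in the abstract are not covered by this sketch.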
Pages: 194-205
Number of pages: 12