Audio-visual domain adaptation using conditional semi-supervised Generative Adversarial Networks

Cited by: 21
Authors
Athanasiadis, Christos [1]
Hortal, Enrique [1]
Asteriadis, Stylianos [1]
Affiliations
[1] Maastricht Univ, Dept Data Sci & Knowledge Engn, Sint Servaasklooster 39, NL-6211 TE Maastricht, Netherlands
Funding
European Union Horizon 2020
Keywords
Domain adaptation; Conformal prediction; Generative adversarial networks; Facial expression recognition; Face
DOI
10.1016/j.neucom.2019.09.106
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accessing large, manually annotated audio databases to build robust emotion recognition models is notably difficult, hampered by annotation cost and label ambiguity. In contrast, many publicly available emotion recognition datasets are based on facial expressivity, owing to the prevailing role of computer vision in current deep learning research. In this work, we therefore study cross-modal knowledge transfer between the audio and facial modalities in the emotional context. More concretely, we investigate whether facial information from videos can be used to boost the awareness and prediction tracking of emotions in audio signals. Our approach rests on a simple hypothesis: the emotional content of a person's speech correlates with the corresponding facial expressions. Research in cognitive psychology supports this hypothesis and suggests that humans fuse emotion-related visual information with the auditory signal in a cross-modal integration scheme to better understand emotions. To this end, we introduce dacssGAN (Domain Adaptation Conditional Semi-Supervised Generative Adversarial Networks), a method that aims to bridge these two inherently different domains. Given the source domain (visual data) and conditional information based on inductive conformal prediction as input, the proposed architecture generates data distributions that are as close as possible to the target domain (audio data). Experiments on two publicly available audio-visual emotion datasets show that classification performance on a dataset of real audio expanded with samples generated by dacssGAN (50.29% and 48.65%) exceeds that obtained using real audio samples alone (49.34% and 46.90%). © 2019 The Authors. Published by Elsevier B.V.
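The abstract mentions conditional information based on inductive conformal prediction but does not say how it is computed or how it is fed to the generator. As a rough illustration only, the following sketch shows one common form of inductive conformal prediction, in which a held-out calibration set turns classifier probabilities into per-class p-values. The function name `icp_p_values`, the nonconformity measure (one minus the predicted probability), and the toy numbers are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def icp_p_values(cal_probs, cal_labels, test_probs):
    """Per-class p-values from inductive conformal prediction (generic sketch).

    cal_probs:  (n_cal, n_classes) class probabilities on a calibration set
    cal_labels: (n_cal,) integer true labels of the calibration samples
    test_probs: (n_test, n_classes) class probabilities on new samples
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)

    # Assumed nonconformity measure: 1 - probability of the true label.
    cal_alpha = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]

    # For test samples, score every candidate label the same way.
    test_alpha = 1.0 - test_probs  # shape (n_test, n_classes)

    # Count calibration scores at least as nonconforming as each candidate.
    n_ge = (cal_alpha[None, None, :] >= test_alpha[:, :, None]).sum(axis=2)

    # Standard ICP p-value with the +1 correction in numerator and denominator.
    return (n_ge + 1) / (len(cal_alpha) + 1)


if __name__ == "__main__":
    # Toy two-class example (hypothetical values).
    cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
    cal_labels = [0, 1, 0, 1]
    test_probs = [[0.75, 0.25]]
    print(icp_p_values(cal_probs, cal_labels, test_probs))
```

In a setup like the one the abstract describes, such p-values (or a label derived from them) could plausibly act as a soft conditioning signal for unlabeled samples, but that usage is an assumption here rather than something the abstract states.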
Pages: 331-344
Number of pages: 14