Cross-Modal Diversity-Based Active Learning for Multi-Modal Emotion Estimation

Cited by: 0
Authors
Xu, Yifan [1 ]
Meng, Lubin [1 ]
Peng, Ruimin [1 ]
Yin, Yingjie [2 ]
Ding, Jingting [2 ]
Li, Liang [2 ]
Wu, Dongrui [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[2] Ant Grp, Beijing, Peoples R China
Keywords
Active learning; unsupervised learning; multi-modal learning; emotion recognition;
DOI
10.1109/IJCNN54540.2023.10191581
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition is an important part of affective computing, and utilizing information from multiple modalities facilitates more accurate emotion recognition. The performance of data-driven machine learning models usually relies on a large amount of labeled training data; however, labeling emotional data is expensive, because each sample usually requires annotation by multiple evaluators. To reduce the annotation cost, this paper proposes a cross-modal diversity measure that considers the correlation between different modalities, and integrates it with representativeness for sample selection in unsupervised active learning (AL) for regression. To our knowledge, this challenging multi-modal unsupervised AL scenario has not been explored before: previous research considered either unsupervised uni-modal AL or supervised multi-modal AL. Experiments on the RECOLA and IEMOCAP datasets demonstrate the effectiveness of our proposed AL approach.
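The abstract specifies the approach only at a high level. The Python sketch below shows one plausible instantiation of combining representativeness with a cross-modal diversity term for unsupervised sample selection. Everything concrete here is an illustrative assumption rather than the paper's exact formulation: the function select_samples, the alpha trade-off parameter, the mean-distance representativeness proxy, the nearest-selected-sample diversity, and the element-wise minimum used as the cross-modal coupling.

```python
# Hypothetical sketch of cross-modal diversity + representativeness sample
# selection for unsupervised active learning over two modalities.
# All scoring rules below are assumptions, not the paper's method.
import numpy as np

def _zscore(s):
    # Standardize a score vector so heterogeneous scores are comparable.
    return (s - s.mean()) / (s.std() + 1e-12)

def _pairwise_dist(X):
    # Euclidean distance matrix of shape (n, n).
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))

def select_samples(X_a, X_v, n_select, alpha=0.5):
    """Greedily pick sample indices to label from two modality feature sets."""
    D_a, D_v = _pairwise_dist(X_a), _pairwise_dist(X_v)

    # Representativeness (assumed proxy): samples close on average to the
    # rest of the pool, averaged over the two modalities.
    rep = _zscore(-D_a.mean(axis=1)) + _zscore(-D_v.mean(axis=1))

    selected = [int(np.argmax(rep))]  # seed with the most representative sample
    while len(selected) < n_select:
        rest = np.array([i for i in range(len(rep)) if i not in selected])
        # Per-modality diversity: distance to the nearest already-selected sample.
        div_a = D_a[np.ix_(rest, selected)].min(axis=1)
        div_v = D_v[np.ix_(rest, selected)].min(axis=1)
        # Cross-modal coupling (assumption): the element-wise minimum rewards
        # samples that are novel in *both* modalities at once.
        div = _zscore(np.minimum(div_a, div_v))
        score = alpha * rep[rest] + (1.0 - alpha) * div
        selected.append(int(rest[np.argmax(score)]))
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_audio = rng.normal(size=(200, 16))  # e.g., audio features
    X_video = rng.normal(size=(200, 32))  # e.g., visual features
    print(select_samples(X_audio, X_video, n_select=10))
```

The element-wise minimum is one simple way to couple the two modality-specific diversity scores: a candidate scores high only if it is far from the selected set in both modalities, so samples that are redundant in either modality are deprioritized.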
Pages: 8
Related Papers
50 entries in total
  • [1] Cross-modal dynamic convolution for multi-modal emotion recognition
    Wen, Huanglu
    You, Shaodi
    Fu, Ying
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 78
  • [2] Contextual and Cross-Modal Interaction for Multi-Modal Speech Emotion Recognition
    Yang, Dingkang
    Huang, Shuai
    Liu, Yang
    Zhang, Lihua
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2093 - 2097
  • [3] CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations
    Zolfaghari, Mohammadreza
    Zhu, Yi
    Gehler, Peter
    Brox, Thomas
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1430 - 1439
  • [4] Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching
    Liang, Jingjun
    Li, Ruichen
    Jin, Qin
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2852 - 2861
  • [5] Cross-modal attention for multi-modal image registration
    Song, Xinrui
    Chao, Hanqing
    Xu, Xuanang
    Guo, Hengtao
    Xu, Sheng
    Turkbey, Baris
    Wood, Bradford J.
    Sanford, Thomas
    Wang, Ge
    Yan, Pingkun
    [J]. MEDICAL IMAGE ANALYSIS, 2022, 82
  • [6] Multi-modal and cross-modal for lecture videos retrieval
    Nguyen, Nhu Van
    Coustaty, Mickaël
    Ogier, Jean-Marc
    [J]. 2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 2667 - 2672
  • [7] Unsupervised Multi-modal Hashing for Cross-Modal Retrieval
    Yu, Jun
    Wu, Xiao-Jun
    Zhang, Donglin
    [J]. COGNITIVE COMPUTATION, 2022, 14 (03) : 1159 - 1171
  • [8] Multi-modal semantic autoencoder for cross-modal retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    [J]. NEUROCOMPUTING, 2019, 331 : 165 - 175
  • [9] Cross-Modal Retrieval Augmentation for Multi-Modal Classification
    Gur, Shir
    Neverova, Natalia
    Stauffer, Chris
    Lim, Ser-Nam
    Kiela, Douwe
    Reiter, Austin
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 111 - 123