Continuous affect recognition with weakly supervised learning

Cited by: 8
Authors
Pei, Ercheng [1]
Jiang, Dongmei [1]
Alioscha-Perez, Mitchel [2]
Sahli, Hichem [2]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, VUB NPU Joint AVSP Lab, Xian 710072, Peoples R China
[2] Vrije Univ Brussel, Dept ETRO, Pl Laan 2, B-1050 Brussels, Belgium
Keywords
Continuous affect recognition; DNN-BLSTM; Weak supervision; FEATURE ENHANCEMENT; LSTM; CLASSIFICATION; NETWORKS; FEATURES
DOI
10.1007/s11042-019-7313-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recognizing a person's affective state from audio-visual signals is an essential capability for intelligent interaction. Insufficient training data and unreliable labels of affective dimensions (e.g., valence and arousal) are two major challenges in continuous affect recognition. In this paper, we propose a weakly supervised learning approach based on a hybrid deep neural network and bidirectional long short-term memory recurrent neural network (DNN-BLSTM). It first maps the audio/visual features into a more discriminative space via the modelling capacity of the DNN, then models the temporal dynamics of affect via the BLSTM. To reduce the negative impact of unreliable labels, we use a temporal label (TL) together with a robust loss function (RL) to incorporate weak supervision into the learning process of the DNN-BLSTM model. As a result, the proposed method not only has a simpler structure than the deep BLSTM model of He et al. (24), which requires more training data, but is also robust to noisy and unreliable labels. Single-modal and multimodal affect recognition experiments were carried out on the RECOLA dataset. The single-modal results show that the proposed method with TL and RL obtains remarkable improvements on both arousal and valence in terms of concordance correlation coefficient (CCC), while the multimodal results show that, with fewer feature streams, the proposed approach obtains results better than or comparable to those of state-of-the-art methods.
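The abstract reports results as concordance correlation coefficient (CCC) but does not spell out the formula. As an illustration only, the sketch below uses the standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (divide-by-n) statistics. The function name is hypothetical, and the variance convention is an assumption; the paper's own evaluation code may differ.

```python
from typing import Sequence


def concordance_correlation_coefficient(
    predictions: Sequence[float], labels: Sequence[float]
) -> float:
    """Return the CCC between two equal-length sequences.

    Illustrative helper (not taken from the paper). Uses population
    statistics, i.e. sums are divided by n rather than n - 1.
    """
    if len(predictions) != len(labels) or len(predictions) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(predictions)
    mean_p = sum(predictions) / n
    mean_l = sum(labels) / n

    # Population variances and covariance of the two sequences.
    var_p = sum((p - mean_p) ** 2 for p in predictions) / n
    var_l = sum((l - mean_l) ** 2 for l in labels) / n
    cov = sum((p - mean_p) * (l - mean_l) for p, l in zip(predictions, labels)) / n

    denominator = var_p + var_l + (mean_p - mean_l) ** 2
    if denominator == 0.0:
        # Both sequences are the same constant; CCC is undefined here.
        raise ValueError("CCC is undefined for identical constant sequences")

    return 2.0 * cov / denominator


# A constant offset lowers CCC even though the Pearson correlation stays at 1.
shifted = concordance_correlation_coefficient([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(round(shifted, 4))  # 0.5714 (= 4/7)
```

Unlike Pearson correlation, CCC penalizes differences in mean and scale between predictions and labels, which is why a prediction trace that is merely shifted from the annotation scores below 1.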
Pages: 19387-19412
Number of pages: 26