A cascaded spatiotemporal attention network for dynamic facial expression recognition

Cited by: 3
Authors
Ye, Yaoguang [1 ]
Pan, Yongqi [2 ]
Liang, Yan [1 ]
Pan, Jiahui [1 ,3 ]
Affiliations
[1] South China Normal Univ, Sch Software, Foshan 528200, Peoples R China
[2] South China Agr Univ, Coll Math & Informat, Coll Software Engn, Guangzhou 510642, Peoples R China
[3] Pazhou Lab, Guangzhou 510330, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Dynamic facial expression recognition; Spatiotemporal features; Cascaded network; Attention module; ACTION-UNITS;
DOI
10.1007/s10489-022-03781-0
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dynamic facial expression recognition (DFER) is a promising research area because it concerns the dynamic change patterns of facial expressions, but effectively capturing both the facial appearance of each image and the temporal dynamics across an image sequence is difficult. In this paper, a cascaded spatiotemporal attention network (CSTAN) is proposed to learn and integrate the spatial and temporal emotional information present as facial expressions change. Three types of attention modules are embedded into the cascaded network so that it can extract more informative spatiotemporal features along different dimensions for the DFER task. A channel attention module helps the network focus on the feature maps that are meaningful for DFER, a spatial attention module focuses on the regions of interest within those feature maps, and a temporal attention module explores the dynamic temporal information as an expression changes. Experimental results on three public facial expression recognition datasets demonstrate that the CSTAN performs well and extracts representative spatiotemporal features. Moreover, the visualization results reveal that the CSTAN can locate regions of interest and the contributing timesteps, which illustrates the effectiveness of the multidimensional attention modules.
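To illustrate how attention modules of these three kinds gate features along different dimensions, here is a minimal NumPy sketch. The random weight matrix stands in for learned parameters, and the pooling/gating choices are generic assumptions in the spirit of channel, spatial, and temporal attention; this is not the authors' exact CSTAN architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
T, C, H, W = 4, 8, 5, 5                       # frames, channels, height, width
feats = rng.standard_normal((T, C, H, W))     # stand-in for CNN feature maps

# Channel attention: squeeze each channel map to one value, gate with a sigmoid.
W_c = rng.standard_normal((C, C)) * 0.1       # hypothetical learned weights
chan_desc = feats.mean(axis=(2, 3))           # (T, C) global average pool
chan_w = sigmoid(chan_desc @ W_c)             # (T, C) per-channel gates
feats = feats * chan_w[:, :, None, None]

# Spatial attention: pool over channels, gate each spatial location.
spat_desc = feats.mean(axis=1, keepdims=True) # (T, 1, H, W)
spat_w = sigmoid(spat_desc)                   # per-pixel gates in (0, 1)
feats = feats * spat_w

# Temporal attention: score each frame, softmax over time, weighted sum.
frame_desc = feats.mean(axis=(1, 2, 3))       # (T,) one score per frame
temp_w = softmax(frame_desc)                  # frame weights, sum to 1
video_feat = (feats * temp_w[:, None, None, None]).sum(axis=0)  # (C, H, W)

print(video_feat.shape)
```

Each stage reweights the same feature tensor along one dimension (channel, space, time), which is what lets the final temporally pooled representation emphasize informative channels, regions, and timesteps.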
Pages: 5402-5415
Page count: 14
Related Papers (50 total)
  • [1] Ye, Yaoguang; Pan, Yongqi; Liang, Yan; Pan, Jiahui. A cascaded spatiotemporal attention network for dynamic facial expression recognition. [J]. Applied Intelligence, 2023, 53: 5402-5415
  • [2] Qu, Xiaoye; Zou, Zhikang; Su, Xinxing; Zhou, Pan; Wei, Wei; Wen, Shiping; Wu, Dapeng. Attend to Where and When: Cascaded Attention Network for Facial Expression Recognition. [J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2022, 6 (03): 580-592
  • [3] Ashir, Abubakar M.; Eleyan, Alaa; Akdemir, Bayram. Facial expression recognition with dynamic cascaded classifier. [J]. Neural Computing and Applications, 2020, 32 (10): 6295-6309
  • [4] Yi, Yufan; Xu, Yiping; Ye, Ziyi; Li, Linhui; Hu, Xinli; Tian, Yan. STAN: spatiotemporal attention network for video-based facial expression recognition. [J]. The Visual Computer, 2023, 39 (12): 6205-6220
  • [5] Gan, Yanling; Chen, Jingying; Yang, Zongkai; Xu, Luhui. Multiple Attention Network for Facial Expression Recognition. [J]. IEEE Access, 2020, 8: 7383-7393
  • [6] Yang, Hongling; Xie, Lun; Pan, Hang; Li, Chiqin; Wang, Zhiliang; Zhong, Jialiang. Multimodal Attention Dynamic Fusion Network for Facial Micro-Expression Recognition. [J]. Entropy, 2023, 25 (09)
  • [7] Li, Yan; Xi, Min; Jiang, Dongmei. Cross-view adaptive graph attention network for dynamic facial expression recognition. [J]. Multimedia Systems, 2023, 29 (5): 2715-2728