SAT-Net: Self-Attention and Temporal Fusion for Facial Action Unit Detection

Cited by: 2
Authors
Li, Zhihua [1]
Zhang, Zheng [1]
Yin, Lijun [1]
Affiliation
[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
Funding
U.S. National Science Foundation (NSF)
Keywords
DOI
10.1109/ICPR48806.2021.9413260
CLC Classification Code
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Research on facial action unit (AU) detection has achieved remarkable performance in recent years through deep spatial learning models; however, such models fall short of their full capacity because they neglect the temporal information of AUs across time. Since AU occurrence in one frame is highly likely to be related to previous frames in a temporal sequence, exploring the temporal correlation of AUs across frames is the key motivation of this work. In this paper, we propose a novel temporal fusion and AU-supervised self-attention network (SAT-Net) to address the AU detection problem. First, we feed the deep features of a sequence into a convolutional LSTM network, fuse the temporal information of the previous frames into the feature map of the last frame, and then learn the AU occurrence from the fused features. Second, because AU detection is a multi-label classification problem in which each label depends only on certain facial areas, we propose a self-learned attention mask that focuses the detection of each AU on parts of the face by learning an individual attention mask per AU, thereby increasing AU independence without losing any spatial relations. Extensive experiments show that the proposed framework outperforms state-of-the-art AU detection methods on two benchmark databases (BP4D and DISFA).
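The abstract describes two components: temporal fusion of per-frame deep features through a convolutional LSTM, and a self-learned spatial attention mask per AU for multi-label prediction. Below is a minimal PyTorch sketch of that general idea, not the authors' implementation; the module names (ConvLSTMCell, SATNetSketch), tensor shapes, channel counts, and pooling choices are illustrative assumptions.

```python
# Minimal sketch (assumption, not the paper's code): ConvLSTM temporal fusion
# over per-frame feature maps, then one learned spatial attention mask per AU.
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """A standard convolutional LSTM cell operating on 2D feature maps."""

    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        # A single convolution produces the input, forget, output, and candidate gates.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


class SATNetSketch(nn.Module):
    """Temporal fusion + per-AU attention heads (illustrative shapes only)."""

    def __init__(self, feat_ch=512, hid_ch=256, num_aus=12):
        super().__init__()
        self.cell = ConvLSTMCell(feat_ch, hid_ch)
        # One 1x1 convolution per AU learns that AU's spatial attention mask.
        self.att = nn.ModuleList([nn.Conv2d(hid_ch, 1, 1) for _ in range(num_aus)])
        self.cls = nn.ModuleList([nn.Linear(hid_ch, 1) for _ in range(num_aus)])

    def forward(self, feats):            # feats: (B, T, C, H, W) deep features
        b, t, c, hgt, wid = feats.shape
        h = feats.new_zeros(b, self.cell.hid_ch, hgt, wid)
        cmem = torch.zeros_like(h)
        for step in range(t):            # fuse earlier frames into the last state
            h, cmem = self.cell(feats[:, step], (h, cmem))
        logits = []
        for att, cls in zip(self.att, self.cls):
            mask = torch.sigmoid(att(h))             # (B, 1, H, W) per-AU mask
            pooled = (h * mask).flatten(2).mean(-1)  # masked spatial pooling
            logits.append(cls(pooled))
        return torch.cat(logits, dim=1)  # (B, num_aus) AU occurrence logits


# Usage: a batch of 4 sequences, 8 frames each, of 512-channel 14x14 feature maps.
model = SATNetSketch()
out = model(torch.randn(4, 8, 512, 14, 14))
print(out.shape)  # torch.Size([4, 12])
```

The per-AU 1x1 convolutions stand in for the AU-supervised attention masks: each head can only attend to the facial regions useful for its own label, which is one way to realize the independence-with-spatial-context property the abstract mentions.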
Pages: 5036-5043
Number of pages: 8