Facial Action Unit Detection Using Attention and Relation Learning

Cited: 0
|
Authors
Shao, Zhiwen [1,2]
Liu, Zhilei [3]
Cai, Jianfei [4]
Wu, Yunsheng [5]
Ma, Lizhuang [1,2,6]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[5] Tencent Inc, YouTu Lab, Shanghai 200233, Peoples R China
[6] East China Normal Univ, Sch Comp Sci & Software Engn, Shanghai 200062, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Estimation; Face; Deep learning; Computer science; Learning systems; Channel-wise and spatial attention learning; pixel-level relation learning; facial AU detection;
DOI
10.1109/TAFFC.2019.2948635
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The attention mechanism has recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with the attention mechanism, AU-related local features can be captured. Most existing attention-based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep-learning-based attention and relation learning framework for AU detection with only AU labels, which has not been explored before. In particular, multi-scale features shared by each AU are learned first, and then both channel-wise and spatial attentions are adaptively learned to select and extract AU-related local features. Moreover, pixel-level relations for AUs are further captured to refine spatial attentions so as to extract more relevant local features. Without changing the network architecture, our framework can be easily extended to AU intensity estimation. Extensive experiments show that our framework (i) soundly outperforms the state-of-the-art methods for both AU detection and AU intensity estimation on the challenging BP4D, DISFA, FERA 2015, and BP4D+ benchmarks, (ii) can adaptively capture the correlated regions of each AU, and (iii) also works well under severe occlusions and large poses.
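The abstract's channel-wise-then-spatial attention pipeline can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the authors' network: the attention weights here are derived by simple pooling and a sigmoid rather than learned end-to-end, and the feature map is random. It only shows the data flow — rescale channels first, then rescale spatial locations to highlight AU-related regions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Global-average-pool each channel, squash the
    # pooled values into (0, 1), and rescale the channels.
    # Stand-in for the learned channel-wise attention in the paper.
    pooled = feat.mean(axis=(1, 2))            # (C,)
    weights = sigmoid(pooled - pooled.mean())  # (C,)
    return feat * weights[:, None, None]

def spatial_attention(feat):
    # Average across channels to a (H, W) saliency map, squash it,
    # and rescale every spatial location. Stand-in for the learned
    # spatial attention that selects AU-related regions.
    sal = feat.mean(axis=0)                    # (H, W)
    attn = sigmoid(sal - sal.mean())           # (H, W)
    return feat * attn[None, :, :], attn

# Toy "multi-scale" feature map: 8 channels, 16x16 spatial grid.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
out, attn_map = spatial_attention(channel_attention(feat))
print(out.shape, attn_map.shape)  # (8, 16, 16) (16, 16)
```

In the actual framework the attention maps are learned from AU labels and the spatial attentions are further refined by pixel-level relation learning; this sketch only fixes the shapes and ordering of the two attention stages.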
Pages: 1274-1289
Page count: 16
Related papers
50 records in total
  • [1] Facial Action Unit Detection via Adaptive Attention and Relation
    Shao, Zhiwen
    Zhou, Yong
    Cai, Jianfei
    Zhu, Hancheng
    Yao, Rui
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32: 3354-3366
  • [2] Semantic Learning for Facial Action Unit Detection
    Wang, Xuehan
    Chen, C. L. Philip
    Yuan, Haozhang
    Zhang, Tong
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2023, 10 (03): 1372-1380
  • [3] Heterogeneous spatio-temporal relation learning network for facial action unit detection
    Song, Wenyu
    Shi, Shuze
    Dong, Yu
    An, Gaoyun
    [J]. PATTERN RECOGNITION LETTERS, 2022, 164: 268-275
  • [4] Learning Guided Attention Masks for Facial Action Unit Recognition
    Lakshminarayana, Nagashri
    Setlur, Srirangaraj
    Govindaraju, Venu
    [J]. 2020 15TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2020), 2020: 465-472
  • [5] Meta Auxiliary Learning for Facial Action Unit Detection
    Li, Yong
    Shan, Shiguang
    [J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03): 2526-2538
  • [6] Dual-attention guided network for facial action unit detection
    Song, Wenyu
    Shi, Shuze
    Wu, Yuxuan
    An, Gaoyun
    [J]. IET IMAGE PROCESSING, 2022, 16 (08): 2157-2170
  • [7] Learning to combine local models for Facial Action Unit detection
    Jaiswal, Shashank
    Martinez, Brais
    Valstar, Michel F.
    [J]. 2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 6, 2015,
  • [8] On Multi-task Learning for Facial Action Unit Detection
    Zhang, Xiao
    Mahoor, Mohammad H.
    Nielsen, Rodney D.
    [J]. PROCEEDINGS OF 2013 28TH INTERNATIONAL CONFERENCE ON IMAGE AND VISION COMPUTING NEW ZEALAND (IVCNZ 2013), 2013: 202-207
  • [9] Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection
    Liu, Zhilei
    Dong, Jiahui
    Zhang, Cuicui
    Wang, Longbiao
    Dang, Jianwu
    [J]. MULTIMEDIA MODELING (MMM 2020), PT II, 2020, 11962: 489-501
  • [10] Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
    Shao, Zhiwen
    Liu, Zhilei
    Cai, Jianfei
    Ma, Lizhuang
    [J]. COMPUTER VISION - ECCV 2018, PT XIII, 2018, 11217: 725-740