JÂA-Net: Joint Facial Action Unit Detection and Face Alignment Via Adaptive Attention

Cited by: 58
Authors
Shao, Zhiwen [1 ,2 ,3 ]
Liu, Zhilei [4 ]
Cai, Jianfei [5 ]
Ma, Lizhuang [3 ,6 ]
Affiliations
[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
[2] Minist Educ Peoples Republ China, Engn Res Ctr Mine Digitizat, Xuzhou 221116, Jiangsu, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[4] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[5] Monash Univ, Fac Informat Technol, Clayton, Vic 3800, Australia
[6] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
Joint learning; Facial AU detection; Face alignment; Adaptive attention learning; 3D;
DOI
10.1007/s11263-020-01378-z
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Facial action unit (AU) detection and face alignment are two highly correlated tasks, since facial landmarks can provide precise AU locations to facilitate the extraction of meaningful local features for AU detection. However, most existing AU detection works handle the two tasks independently by treating face alignment as a preprocessing, and often use landmarks to predefine a fixed region or attention for each AU. In this paper, we propose a novel end-to-end deep learning framework for joint AU detection and face alignment, which has not been explored before. In particular, multi-scale shared feature is learned firstly, and high-level feature of face alignment is fed into AU detection. Moreover, to extract precise local features, we propose an adaptive attention learning module to refine the attention map of each AU adaptively. Finally, the assembled local features are integrated with face alignment feature and global feature for AU detection. Extensive experiments demonstrate that our framework (i) significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks, (ii) can adaptively capture the irregular region of each AU, (iii) achieves competitive performance for face alignment, and (iv) also works well under partial occlusions and non-frontal poses. The code for our method is available at https://github.com/ZhiwenShao/PyTorch-JAANet.
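The adaptive attention idea summarized in the abstract — initialize each AU's attention map from landmark-defined locations, then refine it adaptively — can be sketched minimally. The midpoint-plus-offset center rule, the Gaussian fall-off, and the per-pixel offset refinement below are simplified, hypothetical stand-ins for the paper's actual formulation (where the refinement offsets come from a learned convolutional branch), not its implementation:

```python
import math

def au_center(lm_a, lm_b, offset=(0.0, 0.0)):
    # Hypothetical rule: AU center = midpoint of two landmarks + a fixed offset.
    return ((lm_a[0] + lm_b[0]) / 2 + offset[0],
            (lm_a[1] + lm_b[1]) / 2 + offset[1])

def initial_attention(center, size=44, sigma=6.0):
    # Predefined attention: Gaussian fall-off with distance from the AU center.
    cx, cy = center
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def refine_attention(att, offsets):
    # Adaptive refinement: add per-pixel offsets, clip to [0, 1].
    # In the paper these offsets are predicted by a network; here they are inputs.
    return [[min(1.0, max(0.0, a + o)) for a, o in zip(row_a, row_o)]
            for row_a, row_o in zip(att, offsets)]
```

Because the refinement is additive rather than a fixed template, the final attention can cover irregular, non-elliptical regions around each AU, which is the behavior the paper reports.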
Pages: 321-340
Page count: 20
Related Papers
5 records
  • [1] JÂA-Net: Joint Facial Action Unit Detection and Face Alignment Via Adaptive Attention
    Shao, Zhiwen
    Liu, Zhilei
    Cai, Jianfei
    Ma, Lizhuang
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2021, 129 : 321 - 340
  • [2] Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
    Shao, Zhiwen
    Liu, Zhilei
    Cai, Jianfei
    Ma, Lizhuang
    [J]. COMPUTER VISION - ECCV 2018, PT XIII, 2018, 11217 : 725 - 740
  • [3] Facial Action Unit Detection via Adaptive Attention and Relation
    Shao, Zhiwen
    Zhou, Yong
    Cai, Jianfei
    Zhu, Hancheng
    Yao, Rui
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 3354 - 3366
  • [4] SAT-Net: Self-Attention and Temporal Fusion for Facial Action Unit Detection
    Li, Zhihua
    Zhang, Zheng
    Yin, Lijun
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 5036 - 5043
  • [5] MMA-Net: Multi-view mixed attention mechanism for facial action unit detection
    Shang, Ziqiao
    Du, Congju
    Li, Bingyin
    Yan, Zengqiang
    Yu, Li
    [J]. PATTERN RECOGNITION LETTERS, 2023, 172 : 165 - 171