MPCSAN: multi-head parallel channel-spatial attention network for facial expression recognition in the wild

Cited by: 3

Authors:

Gong, Weijun [1]

Qian, Yurong [1,2,3]

Fan, Yingying [1]

Affiliations:

[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi 830000, Peoples R China

[2] Xinjiang Univ, Sch Software, Urumqi 830000, Peoples R China

[3] Xinjiang Key Lab Signal Detect & Proc, Urumqi 830000, Peoples R China

Source:

NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 09

Funding:

National Science Foundation (United States);

Keywords:

Facial expression recognition; Multi-head attention; Feature aggregation; Parallel channel-spatial attention; FEATURES; DEEP;

DOI:

10.1007/s00521-022-08040-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Facial expression recognition (FER) in the wild is an exceedingly challenging computer vision task because of subtle expression differences, pose variation, occlusion, label bias, and other uncontrollable factors. CNN-based deep learning networks are susceptible to these factors and often fail to obtain highly discriminative features from the key regions of expressions, and most methods that learn in a single feature space may not fully capture the core regions of interest. These limitations make it harder to handle the intra-class variability and inter-class similarity of expressions, which ultimately degrades recognition performance. Therefore, we propose an effective multi-head parallel channel-spatial attention network (MPCSAN) for FER in the wild, which consists of a feature aggregation network (FAN), a multi-head parallel attention network (MPAN), and an expression forecasting network (EFN). First, the lightweight FAN extracts basic expression features while optimizing the intra-class and inter-class distributions. Then, MPAN forms a multi-attention subspace through a multi-head parallel channel-spatial attention fusion design and focuses on more accurate and comprehensive expression regions of interest by minimizing duplicate attention during subspace fusion. Finally, EFN performs the final expression classification under the optimization of label softening, which further improves robustness to label bias. The proposed method is evaluated on the three most widely used in-the-wild expression datasets (RAF-DB, FERPlus, and AffectNet). Extensive experimental results demonstrate that our method outperforms several current state-of-the-art methods, achieving accuracies of 90.16% on RAF-DB, 89.91% on FERPlus, and 61.58% on AffectNet. Evaluation on occlusion and pose-variation datasets and cross-dataset assessment further demonstrate the good overall performance of our method.
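The attention and label-softening ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer details, so the function names, weight shapes, sigmoid gating, mean-based head fusion, and the use of standard label smoothing for "label softening" are all assumptions made for illustration. The sketch also omits the step that minimizes duplicate attention across heads.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def parallel_channel_spatial_head(feat, w_channel, w_spatial):
    """One hypothetical attention head over a feature map of shape (C, H, W).

    w_channel: (C, C) weights for the channel branch (assumed shape).
    w_spatial: (C,) weights for the spatial branch (assumed shape).
    """
    # Channel branch: global average pooling, linear map, sigmoid gate per channel.
    pooled = feat.mean(axis=(1, 2))                      # (C,)
    channel_gate = sigmoid(w_channel @ pooled)           # (C,)
    # Spatial branch: weighted sum over channels (a 1x1 convolution),
    # then a sigmoid gate per spatial location.
    spatial_logits = np.tensordot(w_spatial, feat, axes=([0], [0]))  # (H, W)
    spatial_gate = sigmoid(spatial_logits)
    # "Parallel": both gates are computed from the same input and applied together.
    return feat * channel_gate[:, None, None] * spatial_gate[None, :, :]


def multi_head_attention(feat, heads):
    """Fuse several heads by averaging (an assumed fusion rule)."""
    outputs = [parallel_channel_spatial_head(feat, wc, ws) for wc, ws in heads]
    return np.mean(outputs, axis=0)


def smooth_labels(labels, num_classes, eps=0.1):
    """Standard label smoothing, used here as a stand-in for label softening."""
    one_hot = np.eye(num_classes)[labels]
    return one_hot * (1.0 - eps) + eps / num_classes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 7, 7))  # toy feature map: 8 channels, 7x7
    heads = [(rng.standard_normal((8, 8)), rng.standard_normal(8)) for _ in range(4)]
    print(multi_head_attention(feat, heads).shape)
    print(smooth_labels(np.array([0, 2]), num_classes=7))
```

Because every gate lies in (0, 1), the fused output never exceeds the input in magnitude; each head only reweights the features it is given.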

Pages: 6529 - 6543

Page count: 15
