MPCSAN: multi-head parallel channel-spatial attention network for facial expression recognition in the wild

Cited by: 3
Authors
Gong, Weijun [1]
Qian, Yurong [1, 2, 3]
Fan, Yingying [1]
Affiliations
[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi 830000, Peoples R China
[2] Xinjiang Univ, Sch Software, Urumqi 830000, Peoples R China
[3] Xinjiang Key Lab Signal Detect & Proc, Urumqi 830000, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 09
Funding
U.S. National Science Foundation;
Keywords
Facial expression recognition; Multi-head attention; Feature aggregation; Parallel channel-spatial attention; FEATURES; DEEP;
DOI
10.1007/s00521-022-08040-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Facial expression recognition (FER) in the wild is an exceedingly challenging task in computer vision because of subtle expression differences, pose variation, occlusion, label bias, and other uncontrollable factors. CNN-based deep learning networks are susceptible to these factors and therefore fail to obtain highly discriminative features from the key regions of expressions, and most methods that learn in a single feature space may not fully capture the core regions of interest. Both limitations make it harder to handle the intra-class variability and inter-class similarity of expressions, which ultimately degrades recognition performance. We therefore propose an effective multi-head parallel channel-spatial attention network (MPCSAN) for FER in the wild, which consists of a feature aggregation network (FAN), a multi-head parallel attention network (MPAN), and an expression forecasting network (EFN). First, the lightweight FAN extracts basic expression features while optimizing the intra-class and inter-class distributions. Then, MPAN forms a multi-attention subspace through a multi-head parallel channel-spatial attention fusion design and focuses on more accurate and comprehensive expression regions of interest by minimizing duplicate attention during subspace fusion. Finally, EFN performs the final expression classification under label-softening optimization, which further improves robustness to label bias. The proposed method is evaluated on the three most widely used in-the-wild expression datasets (RAF-DB, FERPlus, and AffectNet). Extensive experimental results demonstrate that our method outperforms several current state-of-the-art methods, achieving accuracies of 90.16% on RAF-DB, 89.91% on FERPlus, and 61.58% on AffectNet. Evaluation on occlusion and pose-variation datasets and cross-dataset assessment further demonstrate the good overall performance of our method.
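The abstract does not give the internals of MPAN, so the following is only a minimal NumPy sketch of what "multi-head parallel channel-spatial attention" could look like, under stated assumptions. Every function name, the squeeze-and-excitation style channel branch, the pooled spatial branch, and the sum-based head fusion are illustrative choices, not the authors' implementation. The duplicate-attention penalty used during subspace fusion is not modelled here.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Hypothetical channel branch: global average pool, then a 2-layer MLP.

    feat: (C, H, W) feature map; w1: (C // r, C); w2: (C, C // r).
    Returns one weight per channel, shape (C,).
    """
    squeezed = feat.mean(axis=(1, 2))        # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # ReLU, (C // r,)
    return sigmoid(w2 @ hidden)              # (C,)


def spatial_attention(feat):
    """Hypothetical spatial branch: average of channel-wise mean and max maps.

    A learned convolution would normally replace the fixed 0.5 average.
    Returns one weight per spatial location, shape (H, W).
    """
    avg_map = feat.mean(axis=0)
    max_map = feat.max(axis=0)
    return sigmoid(0.5 * (avg_map + max_map))


def parallel_head(feat, w1, w2):
    """One head: both branches read the same input (parallel, not sequential)."""
    ch = channel_attention(feat, w1, w2)[:, None, None]  # (C, 1, 1)
    sp = spatial_attention(feat)[None, :, :]             # (1, H, W)
    return feat * ch * sp


def multi_head(feat, heads):
    """Fuse several heads by summation (an assumed fusion rule)."""
    outs = [parallel_head(feat, w1, w2) for (w1, w2) in heads]
    return np.sum(outs, axis=0)
```

In this sketch each head holds its own MLP weights, so different heads can emphasise different channels while sharing the same input feature map; the output keeps the input shape `(C, H, W)` and can be passed to a classifier head.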
Pages: 6529 - 6543
Page count: 15