A convolution-transformer dual branch network for head-pose and occlusion facial expression recognition

Cited by: 21
Authors
Liang, Xingcan [1 ,2 ]
Xu, Linsen [5 ]
Zhang, Wenxiang [3 ]
Zhang, Yan [4 ]
Liu, Jinfu [1 ,2 ]
Liu, Zhipeng [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Intelligent Machines, Hefei Inst Phys Sci, Hefei 230031, Peoples R China
[2] Univ Sci & Technol China, Hefei 230026, Peoples R China
[3] Changzhou Univ, Sch Microelect & Control Engn, Changzhou 213164, Peoples R China
[4] Anhui Jianzhu Univ, Sch Elect & Informat Engn, Hefei 230009, Peoples R China
[5] Hohai Univ, Coll Mech & Elect Engn, Changzhou 213022, Peoples R China
Source
VISUAL COMPUTER | 2023 / Vol. 39 / Issue 06
Keywords
Facial expression recognition; CNNs; Transformers; Feature fusion; Robust on occlusions and head-pose variations;
DOI
10.1007/s00371-022-02413-5
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Facial expression recognition (FER) has attracted growing attention due to its broad range of applications. Occlusions and head-pose variations are two major obstacles to automatic FER. In this paper, we propose a convolution-transformer dual branch network (CT-DBN) that exploits both local and global facial information to achieve FER that is robust to real-world occlusions and head-pose variations. The CT-DBN contains two branches. Taking advantage of the local modeling ability of CNNs, the first branch uses a CNN to capture local edge information. Inspired by the successful application of transformers in natural language processing, the second branch employs a transformer to obtain a better global representation. A local-global feature fusion module is then proposed to adaptively integrate the two kinds of features into hybrid features and to model the relationship between them. With the help of the feature fusion module, our network not only integrates local and global features in an adaptive weighting manner but also learns the corresponding distinguishable features autonomously. Experimental results under inner-database and cross-database evaluation on four leading facial expression databases show that the proposed CT-DBN outperforms other state-of-the-art methods and achieves robust performance under in-the-wild conditions.
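The abstract does not spell out how the fusion module computes its adaptive weights. As a rough illustration only, the sketch below assumes one common form of adaptive weighting: a per-channel sigmoid gate computed from the concatenated branch features. The function name `gated_fusion`, the gate parameters, and the toy feature vectors are all hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local_feat, global_feat, gate_weights, gate_bias):
    """Fuse two equal-length feature vectors with a per-channel gate.

    Assumed form (not the paper's stated method):
        gate  = sigmoid(W @ [local; global] + b)
        fused = gate * local + (1 - gate) * global
    """
    if len(local_feat) != len(global_feat):
        raise ValueError("both branches must produce the same feature dimension")
    concat = list(local_feat) + list(global_feat)
    fused = []
    for i, (loc, glo) in enumerate(zip(local_feat, global_feat)):
        # One gate per channel, conditioned on both branches.
        score = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        gate = sigmoid(score)
        fused.append(gate * loc + (1.0 - gate) * glo)
    return fused


# Toy stand-ins for the CNN-branch and transformer-branch outputs.
rng = random.Random(0)
dim = 4
local_feat = [rng.uniform(-1, 1) for _ in range(dim)]
global_feat = [rng.uniform(-1, 1) for _ in range(dim)]
# In a trained network these gate parameters would be learned; here they are random.
gate_weights = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(dim)]
gate_bias = [0.0] * dim

fused = gated_fusion(local_feat, global_feat, gate_weights, gate_bias)
```

Because each gate lies strictly between 0 and 1, every fused channel is a convex combination of the corresponding local and global values. The network described in the abstract may weight or combine the branches differently.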
Pages: 2277 - 2290
Page count: 14
Related papers
50 in total
  • [1] A convolution-transformer dual branch network for head-pose and occlusion facial expression recognition
    Xingcan Liang
    Linsen Xu
    Wenxiang Zhang
    Yan Zhang
    Jinfu Liu
    Zhipeng Liu
    [J]. The Visual Computer, 2023, 39 (6) : 2277 - 2290
  • [2] Head-pose invariant facial expression recognition using convolutional neural networks
    Fasel, B
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS, 2002, : 529 - 534
  • [3] DBCTNet: Double Branch Convolution-Transformer Network for Hyperspectral Image Classification
    Xu, Rui
    Dong, Xue-Mei
    Li, Weijie
    Peng, Jiangtao
    Sun, Weiwei
    Xu, Yi
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [4] Identity-Expression Dual Branch Network for Facial Expression Recognition
    Zhang, Haifeng
    Su, Wen
    Yu, Jun
    Wang, Zengfu
    [J]. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (04) : 898 - 911
  • [5] TFE: A Transformer Architecture for Occlusion Aware Facial Expression Recognition
    Gao, Jixun
    Zhao, Yuanyuan
    [J]. FRONTIERS IN NEUROROBOTICS, 2021, 15
  • [6] Facial Expression Recognition Based on Convolution Neural Network
    Duan, Yue
    Zhou, Linli
    Wu, Yue
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING, INFORMATION SCIENCE & APPLICATION TECHNOLOGY (ICCIA 2017), 2017, 74 : 339 - 343
  • [7] Cascaded Iterative Transformer for Jointly Predicting Facial Landmark, Occlusion Probability and Head Pose
    Yaokun Li
    Guang Tan
    Chao Gou
    [J]. International Journal of Computer Vision, 2024, 132 : 1242 - 1257
  • [8] Cascaded Iterative Transformer for Jointly Predicting Facial Landmark, Occlusion Probability and Head Pose
    Li, Yaokun
    Tan, Guang
    Gou, Chao
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (04) : 1242 - 1257
  • [9] Facial Expression Recognition Robust to Occlusion using Spatial Transformer Network with Triplet Loss Function
    Kim, Jieun
    Lee, Eung-Joo
    Lee, Deokwoo
    [J]. PATTERN RECOGNITION AND TRACKING XXXIII, 2022, 12101
  • [10] Facial expression recognition via a jointly-learned dual-branch network
    Bordjiba, Yamina
    Merouani, Hayet Farida
    Azizi, Nabiha
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2022, 13 (06) : 447 - 456