A convolution-transformer dual branch network for head-pose and occlusion facial expression recognition

Cited by: 21

Authors:

Liang, Xingcan [1,2]

Xu, Linsen [5]

Zhang, Wenxiang [3]

Zhang, Yan [4]

Liu, Jinfu [1,2]

Liu, Zhipeng [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Intelligent Machines, Hefei Inst Phys Sci, Hefei 230031, Peoples R China

[2] Univ Sci & Technol China, Hefei 230026, Peoples R China

[3] Changzhou Univ, Sch Microelect & Control Engn, Changzhou 213164, Peoples R China

[4] Anhui Jianzhu Univ, Sch Elect & Informat Engn, Hefei 230009, Peoples R China

[5] Hohai Univ, Coll Mech & Elect Engn, Changzhou 213022, Peoples R China

Source:

VISUAL COMPUTER | 2023 / Vol. 39 / Issue 06

Keywords:

Facial expression recognition; CNNs; Transformers; Feature fusion; Robust on occlusions and head-pose variations;

DOI:

10.1007/s00371-022-02413-5

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Facial expression recognition (FER) has attracted increasing attention due to its broad range of applications. Occlusions and head-pose variations are two major obstacles for automatic FER. In this paper, we propose a convolution-transformer dual branch network (CT-DBN) that exploits both local and global facial information to achieve FER that is robust to real-world occlusions and head-pose variations. The CT-DBN contains two branches. Given the local modeling ability of CNNs, the first branch uses a CNN to capture local edge information. Inspired by the success of transformers in natural language processing, the second branch employs a transformer to obtain a better global representation. A local-global feature fusion module is then proposed to adaptively integrate the two kinds of features into hybrid features and to model the relationship between them. With the help of the feature fusion module, our network not only integrates local and global features in an adaptive weighting manner but also learns the corresponding distinguishable features autonomously. Experimental results under inner-database and cross-database evaluation on four leading facial expression databases show that the proposed CT-DBN outperforms other state-of-the-art methods and achieves robust performance under in-the-wild conditions.
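
The abstract does not specify how the local-global feature fusion module computes its adaptive weights. As a rough illustration only, the sketch below assumes a per-dimension sigmoid gate that blends a local (CNN-branch) feature vector with a global (transformer-branch) feature vector. The function name `adaptive_fusion`, the gate parameters `w_local`, `w_global`, and `bias`, and the example values are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_fusion(local_feat, global_feat, w_local, w_global, bias):
    """Blend two equal-length feature vectors with a per-dimension gate.

    ASSUMPTION: this gate form is illustrative, not the paper's module.
      gate[i]  = sigmoid(w_local[i] * local[i] + w_global[i] * global[i] + bias[i])
      fused[i] = gate[i] * local[i] + (1 - gate[i]) * global[i]
    """
    sizes = {len(local_feat), len(global_feat), len(w_local), len(w_global), len(bias)}
    if len(sizes) != 1:
        raise ValueError("all inputs must have the same length")
    fused = []
    for loc, glo, wl, wg, b in zip(local_feat, global_feat, w_local, w_global, bias):
        gate = sigmoid(wl * loc + wg * glo + b)
        # Convex combination: each fused value lies between loc and glo.
        fused.append(gate * loc + (1.0 - gate) * glo)
    return fused


# Hypothetical stand-ins for the outputs of the two branches.
local_feat = [0.2, -1.0, 0.5]
global_feat = [0.4, 1.0, -0.5]
zeros = [0.0, 0.0, 0.0]

# With all gate parameters at zero, every gate is 0.5 (a plain average).
print(adaptive_fusion(local_feat, global_feat, zeros, zeros, zeros))
```

In a trained network the gate parameters would be learned, so the weighting could shift toward local features in some dimensions and toward global features in others.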


Pages: 2277-2290

Page count: 14
