A multi-scale multi-attention network for dynamic facial expression recognition

Cited by: 0

Authors:

Xia, Xiaohan [1]

Yang, Le [1]

Wei, Xiaoyong [2,3]

Sahli, Hichem [4,5]

Jiang, Dongmei [1,3]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Shaanxi Key Lab Speech & Image Informat Proc, Youyi Xilu 127, Xian 710072, Peoples R China

[2] Sichuan Univ, Sch Comp Sci, Chengdu 610065, Peoples R China

[3] Peng Cheng Lab, Vanke Cloud City Phase 1,Bldg 8,Xili St, Shenzhen 518055, Guangdong, Peoples R China

[4] Vrije Univ Brussel VUB, Dept Elect & Informat ETRO, Pl Laan 2, B-1050 Brussels, Belgium

[5] Interunivers Microelect Ctr IMEC, Kapeldreef 75, B-3001 Heverlee, Belgium

Source:

MULTIMEDIA SYSTEMS | 2022 / Vol. 28 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

Facial expression recognition; Multi-scale multi-attention network (MSMA-Net); Spatial attention; Temporal attention; MODEL;

DOI:

10.1007/s00530-021-00849-8

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Characterizing spatial information and modelling temporal dynamics of facial images are key challenges for dynamic facial expression recognition (FER). In this paper, we propose an end-to-end multi-scale multi-attention network (MSMA-Net) for dynamic FER. In our model, the spatio-temporal features are encoded at two scales, i.e., the entire face and local facial patches. For each scale, we adopt a 2D convolutional neural network (CNN) to capture frame-based spatial information, and a 3D CNN to depict the short-term dynamics in the temporal sequence. Moreover, we propose a multi-attention mechanism that combines spatial and temporal attention models. The temporal attention is applied to the image sequence to highlight expressive frames within the whole sequence, and the spatial attention mechanism is applied at the patch level to learn salient facial features. Comprehensive experiments on publicly available datasets (Aff-Wild2, RML, and AFEW) show that the proposed MSMA-Net model automatically highlights salient expressive frames, within which salient facial features are learned, yielding results that are better than or competitive with state-of-the-art methods.
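The two attention steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes softmax-weighted pooling for both attention steps, uses random vectors in place of the 2D/3D CNN features, and joins the face and patch scales by simple concatenation. All names (`attention_pool`, `w_spatial`, `w_temporal`) and sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, score_vector):
    """Score each feature vector, then return the weighted average.

    features: list of equal-length float lists (one per patch or frame).
    score_vector: hypothetical learned vector; a dot product gives the score.
    Returns (pooled_vector, attention_weights).
    """
    scores = [sum(f * w for f, w in zip(feat, score_vector)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(wt * feat[d] for wt, feat in zip(weights, features))
              for d in range(dim)]
    return pooled, weights


random.seed(0)
N_FRAMES, N_PATCHES, DIM = 8, 4, 16  # illustrative sizes, not from the paper


def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# Stand-ins for CNN outputs (assumption: real features come from the CNNs).
face_features = [rand_vec(DIM) for _ in range(N_FRAMES)]
patch_features = [[rand_vec(DIM) for _ in range(N_PATCHES)]
                  for _ in range(N_FRAMES)]
w_spatial = rand_vec(DIM)        # hypothetical spatial scoring vector
w_temporal = rand_vec(2 * DIM)   # hypothetical temporal scoring vector

# Spatial attention: weight the patches inside each frame.
patch_pooled = [attention_pool(patches, w_spatial)[0]
                for patches in patch_features]

# Join the two scales per frame (assumption: plain concatenation).
frame_features = [face + patch
                  for face, patch in zip(face_features, patch_pooled)]

# Temporal attention: weight the frames across the sequence.
clip_feature, frame_weights = attention_pool(frame_features, w_temporal)
```

Here `frame_weights` plays the role of the temporal attention that highlights expressive frames, and the per-frame patch weights play the role of the spatial attention over local facial patches.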


Pages: 479-493

Page count: 15
