Facial expression recognition in videos using hybrid CNN & ConvLSTM

Cited by: 12

Authors:

Singh R. [1]

Saurav S. [2]

Kumar T. [3]

Saini R. [2]

Vohra A. [1]

Singh S. [2]

Affiliations:

[1] Department of Electronic Science, Kurukshetra University, Kurukshetra

[2] CSIR-Central Electronics Engineering Research Institute, Pilani

[3] Department of Computer Science, Birla Institute of Technology and Science, Pilani

Source:

International Journal of Information Technology | 2023 / Vol. 15 / Issue 4

Keywords:

3D convolutional neural networks (3D-CNN); Convolutional LSTM (ConvLSTM); Long short-term memory (LSTM); Video-based facial expression recognition (VFER)

DOI:

10.1007/s41870-023-01183-0

Abstract:

Three-dimensional convolutional neural networks (3D-CNN) and long short-term memory (LSTM) networks have consistently outperformed many approaches in video-based facial expression recognition (VFER). The vanilla fully-connected LSTM (FC-LSTM) unrolls each image into a one-dimensional vector, which discards crucial spatial information. Convolutional LSTM (ConvLSTM) overcomes this limitation by performing the LSTM operations as convolutions, without unrolling, and thus retains useful spatial information. Motivated by this, we propose a neural network architecture that combines 3D-CNN and ConvLSTM for VFER. The proposed hybrid architecture captures spatiotemporal information from video sequences of emotions and attains competitive accuracy on three publicly available FER datasets: SAVEE, CK+, and AFEW. The experimental results demonstrate excellent performance without external emotional data, with the added advantage of a simple model with fewer parameters. Moreover, compared with state-of-the-art deep learning models, the designed FER pipeline improves execution speed by a large factor while achieving competitive recognition accuracy. Hence, the proposed FER pipeline is an appropriate candidate for recognizing facial expressions on resource-limited embedded platforms for real-time applications. © 2023, The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management.
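
The abstract's point that ConvLSTM keeps the spatial layout can be illustrated with a small sketch. The code below is not the authors' model: it is a minimal, single-channel ConvLSTM step in plain Python, assuming 3x3 kernels, zero padding, and no peephole connections. The names `conv2d_same`, `convlstm_step`, `run_sequence`, and the `params` layout are hypothetical, chosen only for this illustration. The hidden and cell states stay H x W maps throughout, whereas an FC-LSTM would first flatten each frame into a vector of length H*W.

```python
import math

def conv2d_same(img, kernel):
    """Zero-padded 2D cross-correlation; output keeps the H x W shape of img."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += img[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def convlstm_step(x, h_prev, c_prev, params):
    """One ConvLSTM step on single-channel H x W maps (assumed: no peepholes).

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (input_kernel, hidden_kernel, bias); this layout is an assumption.
    """
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate in ("i", "f", "o", "g"):
        wx, wh, b = params[gate]
        ax = conv2d_same(x, wx)       # convolution over the input frame
        ah = conv2d_same(h_prev, wh)  # convolution over the previous hidden map
        pre[gate] = [[ax[r][c] + ah[r][c] + b for c in range(cols)]
                     for r in range(rows)]
    h_new = [[0.0] * cols for _ in range(rows)]
    c_new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i = sigmoid(pre["i"][r][c])    # input gate
            f = sigmoid(pre["f"][r][c])    # forget gate
            o = sigmoid(pre["o"][r][c])    # output gate
            g = math.tanh(pre["g"][r][c])  # candidate cell update
            c_new[r][c] = f * c_prev[r][c] + i * g
            h_new[r][c] = o * math.tanh(c_new[r][c])
    return h_new, c_new

def run_sequence(frames, params):
    """Feed a list of H x W frames through the cell; return the last hidden map."""
    rows, cols = len(frames[0]), len(frames[0][0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    for frame in frames:
        h, c = convlstm_step(frame, h, c, params)
    return h
```

In a hybrid pipeline of the kind the abstract describes, the frames passed to such a cell would be feature maps produced by 3D convolutions rather than raw pixels, and each map would have many channels; those details are not given in this record and are left out of the sketch.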

Pages: 1819-1830

Page count: 11
