Research on Heart Rate Detection from Facial Videos Based on an Attention Mechanism 3D Convolutional Neural Network

Cited by: 0

Authors:

Sun, Xiujuan [1]

Su, Ying [1]

Hou, Xiankai [1]

Yuan, Xiaolan [1]

Li, Hongxue [1]

Wang, Chuanjiang [1]

Affiliation:

[1] Shandong Univ Sci & Technol, Coll Elect Engn & Automat, Qingdao 266590, Peoples R China

Source:

ELECTRONICS | 2025 / Vol. 14 / Issue 02

Keywords:

BiLSTM; attention mechanism; convolutional neural network; facial video; rPPG;

DOI:

10.3390/electronics14020269

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Remote photoplethysmography (rPPG) has attracted growing attention due to its non-contact nature. However, existing non-contact heart rate detection methods are often affected by noise from motion artifacts and changes in lighting, which can lead to a decrease in detection accuracy. To solve this problem, this paper initially employs manual extraction to precisely define the facial Region of Interest (ROI), expanding the facial area while avoiding rigid regions such as the eyes and mouth to minimize the impact of motion artifacts. Additionally, during the training phase, illumination normalization is employed on video frames with uneven lighting to mitigate noise caused by lighting fluctuations. Finally, this paper introduces a 3D convolutional neural network (CNN) method incorporating an attention mechanism for heart rate detection from facial videos. We optimize the traditional 3D-CNN to capture global features in spatiotemporal data more effectively. The SimAM attention mechanism is introduced to enable the model to precisely focus on and enhance facial ROI feature representations. Following the extraction of rPPG signals, a heart rate estimation network using a bidirectional long short-term memory (BiLSTM) model is employed to derive the heart rate from the signals. The method introduced here is experimentally validated on two publicly available datasets, UBFC-rPPG and PURE. The mean absolute errors were 0.24 bpm and 0.65 bpm, the root mean square errors were 0.63 bpm and 1.30 bpm, and the Pearson correlation coefficients reached 0.99, confirming the method's reliability. Comparisons of predicted signals with ground truth signals further validated its accuracy.
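The three evaluation metrics named in the abstract (mean absolute error, root mean square error, and Pearson correlation coefficient) can be computed as in the minimal sketch below. It is an illustration only, not the authors' evaluation code: the function names are hypothetical, and the heart rate values are made-up placeholders rather than data from UBFC-rPPG or PURE.

```python
import math


def mean_absolute_error(pred, truth):
    """Average absolute difference between predicted and reference values."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)


def root_mean_square_error(pred, truth):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


def pearson_correlation(pred, truth):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    return cov / math.sqrt(var_p * var_t)


# Hypothetical heart rates in bpm (placeholders, not results from the paper).
predicted = [72.0, 80.5, 65.0, 90.0]
reference = [71.0, 81.0, 66.0, 89.0]

mae = mean_absolute_error(predicted, reference)
rmse = root_mean_square_error(predicted, reference)
r = pearson_correlation(predicted, reference)
print(f"MAE={mae:.3f} bpm, RMSE={rmse:.3f} bpm, r={r:.3f}")
```

MAE and RMSE are in the same unit as the heart rate (bpm), while the Pearson coefficient is unitless and measures how closely the predictions track the reference values linearly.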

Pages: 17
