Pose estimation at night in infrared images using a lightweight multi-stage attention network

Cited by: 15
Authors
Zang, Ying [1, 2, 3]
Fan, Chunpeng [5]
Zheng, Zeyu [1, 4, 5]
Yang, Dongsheng [1, 2]
Affiliations
[1] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Shenyang Inst Comp Technol, Shenyang 110168, Peoples R China
[3] Huzhou Univ, Sch Informat Engn, Huzhou 313000, Peoples R China
[4] Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110168, Peoples R China
[5] Hangzhou PingxingShijie Co Ltd, Hangzhou 311203, Peoples R China
Keywords
Pose estimation; Far-infrared image; LMANet; Spatial attention; Channel attention;
DOI
10.1007/s11760-021-01916-3
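Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract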
Human keypoint detection is a fundamental task in computer vision and a prerequisite for human action recognition, behavior analysis and human-computer interaction. Since most abnormal actions occur at night, effectively extracting skeleton sequence data in low-light or completely dark environments is a major challenge for recognizing them. This paper proposes using far-infrared images to detect the key points of the human body, which addresses human pose estimation under challenging conditions such as total darkness, smoke, inclement weather and glare. However, far-infrared images have shortcomings such as low resolution, noise and thermal characteristics, and the skeleton data must be provided in real time for the downstream task. For these reasons, this paper proposes a lightweight multi-stage attention network (LMANet) to detect human key points at night. The network adds context information through a large receptive field, and this information assists the detection of neighboring key points; to keep the network lightweight, it is extended to only two stages. In addition, an attention module selects the channels that carry the most information and highlights key-point features while suppressing background interference. To detect human key points in a variety of complex environments, techniques such as hard sample mining are used, which improve the accuracy of key points with low confidence. The network is validated on two visible-light datasets, where it performs well. Because no public dataset exists for far-infrared pose estimation, 700 images are selected for annotation from multiple public far-infrared object detection, segmentation and action recognition datasets; the algorithm is verified on this dataset and achieves good results. With this, the paper introduces far-infrared images into the field of pose estimation. The authors state that the human key-point annotation files will be released after the paper is published.
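The abstract mentions hard sample mining for low-confidence key points but gives no details. The following is a minimal sketch of one common form of the idea (averaging the loss over only the hardest key points), not the authors' implementation. The function names `keypoint_mse` and `hard_keypoint_loss`, the `top_k` parameter and the use of a per-keypoint mean squared error on heatmaps are all assumptions made for illustration.

```python
from typing import List, Sequence

# A heatmap is represented as a list of rows of floats (assumption for this sketch).
Heatmap = Sequence[Sequence[float]]


def keypoint_mse(pred: Heatmap, target: Heatmap) -> float:
    """Mean squared error between one predicted and one target heatmap."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


def hard_keypoint_loss(
    preds: Sequence[Heatmap], targets: Sequence[Heatmap], top_k: int
) -> float:
    """Average the per-keypoint losses of the top_k hardest key points only.

    Hypothetical helper: key points with small loss are ignored, so training
    focuses on the ones the network currently predicts worst.
    """
    if top_k < 1 or top_k > len(preds):
        raise ValueError("top_k must be between 1 and the number of key points")
    losses: List[float] = [keypoint_mse(p, t) for p, t in zip(preds, targets)]
    hardest = sorted(losses, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Three key points with 1x2 heatmaps; per-keypoint losses are 0.0, 1.0 and 4.0.
example_targets = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]]]
example_preds = [[[0.0, 0.0]], [[1.0, 1.0]], [[2.0, 2.0]]]
example_loss = hard_keypoint_loss(example_preds, example_targets, top_k=2)
```

With `top_k=2`, only the two worst-predicted key points contribute, so the easy key point with zero loss does not dilute the training signal.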
Pages: 1757-1765
Page count: 9