MAPoseNet: Animal pose estimation network via multi-scale convolutional attention

Authors
Liu, Sicong [1]
Fan, Qingcheng [1]
Li, Shuqin [1]
Zhao, Chunjiang [1,2]
Affiliations
[1] Northwest A&F Univ, Coll Informat Engn, 3 Taicheng Rd, Yangling 712100, Peoples R China
[2] Beijing Acad Agr & Forestry Sci, Res Ctr Informat Technol, Beijing 100097, Peoples R China
Keywords
Animal pose estimation; Attention mechanism; Asymmetric convolution; Feature pyramid; IDENTIFICATION
DOI
10.1016/j.jvcir.2023.103989
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Animal pose estimation serves as an upstream task for recognizing and understanding animal behavior. Over the last year, the accuracy of deep learning-based methods has steadily improved, but at the expense of inference speed. This paper proposes an efficient model that improves both inference speed and accuracy. It follows the classic encoder-decoder architecture and is built on a feature pyramid and a multi-scale asymmetric convolution attention mechanism; it is named MAPoseNet (Animal Pose Estimation Network Via Multi-scale Convolutional Attention). Rather than typical self-attention, the encoder's attention mechanism comprises multi-scale, asymmetric convolutions, which are lightweight and instrumental in improving inference speed. The decoder consists of a feature pyramid and a feature balance module. MAPoseNet is trained and tested on the public AP-10K dataset. Experimental results show that MAPoseNet achieves state-of-the-art performance: it outperforms HRFormer by 1.3 AP and 0.8 AR, with 33.7% fewer FLOPs and 66% faster inference speed. The model also surpasses HRNet and HRFormer on the Animal Pose dataset. It thus improves inference speed and accuracy together rather than trading one for the other.
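As a rough illustration of why asymmetric convolutions are lightweight, the sketch below compares the per-channel weight count of a square k x k depthwise kernel with that of a 1 x k kernel followed by a k x 1 kernel. The kernel sizes are assumptions chosen for illustration only; the abstract does not state which sizes MAPoseNet uses, and biases are ignored.

```python
# Per-channel weight counts for depthwise convolutions (bias ignored).
# Kernel sizes below are hypothetical; the abstract does not specify them.

def square_kernel_weights(k: int) -> int:
    """Weights in one k x k depthwise kernel."""
    return k * k


def asymmetric_pair_weights(k: int) -> int:
    """Weights in a 1 x k kernel followed by a k x 1 kernel."""
    return 2 * k


ASSUMED_KERNEL_SIZES = (7, 11, 21)  # illustrative multi-scale branches

for k in ASSUMED_KERNEL_SIZES:
    square = square_kernel_weights(k)
    asymmetric = asymmetric_pair_weights(k)
    saving = 1 - asymmetric / square
    print(f"k={k}: square={square}, asymmetric={asymmetric}, saving={saving:.1%}")
```

The saving grows with k (the ratio is 2/k), which is consistent with the abstract's point that large-receptive-field attention can be built cheaply from strip-shaped kernels. This counts weights only and does not reproduce the reported FLOPs or speed figures.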
Pages: 11