PoseGTAC: Graph Transformer Encoder-Decoder with Atrous Convolution for 3D Human Pose Estimation

Citations: 0
Authors
Zhu, Yiran [1]
Xu, Xing [1]
Shen, Fumin [1]
Ji, Yanli [1]
Gao, Lianli [1]
Shen, Heng Tao [1]
Affiliations
[1] University of Electronic Science and Technology of China, School of Computer Science and Engineering, Chengdu, People's Republic of China
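Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract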
Graph neural networks (GNNs) have been widely used in 3D human pose estimation, since the pose of a human body can be naturally modeled as a graph. However, most existing GNN-based models rely on filters with restricted receptive fields and on single-scale information, neglecting valuable multi-scale contextual information. To tackle this issue, we propose a novel model named Graph Transformer Encoder-Decoder with Atrous Convolution (PoseGTAC), which effectively extracts multi-scale context and long-range information. Specifically, our PoseGTAC model has two key components: Graph Atrous Convolution (GAC) and Graph Transformer Layer (GTL), which extract local multi-scale information and global long-range information, respectively. They are combined and stacked in an encoder-decoder structure, where graph pooling and unpooling are adopted so that multi-scale information can interact from the local to the global level (e.g., part-scale and body-scale). Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate that the proposed PoseGTAC model achieves state-of-the-art performance.
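The abstract does not define Graph Atrous Convolution, so the following is only a minimal sketch of one plausible reading, not the authors' method: by analogy with dilated image convolution, each joint aggregates features from joints exactly r hops away on the skeleton graph, and the results for several rates r are concatenated to give a multi-scale feature. The function names (`hop_distances`, `dilated_aggregate`), the mean aggregation, and the toy 5-joint chain are all assumptions made for illustration.

```python
import numpy as np

def hop_distances(adj):
    """All-pairs hop distance on an unweighted graph via frontier expansion.

    Unreachable pairs keep the value -1.
    """
    n = adj.shape[0]
    dist = np.full((n, n), -1, dtype=int)
    np.fill_diagonal(dist, 0)
    reach = np.eye(n, dtype=bool)      # pairs reached so far
    frontier = np.eye(n, dtype=bool)   # pairs first reached at the current hop
    hop = 0
    while frontier.any():
        hop += 1
        nxt = (frontier.astype(int) @ adj) > 0
        frontier = nxt & ~reach        # keep only newly reached pairs
        dist[frontier] = hop
        reach |= frontier
    return dist

def dilated_aggregate(x, adj, rates=(1, 2)):
    """Hypothetical 'atrous' aggregation: for each rate r, average the
    features of nodes exactly r hops away, then concatenate over rates."""
    dist = hop_distances(adj)
    outs = []
    for r in rates:
        mask = (dist == r).astype(float)
        deg = mask.sum(axis=1, keepdims=True)
        outs.append(mask @ x / np.maximum(deg, 1.0))  # guard against empty rings
    return np.concatenate(outs, axis=1)

# Toy skeleton (assumption): a chain of 5 joints, 0-1-2-3-4.
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1

x = np.arange(5, dtype=float).reshape(5, 1)  # one scalar feature per joint
dist = hop_distances(adj)
out = dilated_aggregate(x, adj, rates=(1, 2))
print(out)
```

In this sketch, rate 1 reduces to ordinary neighbor averaging, while rate 2 skips the direct neighbors and so widens the receptive field without adding layers. A learned version would apply a separate weight matrix per rate before combining the results.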
Pages: 1359-1365
Page count: 7