SCALE-Pose: Skeletal Correction and Language Knowledge-assisted for 3D Human Pose Estimation

Cited by: 0

Authors:

Ma, Xinnan [1]

Li, Yaochen [1]

Zhao, Limeng [1]

Zhou, ChenXu [1]

Xu, Yuncheng [1]

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XI | 2025 / Vol. 15041

Keywords:

3D human pose estimation; Transformer; Priori knowledge; Skeletal correction; Large language model;

DOI:

10.1007/978-981-97-8795-1_39

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Transformer-based 3D human pose estimation methods typically take 2D joint sequences as input and use spatial and temporal transformer encoders to model the 3D human pose. However, these methods often neglect skeletal constraints that limit joint motion, and few integrate prior category knowledge to enhance potential joint representations. To address these problems, we propose a new method named SCALE-Pose. First, the method incorporates spatial and temporal skeleton correction blocks to better model long-range dependencies in the spatiotemporal motion of specific skeletons. Next, a four-stream radian loss based on skeleton angle error is introduced to constrain the motion space of joints. Finally, an auxiliary method employs global-local prompts from a large language model to generate prior category knowledge, which improves how well that knowledge generalizes. Experimental results on the Human3.6M and MPI-INF-3DHP datasets demonstrate that our method outperforms existing approaches.


Pages: 578-592

Page count: 15

