Interactive Spatiotemporal Token Attention Network for Skeleton-based General Interactive Action Recognition

Cited by: 3

Authors:

Wen, Yuhang [1]

Tang, Zixuan [1]

Pang, Yunsheng [2]

Ding, Beichen [1]

Liu, Mengyuan [3]

Affiliations:

[1] Sun Yat Sen Univ, Shenzhen 518107, Peoples R China

[2] Tencent Technol Shenzhen Co Ltd, Shenzhen, Peoples R China

[3] Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Shenzhen, Peoples R China

Source:

2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

DATASET;

DOI:

10.1109/IROS55552.2023.10342472

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recognizing interactive actions plays an important role in human-robot interaction and collaboration. Previous methods use late fusion and co-attention mechanisms to capture interactive relations, which have limited learning capability or adapt inefficiently to more interacting entities. Under the assumption that the priors of each entity are already known, they also lack evaluation in a more general setting that addresses the diversity of subjects. To address these problems, we propose an Interactive Spatiotemporal Token Attention Network (ISTA-Net), which simultaneously models spatial, temporal, and interactive relations. Specifically, our network contains a tokenizer to partition Interactive Spatiotemporal Tokens (ISTs), which are a unified way to represent the motions of multiple diverse entities. By extending the entity dimension, ISTs provide better interactive representations. To jointly learn along the three dimensions in ISTs, multi-head self-attention blocks integrated with 3D convolutions are designed to capture inter-token correlations. When modeling correlations, a strict entity ordering is usually irrelevant for recognizing interactive actions. To this end, Entity Rearrangement is proposed to eliminate the orderliness in ISTs for interchangeable entities. Extensive experiments on four datasets verify the effectiveness of ISTA-Net, which outperforms state-of-the-art methods. Our code is publicly available at https://github.com/Necolizer/ISTA-Net.
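The two ideas the abstract names, grouping the motions of several entities into shared tokens and shuffling the order of interchangeable entities, can be illustrated with a toy sketch. It is not taken from the authors' repository. The function names `partition_tokens` and `entity_rearrangement`, the nested-list layout `sample[entity][frame][joint]`, and the choice to window only along the time axis are all assumptions made for illustration. The published tokenizer, attention blocks, and 3D convolutions are not reproduced here.

```python
import random


def entity_rearrangement(sample, rng):
    """Return a copy of `sample` with its entity axis randomly permuted.

    Hypothetical helper: `sample[entity][frame][joint]` holds one (x, y, z)
    tuple, so permuting the outer list only changes the entity order.
    """
    order = list(range(len(sample)))
    rng.shuffle(order)
    return [sample[e] for e in order]


def partition_tokens(sample, frames_per_token):
    """Split the sequence into time windows shared by all entities.

    Hypothetical simplification: each token keeps every entity for one
    window of frames. Trailing frames that do not fill a window are dropped.
    """
    num_frames = len(sample[0])
    tokens = []
    for start in range(0, num_frames - frames_per_token + 1, frames_per_token):
        tokens.append([entity[start:start + frames_per_token] for entity in sample])
    return tokens


# Toy data: 2 entities, 4 frames, 3 joints; each coordinate encodes its indices.
sample = [
    [[(float(e), float(f), float(j)) for j in range(3)] for f in range(4)]
    for e in range(2)
]

rng = random.Random(0)  # seeded so the permutation is repeatable
shuffled = entity_rearrangement(sample, rng)
tokens = partition_tokens(shuffled, frames_per_token=2)
```

In this sketch each token contains every entity for its window, so a downstream model would see all entities jointly. Permuting the entity axis during training is one simple way to discourage a model from relying on entity order.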


Pages: 7886-7892

Page count: 7

50 related records in total

[41] Adaptive Spatiotemporal Representation Learning for Skeleton-Based Human Action Recognition
Yu, Jiahui
Gao, Hongwei
Chen, Yongquan
Zhou, Dalin
Liu, Jinguo
Ju, Zhaojie
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (04) : 1654 - 1665
[42] SKELETON-BASED INTERACTIVE GRAPH NETWORK FOR HUMAN OBJECT INTERACTION DETECTION
Zheng, Sipeng
Chen, Shizhe
Jin, Qin
2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020
[43] Skeleton-Based Attention Mask for Pedestrian Attribute Recognition Network
Sooksatra, Sorn
Rujikietgumjorn, Sitapa
JOURNAL OF IMAGING, 2021, 7 (12)
[44] Prompt-supervised dynamic attention graph convolutional network for skeleton-based action recognition
He, Dongzhi (victor@bjut.edu.cn), 2025, 611
[45] Multi-scale Dilated Attention Graph Convolutional Network for Skeleton-Based Action Recognition
Shu, Yang
Li, Wanggen
Li, Doudou
Gao, Kun
Jie, Biao
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425 : 16 - 28
[46] Two Stream Multi-Attention Graph Convolutional Network for Skeleton-Based Action Recognition
Zhou, Huijian
Tian, Zhiqiang
Du, Shaoyi
ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2023, 2024, 1998 : 112 - 120
[47] Relation-mining self-attention network for skeleton-based human action recognition
Gedamu, Kumie
Ji, Yanli
Gao, LingLing
Yang, Yang
Shen, Heng Tao
PATTERN RECOGNITION, 2023, 139
[48] Spatio-temporal segments attention for skeleton-based action recognition
Qiu, Helei
Hou, Biao
Ren, Bo
Zhang, Xiaohua
NEUROCOMPUTING, 2023, 518 : 30 - 38
[49] Multi-Term Attention Networks for Skeleton-Based Action Recognition
Diao, Xiaolei
Li, Xiaoqiang
Huang, Chen
APPLIED SCIENCES-BASEL, 2020, 10 (15)
[50] Skeleton-Based Action Recognition with Shift Graph Convolutional Network
Cheng, Ke
Zhang, Yifan
He, Xiangyu
Chen, Weihan
Cheng, Jian
Lu, Hanqing
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020 : 180 - 189
