Learning Multi-View Interactional Skeleton Graph for Action Recognition

Cited by: 26
Authors
Wang, Minsi [1 ]
Ni, Bingbing [2 ]
Yang, Xiaokang [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Skeleton; Topology; Feature extraction; Convolution; Network topology; Recurrent neural networks; Action recognition; skeleton; multi-view; graph neural network; hierarchical method;
DOI
10.1109/TPAMI.2020.3032738
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Capturing the interactions of human articulations lies at the center of skeleton-based action recognition. Recent graph-based methods are inherently limited by weak spatial context modeling, owing to the fixed interaction pattern and inflexible shared weights of GCNs. To address these problems, we propose the multi-view interactional graph network (MV-IGNet), which can construct, learn, and infer multi-level spatial skeleton context, including view-level (global), group-level, and joint-level (local) context, in a unified way. MV-IGNet leverages different skeleton topologies as multiple views to cooperatively generate complementary action features. For each view, separable parametric graph convolution (SPG-Conv) enables multiple parameterized graphs to enrich local interaction patterns, providing strong graph-adaptation ability to handle irregular skeleton topologies. We also partition the skeleton into several groups; higher-level group contexts, both inter-group and intra-group, are then hierarchically captured by the above SPG-Conv layers. A simple yet effective global context adaption (GCA) module facilitates representative feature extraction by learning input-dependent skeleton topologies. Compared to mainstream works, MV-IGNet can be readily implemented with a smaller model size and faster inference. Experimental results show that the proposed MV-IGNet achieves impressive performance on the large-scale benchmarks NTU-RGB+D and NTU-RGB+D 120.
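To illustrate the kind of layer the abstract describes, below is a minimal sketch (not the authors' released code) of a graph convolution over skeleton joints in which a fixed skeleton adjacency is augmented by a learnable parametric graph, loosely in the spirit of SPG-Conv's parameterized graphs and GCA's adapted topology. The module name, tensor shapes, and hyperparameters are assumptions for illustration only.

```python
# A hedged sketch of a parametric skeleton graph convolution, assuming a
# PyTorch setting; this is NOT the authors' implementation of SPG-Conv/GCA.
import torch
import torch.nn as nn


class ParametricGraphConv(nn.Module):
    """Graph conv over V joints: X' = (A_fixed + A_learned) X W."""

    def __init__(self, in_channels: int, out_channels: int, adjacency: torch.Tensor):
        super().__init__()
        # Fixed, normalized skeleton topology (V x V), kept as a buffer.
        self.register_buffer("A_fixed", adjacency)
        # Learnable residual graph lets joints form new interaction patterns.
        self.A_learned = nn.Parameter(torch.zeros_like(adjacency))
        # Per-joint feature transform.
        self.linear = nn.Linear(in_channels, out_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, V, in_channels)
        a = self.A_fixed + self.A_learned           # adapted adjacency (V x V)
        x = torch.einsum("uv,bvc->buc", a, x)       # aggregate neighbor features
        return self.linear(x)


if __name__ == "__main__":
    V = 25                                          # e.g., NTU-RGB+D joint count
    # Identity matrix used as a stand-in for a normalized skeleton adjacency.
    layer = ParametricGraphConv(3, 64, torch.eye(V))
    out = layer(torch.randn(8, V, 3))               # -> shape (8, 25, 64)
    print(out.shape)
```

In this sketch the learnable residual graph plays the role of the enriched interaction patterns described in the abstract; an input-dependent topology (as in GCA) would additionally condition the adjacency on the input features.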
Pages: 6940-6954
Number of pages: 15
Related Papers (50 in total)
  • [1] Hao, Tong; Wu, Dan; Wang, Qian; Sun, Jin-Sheng. Multi-view representation learning for multi-view action recognition. Journal of Visual Communication and Image Representation, 2017, 48: 453-460.
  • [2] Liu, Xing; Li, Yanshan; Xia, Rongjie. Adaptive multi-view graph convolutional networks for skeleton-based action recognition. Neurocomputing, 2021, 444: 288-300.
  • [3] Shah, Ketul; Shah, Anshul; Lau, Chun Pong; de Melo, Celso M.; Chellappa, Rama. Multi-View Action Recognition using Contrastive Learning. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023: 3370-3380.
  • [4] Wu, Xuqing; Shah, Shishir K. Regularized Multi-View Multi-Metric Learning for Action Recognition. 2014 22nd International Conference on Pattern Recognition (ICPR), 2014: 471-476.
  • [5] Iosifidis, Alexandros; Tefas, Anastasios; Pitas, Ioannis. Neural representation and learning for multi-view human action recognition. 2012 International Joint Conference on Neural Networks (IJCNN), 2012.
  • [6] Nie, Weizhi; Liu, Anan; Yu, Jing; Su, Yuting; Chaisorn, Lekha; Wang, Yongkang; Kankanhalli, Mohan S. Multi-View Action Recognition by Cross-domain Learning. 2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 2014.
  • [7] Wang, Ruoshi; Liu, Zhigang; Yin, Ziyang. Jointly Learning Multi-view Features for Human Action Recognition. Proceedings of the 32nd 2020 Chinese Control and Decision Conference (CCDC 2020), 2020: 4858-4861.
  • [8] Sheng, Biyun; Li, Jun; Xiao, Fu; Li, Qun; Yang, Wankou; Han, Junwei. Discriminative Multi-View Subspace Feature Learning for Action Recognition. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(12): 4591-4600.
  • [9] Iosifidis, Alexandros; Tefas, Anastasios; Pitas, Ioannis. Multi-view Regularized Extreme Learning Machine for Human Action Recognition. Artificial Intelligence: Methods and Applications, 2014, 8445: 84-94.
  • [10] Avola, Danilo; Cascio, Marco; Cinque, Luigi; Fagioli, Alessio; Foresti, Gian Luca. Affective Action and Interaction Recognition by Multi-View Representation Learning from Handcrafted Low-Level Skeleton Features. International Journal of Neural Systems, 2022, 32(10).