Frame-Level Label Refinement for Skeleton-Based Weakly-Supervised Action Recognition

Cited by: 0
Authors
Yu, Qing [1]
Fujiwara, Kent [2]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] LINE Corp, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, skeleton-based action recognition has achieved remarkable performance in understanding human motion from sequences of skeleton data, which is an important medium for synthesizing realistic human movement in various applications. However, existing methods assume that each action clip is manually trimmed to contain one specific action, which requires a significant amount of annotation effort. To solve this problem, we consider a novel problem of skeleton-based weakly-supervised temporal action localization (S-WTAL), where we need to recognize and localize human action segments in untrimmed skeleton videos given only the video-level labels. Although this task is challenging due to the sparsity of skeleton data and the lack of contextual clues from interaction with other objects and the environment, we present a frame-level label refinement framework based on a spatio-temporal graph convolutional network (ST-GCN) to overcome these difficulties. We use multiple instance learning (MIL) with video-level labels to generate the frame-level predictions. Inspired by advances in handling the noisy label problem, we introduce a label cleaning strategy for the frame-level pseudo labels to guide the learning process. The network parameters and the frame-level predictions are alternately updated to obtain the final results. We extensively evaluate the effectiveness of our learning approach on skeleton-based action recognition benchmarks. The state-of-the-art experimental results demonstrate that the proposed method can recognize and localize action segments of the skeleton data.
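The MIL step and the pseudo-label step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how frame-level scores are pooled or how pseudo labels are cleaned, so the top-k mean pooling, the 0.5 threshold, and all function names below are assumptions chosen for illustration, not the authors' method. The per-frame scores stand in for the output of a backbone such as ST-GCN.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def topk_mean_pool(frame_scores, k):
    """Pool per-frame class logits (T frames x C classes) into one
    video-level logit per class by averaging the k highest frame logits.
    Top-k pooling is an assumed MIL aggregator, not taken from the abstract."""
    num_classes = len(frame_scores[0])
    video_scores = []
    for c in range(num_classes):
        column = sorted((frame[c] for frame in frame_scores), reverse=True)
        top = column[:k]
        video_scores.append(sum(top) / len(top))
    return video_scores

def video_level_bce(video_scores, video_labels):
    """Binary cross-entropy between pooled logits and multi-hot video labels.
    This is the only supervision available in the weakly-supervised setting."""
    eps = 1e-12
    loss = 0.0
    for s, y in zip(video_scores, video_labels):
        p = sigmoid(s)
        loss -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return loss / len(video_labels)

def refine_pseudo_labels(frame_scores, video_labels, threshold=0.5):
    """Hypothetical cleaning rule: a frame keeps a class only if that class
    appears in the video-level label and the frame's probability exceeds
    the threshold."""
    return [
        [1 if y == 1 and sigmoid(s) > threshold else 0
         for s, y in zip(frame, video_labels)]
        for frame in frame_scores
    ]

# Toy example: 4 frames, 2 classes, and the video contains only class 0.
frame_scores = [[2.0, -1.0], [3.0, -2.0], [-1.0, 0.5], [-2.0, -3.0]]
video_labels = [1, 0]

video_scores = topk_mean_pool(frame_scores, k=2)
loss = video_level_bce(video_scores, video_labels)
pseudo_labels = refine_pseudo_labels(frame_scores, video_labels)
```

In an alternating scheme like the one the abstract describes, the network would be trained on the video-level loss, the frame-level pseudo labels would then be recomputed from its predictions, and the two steps would repeat. The third frame shows why masking by the video-level label matters in this sketch: its class 1 logit is positive, but class 1 is absent from the video, so the pseudo label discards it.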
Pages: 3322-3330
Number of pages: 9
Related papers
50 in total
  • [21] Weakly-supervised action localization based on seed superpixels
    Ullah, Sami
    Bhatti, Naeem
    Qasim, Tehreem
    Hassan, Najmul
    Zia, Muhammad
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (04) : 6203 - 6220
  • [22] Scale-Aware Graph Convolutional Network With Part-Level Refinement for Skeleton-Based Human Action Recognition
    Li, Chang
    Mao, Yingchi
    Huang, Qian
    Zhu, Xiaowei
    Wu, Jie
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (06) : 4311 - 4324
  • [23] Part-Level Graph Convolutional Network for Skeleton-Based Action Recognition
    Huang, Linjiang
    Huang, Yan
    Ouyang, Wanli
    Wang, Liang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11045 - 11052
  • [24] Weakly-Supervised Multi-Person Action Recognition in 360° Videos
    Li, Junnan
    Liu, Jianquan
    Wang, Yongkang
    Nishimura, Shoji
    Kankanhalli, Mohan S.
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 497 - 505
  • [25] Multi-label Discriminative Weakly-Supervised Human Activity Recognition and Localization
    Mosabbeb, Ehsan Adeli
    Cabral, Ricardo
    De la Torre, Fernando
    Fathy, Mahmood
    COMPUTER VISION - ACCV 2014, PT V, 2015, 9007 : 241 - 258
  • [26] Cross-Scale Spatial Refinement Graph Convolutional Network for Skeleton-Based Action Recognition
    Chengyuan Ke
    Sheng Liu
    Zhenghao Ke
    Yuan Feng
    Shengyong Chen
    International Journal of Computational Intelligence Systems, 18 (1)
  • [27] Generative Action Description Prompts for Skeleton-based Action Recognition
    Xiang, Wangmeng
    Li, Chao
    Zhou, Yuxuan
    Wang, Biao
    Zhang, Lei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 10242 - 10251
  • [28] Prompt-supervised dynamic attention graph convolutional network for skeleton-based action recognition
    Zhu, Shasha
    Sun, Lu
    Ma, Zeyuan
    Li, Chenxi
    He, Dongzhi
    NEUROCOMPUTING, 2025, 611
  • [29] Contrast-Reconstruction Representation Learning for Self-Supervised Skeleton-Based Action Recognition
    Wang, Peng
    Wen, Jun
    Si, Chenyang
    Qian, Yuntao
    Wang, Liang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6224 - 6238
  • [30] Cross-stream contrastive learning for self-supervised skeleton-based action recognition
    Li, Ding
    Tang, Yongqiang
    Zhang, Zhizhong
    Zhang, Wensheng
    IMAGE AND VISION COMPUTING, 2023, 135