Weakly Supervised Action Recognition and Localization Using Web Images

Cited by: 0
Authors
Liu, Cuiwei [1]
Wu, Xinxiao [1]
Jia, Yunde [1]
Affiliations
[1] Beijing Institute of Technology, School of Computer Science, Beijing Laboratory of Intelligent Information Technology, Beijing 100081, People's Republic of China
Keywords
EVENT RECOGNITION; VIDEOS
DOI
10.1007/978-3-319-16814-2_42
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the joint recognition and localization of actions in videos. We develop a novel Transfer Latent Support Vector Machine (TLSVM) that is trained on Web images and weakly annotated training videos. To avoid laborious and time-consuming manual annotation of action locations, the model takes as input training videos annotated only with action labels. Because ground-truth action locations are unavailable for the videos, the locations are treated as latent variables and are inferred during both the training and testing phases. To improve localization accuracy with prior information about action locations, we collect a number of Web images annotated with both action labels and action locations, and learn a discriminative model by enforcing local similarities between videos and Web images. Because Web images and videos have heterogeneous features, a structural transformation based on a randomized clustering forest is used to map Web images to videos. Experiments on two publicly available action datasets demonstrate that the proposed model is effective for both action localization and action recognition.
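As a rough illustration of what treating action locations as latent variables means at test time, the sketch below shows generic latent SVM inference: each (action label, candidate location) pair is scored with a linear model, and the best-scoring pair gives both the predicted action and its location. This is not the TLSVM from the paper. It omits the Web-image similarity term and the randomized clustering forest mapping, and the function name, the per-label weight vectors, and the candidate feature vectors are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def infer_label_and_location(
    weights: Dict[str, Sequence[float]],
    candidate_features: List[Sequence[float]],
) -> Tuple[str, int, float]:
    """Jointly pick the action label and the latent location.

    weights: one (hypothetical) linear weight vector per action label.
    candidate_features: one feature vector per candidate location
        in the video; the true location is unobserved (latent).
    Returns (label, location_index, score) with the highest score.
    """
    best: Optional[Tuple[str, int, float]] = None
    for label, w in weights.items():
        for index, features in enumerate(candidate_features):
            score = dot(w, features)
            if best is None or score > best[2]:
                best = (label, index, score)
    if best is None:
        raise ValueError("need at least one label and one candidate location")
    return best


# Toy usage: two action labels, three candidate locations.
toy_weights = {"run": [1.0, 0.0], "jump": [0.0, 1.0]}
toy_candidates = [[0.2, 0.1], [0.1, 0.9], [0.5, 0.3]]
label, location, score = infer_label_and_location(toy_weights, toy_candidates)
print(label, location, score)
```

During training, a latent SVM typically alternates between this kind of inference (fixing the weights and choosing the best latent location for each training video) and updating the weights with the inferred locations held fixed.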
Pages: 642-657
Number of pages: 16