Implicitly using Human Skeleton in Self-supervised Learning: Influence on Spatio-temporal Puzzle Solving and on Video Action Recognition

Cited by: 0
Authors
Riand, Mathieu [1, 2]
Dolle, Laurent [1]
Le Callet, Patrick [2]
Affiliations
[1] CEA Tech Pays Loire, F-44340 Bouguenais, France
[2] Nantes Univ, Lab Sci Numer Nantes, Equipe Image Percept & Interact, Nantes, France
Keywords
Self-supervised Learning; Siamese Network; Skeleton Keypoints; Action Recognition; Few-shot Learning
DOI
10.5220/0010689500003061
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study the influence of overlaying skeleton data on videos of human actions when performing self-supervised learning and action recognition. We show that adding this information without additional constraints actually hurts the accuracy of the network; we argue that the network does not exploit the added skeleton and instead treats it as noise that masks part of the natural image. We present first results on puzzle solving and video action recognition to support this hypothesis.
Pages: 128-135
Page count: 8