Is an Object-Centric Video Representation Beneficial for Transfer?

Cited by: 0
Authors
Zhang, Chuhan [1]
Gupta, Ankush [2]
Zisserman, Andrew [1]
Affiliations
[1] Univ Oxford, Dept Engn Sci, Visual Geometry Grp, Oxford, England
[2] DeepMind, London, England
Source
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Video action recognition; Object centric representations; Transfer learning;
DOI
10.1007/978-3-031-26316-3_23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The objective of this work is to learn an object-centric video representation, with the aim of improving transferability to novel tasks, i.e., tasks different from the pre-training task of action classification. To this end, we introduce a new object-centric video recognition model based on a transformer architecture. The model learns a set of object-centric summary vectors for the video, and uses these vectors to fuse the visual and spatio-temporal trajectory 'modalities' of the video clip. We also introduce a novel trajectory contrast loss to further enhance objectness in these summary vectors. With experiments on four datasets (SomethingSomething-V2, Something-Else, Action Genome and EpicKitchens), we show that the object-centric model outperforms prior video representations (both object-agnostic and object-aware) when: (1) classifying actions on unseen objects and in unseen environments; (2) low-shot learning of novel classes; (3) linear probing on other downstream tasks; as well as (4) standard action classification.
Pages: 379-397
Page count: 19