Is an Object-Centric Video Representation Beneficial for Transfer?

Citations: 0

Authors:

Zhang, Chuhan [1]

Gupta, Ankush [2]

Zisserman, Andrew [1]

Affiliations:

[1] Univ Oxford, Dept Engn Sci, Visual Geometry Grp, Oxford, England

[2] DeepMind, London, England

Source:

COMPUTER VISION - ACCV 2022, PT IV | 2023 / Vol. 13844

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Video action recognition; Object centric representations; Transfer learning;

DOI:

10.1007/978-3-031-26316-3_23

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The objective of this work is to learn an object-centric video representation, with the aim of improving transferability to novel tasks, i.e., tasks different from the pre-training task of action classification. To this end, we introduce a new object-centric video recognition model based on a transformer architecture. The model learns a set of object-centric summary vectors for the video, and uses these vectors to fuse the visual and spatio-temporal trajectory 'modalities' of the video clip. We also introduce a novel trajectory contrast loss to further enhance objectness in these summary vectors. With experiments on four datasets (SomethingSomething-V2, Something-Else, Action Genome and EpicKitchens), we show that the object-centric model outperforms prior video representations (both object-agnostic and object-aware) when: (1) classifying actions on unseen objects and unseen environments; (2) low-shot learning of novel classes; (3) linear probe to other downstream tasks; as well as (4) for standard action classification.


Pages: 379-397

Page count: 19
