What Can Simple Arithmetic Operations Do for Temporal Modeling?

Cited by: 1
Authors
Wu, Wenhao [1, 2]
Song, Yuxin [2]
Sun, Zhun [2]
Wang, Jingdong [2]
Xu, Chang [1]
Ouyang, Wanli [1, 3]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Baidu Inc, Beijing, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
DOI
10.1109/ICCV51070.2023.01261
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Temporal modeling plays a crucial role in understanding video content. To address it, previous studies have built complicated temporal relations across the time sequence, enabled by increasingly powerful computing hardware. In this work, we explore the potential of four simple arithmetic operations for temporal modeling. Specifically, we first capture auxiliary temporal cues by computing addition, subtraction, multiplication, and division between pairs of extracted frame features. Then, we extract corresponding features from these cues to benefit the original temporal-irrespective domain. We term this simple pipeline the Arithmetic Temporal Module (ATM), which operates on the stem of a visual backbone in a plug-and-play style. We conduct comprehensive ablation studies on the instantiation of ATMs and demonstrate that this module provides powerful temporal modeling capability at a low computational cost. Moreover, the ATM is compatible with both CNN- and ViT-based architectures. Our results show that ATM achieves superior performance on several popular video benchmarks. Specifically, on Something-Something V1, Something-Something V2, and Kinetics-400, we reach top-1 accuracies of 65.6%, 74.6%, and 89.4%, respectively. The code is available at https://github.com/whwu95/ATM.
Pages: 13666-13676
Page count: 11
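
The first step described in the abstract, computing addition, subtraction, multiplication, and division between pairs of frame features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked in the abstract for that). The function name `arithmetic_temporal_cues`, the choice of pairing each frame with its immediate successor, the `(T, C)` feature layout, and the `eps` guard against division by zero are all assumptions made for illustration. The abstract does not specify how pairs are formed or how the resulting cues are turned into features, so that second step is omitted here.

```python
import numpy as np


def arithmetic_temporal_cues(features, eps=1e-6):
    """Compute four element-wise arithmetic cues between adjacent frames.

    features: array of shape (T, C), one C-dimensional feature per frame.
    Returns a dict of four arrays, each of shape (T - 1, C).
    Pairing adjacent frames is an illustrative assumption.
    """
    features = np.asarray(features, dtype=np.float64)
    if features.ndim != 2 or features.shape[0] < 2:
        raise ValueError("expected a (T, C) array with T >= 2")

    prev, nxt = features[:-1], features[1:]

    # Hypothetical safeguard: replace near-zero denominators with eps
    # so that the division cue stays finite.
    safe_prev = np.where(np.abs(prev) < eps, eps, prev)

    return {
        "add": nxt + prev,
        "sub": nxt - prev,
        "mul": nxt * prev,
        "div": nxt / safe_prev,
    }


# Usage: 8 frames with 16-dimensional features give 7 adjacent pairs.
rng = np.random.default_rng(0)
cues = arithmetic_temporal_cues(rng.normal(size=(8, 16)))
print({name: cue.shape for name, cue in cues.items()})
```

Each cue has one row per frame pair. In this sketch the subtraction cue behaves like a frame difference (a rough motion signal), while the multiplication cue is large where both frames respond strongly on the same channel.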