Latency Matters: Real-Time Action Forecasting Transformer

Cited by: 5
Authors
Girase, Harshayu [1, 2]
Agarwal, Nakul [1]
Choi, Chiho [1]
Mangalam, Karttikeya [2]
Affiliations
[1] Honda Res Inst USA, San Jose, CA USA
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.01799
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present RAFTformer, a real-time action forecasting transformer for latency-aware real-world action forecasting. RAFTformer is a two-stage, fully transformer-based architecture comprising a video transformer backbone that operates on high-resolution, short-range clips and a head transformer encoder that temporally aggregates information from multiple short-range clips to span a long-term horizon. Additionally, we propose a novel self-supervised shuffled causal masking scheme as a model-level augmentation to improve forecasting fidelity. Finally, we propose a novel real-time evaluation setting for action forecasting that directly couples model inference latency to overall forecasting performance and exposes a hitherto overlooked trade-off between latency and action forecasting performance. Our parsimonious network design gives RAFTformer an inference latency 9x lower than prior works at the same forecasting accuracy. Owing to its two-stage design, RAFTformer uses 94% less training compute and 90% fewer training parameters to outperform prior state-of-the-art baselines by 4.9 points on EGTEA Gaze+ and by 1.4 points on the EPIC-Kitchens-100 validation set, as measured by Top-5 recall (T5R) in the offline setting. In the real-time setting, RAFTformer outperforms prior works by an even greater margin of up to 4.4 T5R points on the EPIC-Kitchens-100 dataset. Project webpage: https://karttikeya.github.io/publication/RAFTformer/.
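One way to read the real-time evaluation setting described in the abstract is that a model's inference latency eats into the video it is allowed to observe. The sketch below illustrates that reading only; it is not taken from the paper. The names `ForecastQuery`, `last_observable_time`, and `effective_horizon` are hypothetical, and the rule that the prediction must be ready by the same deadline as in the offline setting is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastQuery:
    """A single forecasting query (hypothetical structure, for illustration)."""
    action_start_s: float  # timestamp at which the future action begins
    horizon_s: float       # required gap between the prediction and the action


def last_observable_time(query: ForecastQuery, latency_s: float = 0.0) -> float:
    """Latest video timestamp the model may read.

    Assumption: the prediction must be available at action_start - horizon.
    With latency_s = 0 (offline setting) the model sees frames up to that
    deadline; with latency_s > 0 (real-time setting) observation must stop
    latency_s earlier so that inference finishes on time.
    """
    if latency_s < 0:
        raise ValueError("latency must be non-negative")
    deadline = query.action_start_s - query.horizon_s
    return deadline - latency_s


def effective_horizon(query: ForecastQuery, latency_s: float) -> float:
    """Gap between the last observed frame and the action start."""
    return query.action_start_s - last_observable_time(query, latency_s)
```

Under these assumptions, a slower model forecasts from older frames over a longer effective horizon, which is one way the latency versus accuracy trade-off mentioned in the abstract could arise.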
Pages: 18759-18769
Page count: 11