Spotting Temporally Precise, Fine-Grained Events in Video

Cited by: 10
Authors
Hong, James [1]
Zhang, Haotian [1]
Gharbi, Michael [2]
Fisher, Matthew [2]
Fatahalian, Kayvon [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Adobe Res, San Francisco, CA USA
Source
Funding
National Science Foundation (USA)
Keywords
Temporally precise spotting; Video understanding
DOI
10.1007/978-3-031-19833-5_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce the task of spotting temporally precise, fine-grained events in video (detecting the precise moment in time events occur). Precise spotting requires models to reason globally about the full-time scale of actions and locally to identify subtle frame-to-frame appearance and motion differences that identify events during these actions. Surprisingly, we find that top performing solutions to prior video understanding tasks such as action detection and segmentation do not simultaneously meet both requirements. In response, we propose E2E-Spot, a compact, end-to-end model that performs well on the precise spotting task and can be trained quickly on a single GPU. We demonstrate that E2E-Spot significantly outperforms recent baselines adapted from the video action detection, segmentation, and spotting literature to the precise spotting task. Finally, we contribute new annotations and splits to several fine-grained sports action datasets to make these datasets suitable for future work on precise spotting.
Pages: 33-51
Number of pages: 19