Spotting Temporally Precise, Fine-Grained Events in Video

Cited by: 10
Authors
Hong, James [1]
Zhang, Haotian [1]
Gharbi, Michael [2]
Fisher, Matthew [2]
Fatahalian, Kayvon [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Adobe Res, San Francisco, CA USA
Source
Funding
National Science Foundation (USA)
Keywords
Temporally precise spotting; Video understanding
DOI
10.1007/978-3-031-19833-5_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce the task of spotting temporally precise, fine-grained events in video (detecting the precise moment in time at which events occur). Precise spotting requires models to reason globally about the full time scale of actions and locally about the subtle frame-to-frame appearance and motion differences that identify events during these actions. Surprisingly, we find that top-performing solutions to prior video understanding tasks such as action detection and segmentation do not simultaneously meet both requirements. In response, we propose E2E-Spot, a compact, end-to-end model that performs well on the precise spotting task and can be trained quickly on a single GPU. We demonstrate that E2E-Spot significantly outperforms recent baselines adapted from the video action detection, segmentation, and spotting literature to the precise spotting task. Finally, we contribute new annotations and splits to several fine-grained sports action datasets to make these datasets suitable for future work on precise spotting.
Pages: 33-51
Number of pages: 19