What Do I Annotate Next? An Empirical Study of Active Learning for Action Localization

Cited by: 19
Authors
Heilbron, Fabian Caba [1 ]
Lee, Joon-Young [2 ]
Jin, Hailin [2 ]
Ghanem, Bernard [1 ]
Affiliations
[1] King Abdullah Univ Sci & Technol KAUST, Thuwal, Saudi Arabia
[2] Adobe Res, San Jose, CA USA
Source
Keywords
Video understanding; Temporal action localization; Active learning; Video annotation;
DOI
10.1007/978-3-030-01252-6_13
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite tremendous progress achieved in temporal action localization, state-of-the-art methods still struggle to train accurate models when annotated data is scarce. In this paper, we introduce a novel active learning framework for temporal localization that aims to mitigate this data dependency issue. We equip our framework with active selection functions that can reuse knowledge from previously annotated datasets. We study the performance of two state-of-the-art active selection functions as well as two widely used active learning baselines. To validate the effectiveness of each one of these selection functions, we conduct simulated experiments on ActivityNet. We find that using previously acquired knowledge as a bootstrapping source is crucial for active learners aiming to localize actions. When equipped with the right selection function, our proposed framework exhibits significantly better performance than standard active learning strategies, such as uncertainty sampling. Finally, we employ our framework to augment the newly compiled Kinetics action dataset with ground-truth temporal annotations. As a result, we collect Kinetics-Localization, a novel large-scale dataset for temporal action localization, which contains more than 15K YouTube videos.
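To make the setting concrete, the following is a minimal, generic sketch of a pool-based active-learning loop for video annotation. It implements only the plain uncertainty-sampling baseline mentioned in the abstract, not the paper's transfer-based selection functions, and every name and parameter here (train_model, predict_scores, request_annotation, rounds, budget) is an illustrative assumption rather than part of the authors' method or released code.

```python
import numpy as np


def uncertainty_sampling(scores, budget):
    """Return the indices of the `budget` videos whose predicted action
    scores are closest to 0.5, i.e. where the current model is least certain."""
    uncertainty = 1.0 - np.abs(np.asarray(scores, dtype=float) - 0.5)
    return np.argsort(-uncertainty)[:budget]


def active_learning_loop(unlabeled_pool, train_model, predict_scores,
                         request_annotation, rounds=5, budget=100):
    """Alternate between training a localization model on the labeled set and
    asking an oracle (a human annotator) to label the selected videos."""
    labeled = []                      # (video, temporal annotation) pairs collected so far
    pool = list(unlabeled_pool)       # videos still awaiting annotation
    for _ in range(rounds):
        model = train_model(labeled)                  # retrain on the current labels
        scores = predict_scores(model, pool)          # per-video action confidence
        picked = uncertainty_sampling(scores, budget)
        for i in sorted(picked.tolist(), reverse=True):   # pop high indices first
            video = pool.pop(i)
            labeled.append((video, request_annotation(video)))
    return train_model(labeled)       # final model trained on all collected labels
```

In the paper's setting, the reported gains come from replacing the uncertainty_sampling step above with selection functions that reuse knowledge from previously annotated datasets; the loop structure itself stays the same.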
Pages: 212-229
Number of pages: 18