Cross-task weakly supervised learning from instructional videos

Citations: 78
Authors
Zhukov, Dimitri [1, 2]
Alayrac, Jean-Baptiste [1, 3]
Cinbis, Ramazan Gokberk [4]
Fouhey, David [5]
Laptev, Ivan [1, 2]
Sivic, Josef [1, 2, 6]
Affiliations
[1] Inria, Rocquencourt, France
[2] PSL Res Univ, Ecole Normale Super, Dept Informat, Paris, France
[3] DeepMind, London, England
[4] Middle East Tech Univ, Ankara, Turkey
[5] Univ Michigan, Ann Arbor, MI 48109 USA
[6] Czech Tech Univ, CIIRC Czech Inst Informat Robot & Cybernet, Prague, Czech Republic
Funding
European Research Council
Keywords
DOI
10.1109/CVPR.2019.00365
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: "pour egg" should be trained jointly with other tasks involving "pour" and "egg". We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Past data does not permit systematic studying of sharing and so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level and that our component model can parse previously unseen tasks by virtue of its compositionality.
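The component-sharing idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a step is scored by summing linear scores of its component words; the function names, toy weights, and feature vector below are hypothetical and are not taken from the paper or its released code.

```python
from typing import Dict, List, Sequence


def step_components(step: str) -> List[str]:
    """Split a step name such as 'pour egg' into its component words."""
    return step.lower().split()


def step_score(
    step: str,
    component_weights: Dict[str, Sequence[float]],
    features: Sequence[float],
) -> float:
    """Score one frame for a step by summing the linear scores of its components.

    Assumption: each component has its own weight vector, and a step's score
    is the sum of its components' dot products with the frame features.
    """
    total = 0.0
    for comp in step_components(step):
        weights = component_weights[comp]
        total += sum(w * x for w, x in zip(weights, features))
    return total


# Toy, hand-picked weights (hypothetical): "pour" is shared by both steps.
component_weights = {
    "pour": [1.0, 0.0, 0.0],
    "egg": [0.0, 1.0, 0.0],
    "milk": [0.0, 0.0, 1.0],
}
frame = [0.5, 2.0, -1.0]  # hypothetical frame features

print(step_score("pour egg", component_weights, frame))   # 2.5
print(step_score("pour milk", component_weights, frame))  # -0.5
```

Because "pour" is shared, anything learned for it from "pour egg" also applies to "pour milk", and a step never seen as a whole can still be scored as long as each of its components has been seen.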
Pages: 3532-3540
Page count: 9