Multitask Learning to Improve Egocentric Action Recognition

Cited by: 16
Authors
Kapidis, Georgios [1, 2]
Poppe, Ronald [2 ]
van Dam, Elsbeth [1 ]
Noldus, Lucas [1 ]
Veltkamp, Remco [2 ]
Affiliations
[1] Noldus Informat Technol, Wageningen, Netherlands
[2] Univ Utrecht, Dept Informat & Comp Sci, Utrecht, Netherlands
DOI
10.1109/ICCVW.2019.00540
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work we employ multitask learning to capitalize on the structure that exists in related supervised tasks when training complex neural networks. Multitask learning trains a network for multiple objectives in parallel, in order to improve performance on at least one of them through a shared representation that is developed to accommodate more information than it otherwise would for a single task. We apply this idea to action recognition in egocentric videos by introducing additional supervised tasks. We learn the verbs and nouns of which action labels consist, and predict coordinates that capture the hand locations and the gaze-based visual saliency for all frames of the input video segments. This forces the network to focus explicitly on cues from secondary tasks that it might otherwise have missed, resulting in improved inference. Our experiments on EPIC-Kitchens and EGTEA Gaze+ show consistent improvements over the single-task baseline when training with multiple tasks. Furthermore, on EGTEA Gaze+ we outperform the state of the art in action recognition by 3.84%. Apart from actions, our method produces accurate hand and gaze estimates as side tasks, without requiring any input at test time other than the RGB video clips.
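As a rough illustration of the training objective the abstract describes, the sketch below sums two classification losses (verb and noun) and two coordinate-regression losses (hands and gaze). The abstract does not give the loss form, so everything here is an assumption made for illustration: the function names, the equal task weights, the use of mean squared error for coordinates, and the single (x, y) point per frame for each coordinate task.

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample; `logits` is a list of floats."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def coord_loss(pred, truth):
    """Mean squared error over per-frame (x, y) coordinates (assumed loss)."""
    sq = [(px - tx) ** 2 + (py - ty) ** 2
          for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(sq) / len(sq)

def multitask_loss(outputs, labels, weights=None):
    """Weighted sum of per-task losses; equal weights are an assumption."""
    weights = weights or {"verb": 1.0, "noun": 1.0, "hands": 1.0, "gaze": 1.0}
    terms = {
        "verb": cross_entropy(outputs["verb"], labels["verb"]),
        "noun": cross_entropy(outputs["noun"], labels["noun"]),
        "hands": coord_loss(outputs["hands"], labels["hands"]),
        "gaze": coord_loss(outputs["gaze"], labels["gaze"]),
    }
    total = sum(weights[name] * value for name, value in terms.items())
    return total, terms

# Toy usage: a two-frame clip with 3 verb classes and 4 noun classes.
outputs = {
    "verb": [2.0, 0.1, -1.0],
    "noun": [0.0, 0.5, 1.5, -0.5],
    "hands": [(0.40, 0.60), (0.45, 0.62)],
    "gaze": [(0.50, 0.50), (0.52, 0.48)],
}
labels = {
    "verb": 0,
    "noun": 2,
    "hands": [(0.42, 0.58), (0.44, 0.60)],
    "gaze": [(0.50, 0.52), (0.50, 0.50)],
}
total, terms = multitask_loss(outputs, labels)
print(total, terms)
```

In such a setup only the shared network would need the RGB clip at test time; the hand and gaze annotations enter solely through the training loss, which matches the abstract's statement that no additional input is required at inference.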
Pages: 4396-4405
Number of pages: 10