Cross-view action recognition understanding from exocentric to egocentric perspective

Cited by: 0
Authors
Truong, Thanh-Dat [1 ]
Luu, Khoa [1 ]
Institution
[1] Univ Arkansas, Comp Vis & Image Understanding Lab, Fayetteville, AR 72701 USA
Funding
U.S. National Science Foundation;
Keywords
Cross-view action recognition; Self-attention; Egocentric action recognition; ATTENTION;
DOI
10.1016/j.neucom.2024.128731
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Understanding action recognition in egocentric videos has emerged as a vital research topic with numerous practical applications. Because of the limited scale of egocentric data collection, learning robust learning-based action recognition models remains difficult. Transferring knowledge learned from large-scale exocentric data to egocentric data is challenging due to the differences between videos across views. This work introduces a novel cross-view learning approach to action recognition (CVAR) that effectively transfers knowledge from the exocentric to the egocentric view. First, we present a novel geometric-based constraint on the self-attention mechanism in Transformers, derived from analyzing the camera positions between the two views. Then, we propose a new cross-view self-attention loss, learned on unpaired cross-view data, that enforces the attention mechanism to transfer knowledge across views. Finally, to further improve the performance of our cross-view learning approach, we present metrics that effectively measure the correlations in videos and attention maps. Experimental results on standard egocentric action recognition benchmarks, i.e., Charades-Ego, EPIC-Kitchens-55, and EPIC-Kitchens-100, have shown our approach's effectiveness and state-of-the-art performance.
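The abstract describes a cross-view self-attention loss computed on unpaired data, but gives no formulation. As a rough illustration only, the sketch below penalizes a correlation discrepancy between exocentric and egocentric attention maps; all function names and the cosine-based loss form are assumptions for exposition, not the paper's actual method.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(q, k):
    # scaled dot-product attention weights (values omitted);
    # q, k: (num_tokens, dim) arrays for one view's features
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))

def cross_view_attention_loss(attn_exo, attn_ego):
    # Toy correlation-based discrepancy: 1 minus the mean cosine
    # similarity between corresponding rows of the two views'
    # attention maps. Zero when the maps' attention patterns agree.
    a = attn_exo / np.linalg.norm(attn_exo, axis=-1, keepdims=True)
    b = attn_ego / np.linalg.norm(attn_ego, axis=-1, keepdims=True)
    return float(np.mean(1.0 - np.sum(a * b, axis=-1)))
```

In this toy form the two maps must share a token grid; handling genuinely unpaired, differently sized videos is part of what the paper's actual loss and correlation metrics address.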
Pages: 11