Cross-view action recognition understanding from exocentric to egocentric perspective

Cited by: 0
Authors
Truong, Thanh-Dat [1 ]
Luu, Khoa [1 ]
Affiliations
[1] Univ Arkansas, Comp Vis & Image Understanding Lab, Fayetteville, AR 72701 USA
Funding
US National Science Foundation
Keywords
Cross-view action recognition; Self-attention; Egocentric action recognition; ATTENTION;
DOI
10.1016/j.neucom.2024.128731
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Understanding action recognition in egocentric videos has emerged as a vital research topic with numerous practical applications. Because egocentric data collection remains limited in scale, learning robust action recognition models on such data is difficult. Transferring knowledge learned from large-scale exocentric data to egocentric data is challenging due to the differences between videos across views. This work introduces a novel cross-view learning approach to action recognition (CVAR) that effectively transfers knowledge from the exocentric to the egocentric view. First, we present a novel geometric-based constraint on the self-attention mechanism in Transformers, derived from analyzing the camera positions of the two views. Then, we propose a new cross-view self-attention loss, learned on unpaired cross-view data, that enforces the attention mechanism to transfer knowledge across views. Finally, to further improve the performance of our cross-view learning approach, we present metrics that effectively measure the correlations in videos and attention maps. Experimental results on standard egocentric action recognition benchmarks, i.e., Charades-Ego, EPIC-Kitchens-55, and EPIC-Kitchens-100, show our approach's effectiveness and state-of-the-art performance.
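The abstract does not give the exact formulation of the cross-view self-attention loss, but the general idea of aligning attention distributions across views can be illustrated with a toy sketch. The minimal example below is an assumption-laden illustration, not the authors' method: it computes scaled dot-product self-attention maps for hypothetical exocentric and egocentric token sequences and measures their divergence with a symmetric KL term. All function names (`attention_map`, `cross_view_attention_loss`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(tokens, d_k):
    # scaled dot-product self-attention weights (queries = keys = tokens);
    # each row is a probability distribution over the tokens
    scores = tokens @ tokens.T / np.sqrt(d_k)
    return softmax(scores, axis=-1)

def cross_view_attention_loss(attn_a, attn_b):
    # symmetric KL divergence between row-wise attention distributions;
    # a stand-in for a cross-view alignment objective, NOT the paper's loss
    eps = 1e-8
    kl = lambda p, q: np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)
    return float(np.mean(kl(attn_a, attn_b) + kl(attn_b, attn_a)))

rng = np.random.default_rng(0)
exo = rng.normal(size=(8, 16))  # 8 tokens from a hypothetical exocentric clip
ego = rng.normal(size=(8, 16))  # 8 tokens from a hypothetical egocentric clip
A_exo = attention_map(exo, d_k=16)
A_ego = attention_map(ego, d_k=16)
loss = cross_view_attention_loss(A_exo, A_ego)
```

Minimizing such a term would pull the two views' attention patterns toward each other; the paper additionally reports a geometric constraint on the attention derived from the relative camera positions, which this sketch does not attempt to reproduce.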
Pages: 11
Related papers (50 in total)
  • [31] Cross-View Action Recognition Based on Hierarchical View-Shared Dictionary Learning
    Zhang, Chengkun
    Zheng, Huicheng
    Lai, Jianhuang
    IEEE ACCESS, 2018, 6 : 16855 - 16868
  • [32] Cross-View Action Recognition Using Contextual Maximum Margin Clustering
    Zhang, Zhong
    Wang, Chunheng
    Xiao, Baihua
    Zhou, Wen
    Liu, Shuang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2014, 24 (10) : 1663 - 1668
  • [33] Multiple Continuous Virtual Paths Based Cross-View Action Recognition
    Zhang, Zhong
    Liu, Shuang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2016, 30 (05)
  • [34] Cross-view human action recognition from depth maps using spectral graph sequences
    Kerola, Tommi
    Inoue, Nakamasa
    Shinoda, Koichi
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 154 : 108 - 126
  • [35] Dual-codebook learning and hierarchical transfer for cross-view action recognition
    Zhang, Chengkun
    Zheng, Huicheng
    Lai, Jianhuang
    JOURNAL OF ELECTRONIC IMAGING, 2018, 27 (04)
  • [36] EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
    Li, Yuan-Ming
    Huang, Wei-Jin
    Wang, An-Lan
    Zeng, Ling-An
    Meng, Jing-Ke
    Zheng, Wei-Shi
    COMPUTER VISION - ECCV 2024, PT XX, 2025, 15078 : 363 - 382
  • [37] Global-Local Cross-View Fisher Discrimination for View-invariant Action Recognition
    Gao, Lingling
    Ji, Yanli
    Yang, Yang
    Shen, Heng Tao
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5255 - 5264
  • [38] WEAKLY SUPERVISED CROSS-VIEW ACTION RECOGNITION VIA SEQUENTIAL MOTION ACCUMULATION
    Liu, Yi
    Qin, Lei
    Cheng, Zhongwei
    Zhang, Yanhao
    Zhang, Weigang
    Huang, Qingming
    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 2383 - 2387
  • [39] Bilayer model for cross-view human action recognition based on transfer learning
    Li, Yandi
    Xu, Xiping
    Xu, Jiahong
    Du, Enyu
    JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (03)
  • [40] Mining Discriminative 3D Poselet for Cross-view Action Recognition
    Wang, Jiang
    Nie, Xiaohan
    Xia, Yin
    Wu, Ying
    2014 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2014, : 634 - 639