Cross-view action recognition understanding from exocentric to egocentric perspective

Cited by: 0
Authors
Truong, Thanh-Dat [1 ]
Luu, Khoa [1 ]
Affiliations
[1] Univ Arkansas, Comp Vis & Image Understanding Lab, Fayetteville, AR 72701 USA
Funding
National Science Foundation (NSF);
Keywords
Cross-view action recognition; Self-attention; Egocentric action recognition; ATTENTION;
DOI
10.1016/j.neucom.2024.128731
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Understanding action recognition in egocentric videos has emerged as a vital research topic with numerous practical applications. Given the limited scale of egocentric data collection, learning robust action recognition models remains difficult. Transferring knowledge learned from large-scale exocentric data to egocentric data is challenging due to the differences in videos across views. This work introduces a novel cross-view learning approach to action recognition (CVAR) that effectively transfers knowledge from the exocentric to the egocentric view. First, we present a novel geometric-based constraint on the self-attention mechanism in Transformers, derived from analyzing the camera positions between the two views. Then, we propose a new cross-view self-attention loss learned on unpaired cross-view data to enforce the self-attention mechanism to transfer knowledge across views. Finally, to further improve the performance of our cross-view learning approach, we present metrics to effectively measure the correlations in videos and attention maps. Experimental results on standard egocentric action recognition benchmarks, i.e., Charades-Ego, EPIC-Kitchens-55, and EPIC-Kitchens-100, have shown our approach's effectiveness and state-of-the-art performance.
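The abstract does not specify the exact form of the cross-view self-attention loss. As a purely illustrative sketch (not the paper's actual formulation), one way to enforce agreement between exocentric and egocentric attention maps is a KL divergence between the two views' attention distributions; all names and the toy dimensions below are hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(q, k):
    """Scaled dot-product self-attention weights (softmax over keys)."""
    d = q.shape[-1]
    return softmax(q @ k.swapaxes(-2, -1) / np.sqrt(d))

def cross_view_attention_loss(attn_exo, attn_ego, eps=1e-8):
    """Illustrative cross-view self-attention loss: mean KL divergence
    from the egocentric to the exocentric attention distribution.
    (Hypothetical form; the paper learns its loss on unpaired data.)"""
    kl = np.sum(attn_exo * (np.log(attn_exo + eps) - np.log(attn_ego + eps)),
                axis=-1)
    return float(np.mean(kl))

# Toy example: 8 tokens with 16-dim queries/keys per view.
rng = np.random.default_rng(0)
q_exo, k_exo = rng.standard_normal((8, 16)), rng.standard_normal((8, 16))
q_ego, k_ego = rng.standard_normal((8, 16)), rng.standard_normal((8, 16))

loss = cross_view_attention_loss(attention_map(q_exo, k_exo),
                                 attention_map(q_ego, k_ego))
```

Minimizing such a term would push the egocentric branch's attention toward the patterns learned on exocentric data, which is the general idea of transferring knowledge through the attention mechanism.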
Pages: 11
Related Papers
50 items in total
  • [41] Evaluation of local spatial-temporal features for cross-view action recognition
    Gao, Zan
    Nie, Weizhi
    Liu, Anan
    Zhang, Hua
    NEUROCOMPUTING, 2016, 173 : 110 - 117
  • [42] A Geometric Approach for Cross-View Human Action Recognition using Deep Learning
    Papadakis, Antonios
    Mathe, Eirini
    Spyrou, Evaggelos
    Mylonas, Phivos
    PROCEEDINGS OF THE 2019 11TH INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS (ISPA 2019), 2019, : 258 - 263
  • [43] Topic-Based Knowledge Transfer Algorithm for Cross-View Action Recognition
    Chen, Changhong
    Yang, Shunqing
    Gan, Zongliang
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (03) : 614 - 617
  • [44] SSM-Based Joint Dictionary Learning for Cross-View Action Recognition
    Wang, Lei
    Liu, Zhigang
    Wang, Ruoshi
    Qi, Haoyang
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 1628 - 1632
  • [45] Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition
    Huang, Yi
    Yang, Xiaoshan
    Gao, Junyun
    Xu, Changsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 2273 - 2286
  • [46] Cross-view Activity Recognition using Hankelets
    Li, Binlong
    Camps, Octavia I.
    Sznaier, Mario
    2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 1362 - 1369
  • [47] Learning a Non-linear Knowledge Transfer Model for Cross-View Action Recognition
    Rahmani, Hossein
    Mian, Ajmal
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 2458 - 2466
  • [48] Class-Constrained Transfer LDA for Cross-View Action Recognition in Internet of Things
    Liu, Shuang
    IEEE INTERNET OF THINGS JOURNAL, 2018, 5 (05): : 3270 - 3277
  • [49] Cross-view Action Recognition via Dual-Codebook and Hierarchical Transfer Framework
    Zhang, Chengkun
    Zheng, Huicheng
    Lai, Jianhuang
    COMPUTER VISION - ACCV 2014, PT V, 2015, 9007 : 579 - 592