Multi-View Action Recognition using Contrastive Learning

Cited by: 16
Authors
Shah, Ketul [1]
Shah, Anshul [1]
Lau, Chun Pong [1]
de Melo, Celso M. [2]
Chellappa, Rama [1]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] DEVCOM Army Res Lab, Adelphi, MD USA
DOI
10.1109/WACV56688.2023.00338
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we present a method for RGB-based action recognition using multi-view videos. We present a supervised contrastive learning framework to learn a feature embedding robust to changes in viewpoint, by effectively leveraging multi-view data. We use an improved supervised contrastive loss and augment the positives with those coming from synchronized viewpoints. We also propose a new approach to use classifier probabilities to guide the selection of hard negatives in the contrastive loss, to learn a more discriminative representation. Negative samples from confusing classes based on posterior are weighted higher. We also show that our method leads to better domain generalization compared to the standard supervised training based on synthetic multi-view data. Extensive experiments on real (NTU-60, NTU-120, NUMA) and synthetic (RoCoG) data demonstrate the effectiveness of our approach.
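The loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' formulation: the weighting rule `1 + beta * p_i(class of j)`, the function name `weighted_supcon_loss`, and the parameters `temperature` and `beta` are assumptions chosen only to show the idea of a supervised contrastive loss in which negatives from classes the classifier finds confusing count more. Synchronized views of the same clip are assumed to sit in the same batch with the same label, so they act as additional positives.

```python
import numpy as np

def weighted_supcon_loss(z, labels, probs, temperature=0.1, beta=1.0):
    """Illustrative supervised contrastive loss with posterior-weighted negatives.

    z:      (N, D) feature embeddings, one row per video clip/view
    labels: (N,)   integer action labels
    probs:  (N, C) classifier posteriors for each anchor (rows sum to 1)
    beta:   hypothetical knob; beta=0 recovers a plain supervised contrastive loss
    """
    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    same_class = labels[:, None] == labels[None, :]
    pos = same_class & not_self   # same action, including other viewpoints
    neg = ~same_class             # different action

    # Assumed weighting: negative j gets a larger weight for anchor i when the
    # classifier assigns anchor i a high probability of j's class.
    neg_weight = 1.0 + beta * probs[:, labels]          # shape (N, N)
    weights = np.where(pos, 1.0, np.where(neg, neg_weight, 0.0))

    # Log of the weighted denominator, computed stably.
    row_max = sim.max(axis=1, keepdims=True)
    denom = (weights * np.exp(sim - row_max)).sum(axis=1)
    log_prob = sim - (np.log(denom) + row_max[:, 0])[:, None]

    # Average log-probability over each anchor's positives.
    per_anchor = -(pos * log_prob).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    has_pos = pos.any(axis=1)     # skip anchors with no positive in the batch
    return float(per_anchor[has_pos].mean())
```

Under this assumed weighting, increasing `beta` enlarges the denominator for anchors whose negatives come from classes the classifier confuses with the anchor, so those pairs are pushed apart more strongly.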
Pages: 3370-3380
Page count: 11