Learning discriminative motion feature for enhancing multi-modal action recognition

Cited by: 0
Authors
Yang, Jianyu [1 ]
Huang, Yao [1 ]
Shao, Zhanpeng [2 ]
Liu, Chunping [3 ]
Affiliations
[1] School of Rail Transportation, Soochow University, Suzhou 215000, China
[2] School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
[3] School of Computer Science and Technology, Soochow University, Suzhou 215000, China
Abstract
Video action recognition is an important topic in computer vision. Most existing methods use CNN-based models and capture multiple modalities of image features from videos, such as static frames, dynamic images, and optical flow. However, these mainstream features contain a great deal of static information, including object and background information, in which the motion information of the action itself is neither distinguished nor strengthened. In this work, a new kind of motion feature, free of static information, is proposed for video action recognition. We propose a quantization-of-motion network based on the bag-of-features method to learn significant and discriminative motion features. In the learned feature map, object and background information is filtered out, even when the background is moving in the video. The motion feature is therefore complementary to the static image feature and to the static information carried by the dynamic image and optical flow. A multi-stream classifier is built with the proposed motion feature and other features, and its action recognition performance is enhanced compared with other state-of-the-art methods. © 2021 Elsevier Inc.
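The abstract describes quantizing motion features with a bag-of-features encoding before multi-stream fusion. The snippet below is a minimal illustrative sketch of that general idea, not the authors' network: per-pixel motion descriptors are softly assigned to a small learned codebook and pooled into a normalized histogram. The function name, codebook size, and softmax-style soft assignment are all assumptions for illustration.

```python
import numpy as np

def quantize_motion(descriptors, codebook, temperature=1.0):
    """Soft-assign D-dim motion descriptors to K codewords (hypothetical sketch).

    descriptors: (N, D) array of per-pixel motion features.
    codebook:    (K, D) array of learned codewords.
    Returns a (K,) normalized histogram -- the bag-of-features encoding.
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K)
    # Soft assignment: nearer codewords get higher weight (softmax of -distance).
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)            # each row sums to 1
    hist = w.sum(axis=0)                         # pool assignments over all pixels
    return hist / hist.sum()                     # normalized (K,) histogram

rng = np.random.default_rng(0)
desc = rng.normal(size=(500, 8))      # stand-in per-pixel motion descriptors
codebook = rng.normal(size=(16, 8))   # stand-in learned codewords
h = quantize_motion(desc, codebook)
print(h.shape)                        # (16,)
```

In a multi-stream setup, such a histogram would be one branch's feature vector, concatenated or late-fused with static-frame, dynamic-image, and optical-flow branches before classification.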
Published in: JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 79