Learning discriminative motion feature for enhancing multi-modal action recognition

Cited: 0
Authors
Yang, Jianyu [1 ]
Huang, Yao [1 ]
Shao, Zhanpeng [2 ]
Liu, Chunping [3 ]
Affiliations
[1] School of Rail Transportation, Soochow University, Suzhou 215000, China
[2] School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
[3] School of Computer Science and Technology, Soochow University, Suzhou 215000, China
Keywords: (none listed)
DOI: Not available
Abstract
Video action recognition is an important topic in computer vision. Most existing methods use CNN-based models and capture multiple modalities of image features from videos, such as static frames, dynamic images, and optical flow. However, these mainstream features contain substantial static information, including object and background cues, so the motion information of the action itself is neither distinguished nor strengthened. In this work, a new kind of motion feature, free of static information, is proposed for video action recognition. We propose a quantization of motion network based on the bag-of-features method to learn significant and discriminative motion features. In the learned feature map, object and background information is filtered out, even when the background itself is moving in the video. The motion feature is therefore complementary to the static image feature and to the static information contained in the dynamic image and optical flow. A multi-stream classifier is built from the proposed motion feature together with the other features, and its action recognition performance is improved compared with other state-of-the-art methods. © 2021 Elsevier Inc.
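The two ingredients the abstract names, bag-of-features quantization of motion descriptors and multi-stream (late-fusion) classification, can be sketched as follows. This is a minimal illustrative sketch only: the paper's method is a learned end-to-end network, whereas here a plain k-means codebook stands in for the learned quantization, and all function names and toy data are hypothetical.

```python
# Illustrative sketch of bag-of-features motion quantization and
# late fusion of multiple feature streams. A k-means codebook is a
# stand-in for the paper's learned quantization network; all names
# and data here are hypothetical.
import numpy as np

def build_codebook(descriptors, k=8, iters=10, seed=0):
    """Cluster motion descriptors into k codewords with plain k-means."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest codeword.
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        assign = dists.argmin(axis=1)
        # Move each codeword to the mean of its assigned descriptors.
        for j in range(k):
            pts = descriptors[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def quantize(descriptors, centers):
    """Histogram of codeword assignments -> fixed-length motion feature."""
    dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    hist = np.bincount(dists.argmin(axis=1),
                       minlength=len(centers)).astype(float)
    return hist / hist.sum()

def fuse_streams(stream_scores):
    """Late fusion: average per-stream class scores, return the argmax."""
    return int(np.mean(stream_scores, axis=0).argmax())

# Toy motion descriptors (e.g. per-pixel flow vectors from one video).
rng = np.random.default_rng(1)
motion = rng.normal(size=(200, 2))
codebook = build_codebook(motion, k=4)
feat = quantize(motion, codebook)       # fixed-length motion feature
print(feat.shape)                        # (4,)
# Fuse class scores from a motion stream and a static-frame stream.
print(fuse_streams([[0.2, 0.8], [0.6, 0.4]]))  # 1
```

The histogram feature depends only on how the motion descriptors distribute over the codebook, which is the sense in which such a representation discards static appearance; the fusion step mirrors the multi-stream classifier that combines it with static, dynamic-image, and optical-flow streams.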
Related Papers (50 total)
  • [21] Multi-modal deep learning for landform recognition
    Du, Lin
    You, Xiong
    Li, Ke
    Meng, Liqiu
    Cheng, Gong
    Xiong, Liyang
    Wang, Guangxia
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2019, 158 : 63 - 75
  • [22] Hybrid Multi-modal Fusion for Human Action Recognition
    Seddik, Bassem
    Gazzah, Sami
    Ben Amara, Najoua Essoukri
    IMAGE ANALYSIS AND RECOGNITION, ICIAR 2017, 2017, 10317 : 201 - 209
  • [23] Multi-modal Transformer for Indoor Human Action Recognition
    Do, Jeonghyeok
    Kim, Munchurl
    2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 1155 - 1160
  • [24] Multi-modal and multi-layout discriminative learning for placental maturity staging
    Lei, Baiying
    Li, Wanjun
    Yao, Yuan
    Jiang, Xudong
    Tan, Ee-Leng
    Qin, Jing
    Chen, Siping
    Ni, Dong
    Wang, Tianfu
    PATTERN RECOGNITION, 2017, 63 : 719 - 730
  • [25] Common and Discriminative Semantic Pursuit for Multi-Modal Multi-Label Learning
    Zhang, Yi
    Shen, Jundong
    Zhang, Zhecheng
    Wang, Chongjun
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 1666 - 1673
  • [26] A multi-modal approach for high-dimensional feature recognition
    Kushan Ahmadian
    Marina Gavrilova
    The Visual Computer, 2013, 29 : 123 - 130
  • [27] A multi-modal approach for high-dimensional feature recognition
    Ahmadian, Kushan
    Gavrilova, Marina
    VISUAL COMPUTER, 2013, 29 (02): : 123 - 130
  • [28] Multi-modal nonlinear feature reduction for the recognition of handwritten numerals
    Zhang, P
    Suen, CY
    Bui, TD
    1ST CANADIAN CONFERENCE ON COMPUTER AND ROBOT VISION, PROCEEDINGS, 2004, : 393 - 400
  • [29] Sports action recognition algorithm based on multi-modal data recognition
    Zhang, Lin
    INTELLIGENT DECISION TECHNOLOGIES, 2024, 18 (04) : 3243 - 3257
  • [30] MMVSL: A multi-modal visual semantic learning method for pig pose and action recognition
    Guan, Zhibin
    Chai, Xiujuan
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2025, 229