Video-level Multi-model Fusion for Action Recognition

Cited by: 3
Authors
Wang, Xiaomin [1]
Zhang, Junsan [1]
Wang, Leiquan [1]
Yu, Philip S. [2]
Zhu, Jie [3]
Li, Haisheng [4]
Affiliations
[1] China Univ Petr EastChina, Coll Comp Sci & Technol, Qingdao, Shandong, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA
[3] Natl Police Univ Criminal Justice, Dept Informat Management, Hangzhou, Peoples R China
[4] Beijing Technol & Business Univ, Beijing Key Lab Big Data Technol Food Safety, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
action recognition; video-level recognition; 3D convolution; multi-model fusion
DOI
10.1145/3357384.3357935
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Approaches based on spatio-temporal features for video action recognition have emerged, such as two-stream methods and 3D convolution methods. However, current methods suffer from partial observation or are restricted to modeling a single type of information. Segment-level recognition results obtained from dense sampling cannot represent the entire video and therefore lead to partial observation. Moreover, a single model can hardly capture the complementary spatial, temporal, and spatio-temporal information in a video at the same time. The challenge, therefore, is to build a video-level representation and capture multiple types of information. In this paper, a video-level multi-model fusion method for action recognition is proposed to solve these problems. First, an efficient video-level 3D convolution model is proposed that assembles segment-level 3D convolution models to obtain the global information in the video. Second, a multi-model fusion architecture is proposed for video action recognition to capture multiple types of information: the spatial, temporal, and spatio-temporal information are aggregated with an SVM classifier. Experimental results show that this method achieves state-of-the-art performance on the UCF-101 dataset (97.6%) without pre-training on Kinetics.
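The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how segment-level outputs are assembled, so plain averaging is assumed here, and the function names, stream names, and toy scores are all hypothetical. The fused vector is the kind of input an SVM classifier would then be trained on; the SVM itself is omitted.

```python
from statistics import fmean
from typing import Dict, List, Sequence


def video_level_scores(segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-segment class scores into one video-level score vector.

    Averaging is an assumption; the abstract only says that segment-level
    models are assembled into a video-level model.
    """
    if not segment_scores:
        raise ValueError("need at least one segment")
    num_classes = len(segment_scores[0])
    if any(len(seg) != num_classes for seg in segment_scores):
        raise ValueError("all segments must score the same number of classes")
    return [fmean(seg[c] for seg in segment_scores) for c in range(num_classes)]


def fuse_streams(
    streams: Dict[str, Sequence[Sequence[float]]],
    order: Sequence[str] = ("spatial", "temporal", "spatio_temporal"),
) -> List[float]:
    """Concatenate the video-level scores of each stream in a fixed order.

    The result is a single feature vector per video, suitable as input to a
    downstream classifier such as an SVM (not included in this sketch).
    """
    fused: List[float] = []
    for name in order:
        fused.extend(video_level_scores(streams[name]))
    return fused


# Toy example: two classes, hypothetical per-segment scores for one video.
example = {
    "spatial": [[0.5, 0.5], [1.0, 0.0]],
    "temporal": [[0.25, 0.75], [0.75, 0.25]],
    "spatio_temporal": [[1.0, 0.0]],
}
fused = fuse_streams(example)
print(fused)
```

Concatenating the per-stream scores, rather than averaging across streams, leaves the weighting of the spatial, temporal, and spatio-temporal evidence to the downstream classifier.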
Pages: 159-168
Number of pages: 10
Related papers
50 in total
  • [31] Leukocyte subtype classification with multi-model fusion
    Ding, Yingying
    Tang, Xuehui
    Zhuang, Yuan
    Mu, Junjie
    Chen, Shuchao
    Liu, Shanshan
    Feng, Sihao
    Chen, Hongbo
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (09) : 2305 - 2316
  • [32] Multi-model data fusion for hydrological forecasting
    See, L
    Abrahart, RJ
    COMPUTERS & GEOSCIENCES, 2001, 27 (08) : 987 - 994
  • [33] Multi-model Lightweight Action Recognition with Group-Shuffle Graph Convolutional Network
    Zhu, Suguo
    Zhan, Yibing
    Zhao, Guo
    ARTIFICIAL INTELLIGENCE, CICAI 2022, PT I, 2022, 13604 : 609 - 621
  • [34] 3D CONVOLUTIONAL NEURAL NETWORK WITH MULTI-MODEL FRAMEWORK FOR ACTION RECOGNITION
    Jing, Longlong
    Ye, Yuancheng
    Yang, Xiaodong
    Tian, Yingli
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1837 - 1841
  • [36] Fast Fusion Moves for Multi-model Estimation
    Delong, Andrew
    Veksler, Olga
    Boykov, Yuri
    COMPUTER VISION - ECCV 2012, PT I, 2012, 7572 : 370 - 384
  • [37] Multi-model fusion and error parameter estimation
    Logutov, O. G.
    Robinson, A. R.
    QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, 2005, 131 (613) : 3397 - 3408
  • [38] Multi-Level Deep Learning Depth and Color Fusion for Action Recognition
    Zelensky, A.
    Voronin, V.
    Zhdanova, M.
    Gapon, N.
    Tokareva, O.
    Semenishchev, E.
    OPTICS, PHOTONICS AND DIGITAL TECHNOLOGIES FOR IMAGING APPLICATIONS VII, 2022, 12138
  • [39] Spatio-temporal Multi-level Fusion for Human Action Recognition
    Manh-Hung Lu
    Thi-Oanh Nguyen
    SOICT 2019: PROCEEDINGS OF THE TENTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY, 2019, : 298 - 305
  • [40] Few-Shot Action Recognition in Video Based on Multi-Feature Fusion
    Pu Z.-X.
    Ge Y.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (03) : 594 - 608