Recognizing actions in images by fusing multiple body structure cues

Cited by: 13
Authors
Li, Yang [1]
Li, Kan [1]
Wang, Xinxin [1]
Affiliation
[1] Beijing Inst Technol, 5 South Zhongguancun St, Beijing 100081, Peoples R China
Funding
Beijing Natural Science Foundation;
Keywords
Image-based action recognition; Convolutional neural network; Body structure cues;
DOI
10.1016/j.patcog.2020.107341
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Although Convolutional Neural Networks (CNNs) have substantially improved many computer vision tasks, there remains room for improvement in image-based action recognition because of their limited ability to exploit body structure information. In this work, we propose a unified deep model that explicitly explores body structure information and fuses multiple body structure cues for robust action recognition in images. To fully explore the body structure information, we design the Body Structure Exploration sub-network. It generates two novel body structure cues, Structural Body Parts and Limb Angle Descriptor, which capture the structure of human bodies from the global and local perspectives, respectively. We then design the Action Classification sub-network, which fuses the predictions from multiple body structure cues to obtain precise results. Moreover, we integrate the two sub-networks into a unified model by sharing the bottom convolutional layers, which improves computational efficiency in both the training and testing stages. We comprehensively evaluate our network on two challenging image-based human action datasets, Pascal VOC 2012 Action and Stanford40. Our approach achieves 93.5% and 93.8% mAP, respectively, outperforming all recent approaches in this field. (C) 2020 Elsevier Ltd. All rights reserved.
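The abstract does not define the Limb Angle Descriptor or the fusion rule, so the following is only an illustrative sketch of the two ideas it names: describing local body structure by limb orientations, and fusing per-cue predictions. The joint names, the `LIMBS` pairs, and the functions `limb_angles` and `fuse_predictions` are hypothetical, and weighted averaging stands in for whatever fusion the paper actually uses.

```python
import math

# Hypothetical limb definitions (pairs of joint names). The paper's actual
# skeleton layout is not given in the abstract.
LIMBS = [
    ("shoulder", "elbow"),
    ("elbow", "wrist"),
    ("hip", "knee"),
    ("knee", "ankle"),
]


def limb_angles(keypoints):
    """Return each limb's orientation in radians relative to the image x-axis.

    `keypoints` maps a joint name to an (x, y) image coordinate.
    """
    angles = []
    for start, end in LIMBS:
        (x0, y0), (x1, y1) = keypoints[start], keypoints[end]
        angles.append(math.atan2(y1 - y0, x1 - x0))
    return angles


def fuse_predictions(cue_probs, weights=None):
    """Late fusion: weighted average of per-cue class-probability vectors.

    This is an assumed stand-in, not the paper's fusion scheme.
    """
    if weights is None:
        weights = [1.0] * len(cue_probs)
    total = sum(weights)
    n_classes = len(cue_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, cue_probs)) / total
        for c in range(n_classes)
    ]


# Example usage with made-up values.
pose = {
    "shoulder": (0.0, 0.0), "elbow": (1.0, 0.0), "wrist": (1.0, 1.0),
    "hip": (0.0, 2.0), "knee": (0.0, 3.0), "ankle": (1.0, 4.0),
}
descriptor = limb_angles(pose)
fused = fuse_predictions([[0.7, 0.3], [0.5, 0.5]])
```

In this sketch the descriptor is a plain list of angles. A learned model would typically encode or normalize such angles before classification, which the abstract leaves unspecified.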
Pages: 12