Recognizing actions in images by fusing multiple body structure cues

Cited by: 13

Authors:

Li, Yang [1]

Li, Kan [1]

Wang, Xinxin [1]

Affiliations:

[1] Beijing Inst Technol, 5 South Zhongguancun St, Beijing 100081, Peoples R China

Source:

PATTERN RECOGNITION | 2020 / Vol. 104

Funding:

Beijing Natural Science Foundation;

Keywords:

Image-based action recognition; Convolutional neural network; Body structure cues;

DOI:

10.1016/j.patcog.2020.107341

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although Convolutional Neural Networks (CNNs) have brought substantial improvements to many computer vision tasks, there remains room for improvement in image-based action recognition because of their limited capability to exploit body structure information. In this work, we propose a unified deep model that explicitly explores body structure information and fuses multiple body structure cues for robust action recognition in images. To fully explore the body structure information, we design the Body Structure Exploration sub-network. It generates two novel body structure cues, Structural Body Parts and Limb Angle Descriptor, which capture the structure information of human bodies from the global and local perspectives, respectively. We then design the Action Classification sub-network, which fuses the predictions from multiple body structure cues to obtain precise results. Moreover, we integrate the two sub-networks into a unified model by sharing the bottom convolutional layers, which improves computational efficiency in both the training and testing stages. We comprehensively evaluate our network on two challenging image-based human action datasets, Pascal VOC 2012 Action and Stanford40. Our approach achieves 93.5% and 93.8% mAP, respectively, outperforming all recent approaches in this field. © 2020 Elsevier Ltd. All rights reserved.
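
The abstract names two ideas that a small example can make concrete: an angle-based description of limbs and the fusion of predictions from several cues. The abstract does not say how the Limb Angle Descriptor is computed or how the predictions are fused, so the following Python sketch is only an illustration under assumptions. The function names `limb_angle` and `fuse_scores`, the use of 2D keypoints, and the weighted-average fusion rule are all hypothetical and are not taken from the paper.

```python
import math


def limb_angle(joint_a, joint_b, joint_c):
    """Angle in degrees at joint_b between the limbs b->a and b->c.

    Assumption: joints are 2D (x, y) keypoints, e.g. shoulder, elbow, wrist.
    """
    v1 = (joint_a[0] - joint_b[0], joint_a[1] - joint_b[1])
    v2 = (joint_c[0] - joint_b[0], joint_c[1] - joint_b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("degenerate limb: two joints coincide")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def fuse_scores(per_cue_scores, weights=None):
    """Weighted average of per-class scores from several cues (late fusion).

    Assumption: every cue yields one score per action class, in the same order.
    """
    n_cues = len(per_cue_scores)
    if weights is None:
        weights = [1.0 / n_cues] * n_cues
    n_classes = len(per_cue_scores[0])
    return [
        sum(w * scores[k] for w, scores in zip(weights, per_cue_scores))
        for k in range(n_classes)
    ]
```

For example, `limb_angle((1, 0), (0, 0), (0, 1))` describes a right angle at the middle joint, and `fuse_scores([[0.2, 0.8], [0.6, 0.4]])` averages two cues' scores over two classes. A learned fusion layer would be an equally plausible reading of the abstract; the fixed average is used here only because it is the simplest rule to show.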


Pages: 12

50 records in total

[1] On Recognizing Actions in Still Images via Multiple Features
Sener, Fadime
Bas, Cagdas
Ikizler-Cinbis, Nazli
COMPUTER VISION - ECCV 2012, PT III, 2012, 7585 : 263 - 272
[2] Recognizing Actions from Still Images
Ikizler, Nazli
Cinbis, R. Gokberk
Pehlivan, Selen
Duygulu, Pinar
19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 1299 - 1302
[3] Fusing Multiple Multiband Images
Arablouei, Reza
JOURNAL OF IMAGING, 2018, 4 (10)
[4] Driver fatigue detection by fusing multiple cues
Senaratne, Rajinda
Hardy, David
Vanderaa, Bill
Halgamuge, Saman
ADVANCES IN NEURAL NETWORKS - ISNN 2007, PT 2, PROCEEDINGS, 2007, 4492 : 801 - +
[5] Recognizing Human Actions From Still Images
Kilickaya, Mert
Telatar, Ziya
2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,
[6] Leveraging multiple cues for recognizing family photos
Wang, Xiaolong
Guo, Guodong
Merler, Michele
Codella, Noel C. F.
Rohith, M., V
Smith, John R.
Kambhamettu, Chandra
IMAGE AND VISION COMPUTING, 2017, 58 : 61 - 75
[7] Learning and fusing multiple cues for indoor video segmentation
Shi, Chunlei
Yang, Wenjia
Chai, Zhi
MIPPR 2013: PATTERN RECOGNITION AND COMPUTER VISION, 2013, 8919
[8] Fusing multiple images with evidential reasoning
Yuan, XH
Jian, Z
Buckles, BP
MULTISENSOR, MULTISOURCE INFORMATION FUSION: ARCHITECTURES, ALGORITHMS, AND APPLICATIONS 2003, 2003, 5099 : 58 - 64
[9] RECOGNIZING HUMAN ACTIONS BY FUSING SPATIO-TEMPORAL APPEARANCE AND MOTION DESCRIPTORS
Ballan, Lamberto
Bertini, Marco
Del Bimbo, Alberto
Seidenari, Lorenzo
Serra, Giuseppe
2009 16TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-6, 2009, : 3569 - 3572
[10] Recognizing human actions using multiple features
Liu, Jingen
Ali, Saad
Shah, Mubarak
2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 1436 - 1443
