Simultaneous Utilization of Inertial and Video Sensing for Action Detection and Recognition in Continuous Action Streams

Cited by: 19
Authors
Wei, Haoran [1 ]
Kehtarnavaz, Nasser [1 ]
Affiliation
[1] Univ Texas Dallas, Dept Elect & Comp Engn, Richardson, TX 75080 USA
Keywords
Sports; Acceleration; Cameras; Image segmentation; Streaming media; Action detection and recognition in continuous action streams; simultaneous utilization of video and inertial sensing; deep learning-based continuous action detection and recognition; CLASSIFICATION; DEPTH; FUSION
DOI
10.1109/JSEN.2020.2973361
Chinese Library Classification: TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject classification codes: 0808; 0809
Abstract
This paper describes the simultaneous utilization of inertial and video sensing for achieving human action detection and recognition in continuous action streams. Continuous action streams mean that actions of interest are performed randomly, in a continuous manner, among actions of non-interest. The inertial and video data are captured simultaneously via a wearable inertial sensor and a video camera and are converted into 2D and 3D images, respectively. These images are then fed into a 2D and a 3D convolutional neural network whose decisions are fused in order to detect and recognize a specified set of actions of interest from continuous action streams. The developed fusion approach is applied to two sets of actions of interest consisting of smart TV gestures and sports actions. The results obtained indicate that the fusion approach is more effective than either sensing modality used individually. The average accuracy of the fusion approach is found to be 5.8% above inertial sensing alone and 14.3% above video sensing alone for the TV gesture actions of interest, and 23.2% above inertial and 1.9% above video for the sports actions of interest.
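The dual-network architecture summarized in the abstract lends itself to a score-level (decision) fusion sketch. The following is a minimal, hypothetical PyTorch example: a small 2D CNN stands in for the network applied to images formed from the inertial signals, a small 3D CNN stands in for the network applied to video volumes, and their softmax scores are averaged with tunable weights. Layer sizes, the number of classes, input shapes, and the fusion weights are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch of decision-level fusion of a 2D CNN (inertial-signal images)
# and a 3D CNN (video volumes). All sizes and weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 6  # hypothetical number of actions of interest


class Inertial2DCNN(nn.Module):
    """2D CNN operating on images formed from wearable inertial signals."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):  # x: (B, 1, H, W)
        return self.classifier(self.features(x).flatten(1))


class Video3DCNN(nn.Module):
    """3D CNN operating on short video clips (stacks of frames)."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):  # x: (B, 3, T, H, W)
        return self.classifier(self.features(x).flatten(1))


def fuse_decisions(logits_2d, logits_3d, w_inertial=0.5, w_video=0.5):
    """Score-level fusion: weighted average of per-modality softmax scores."""
    scores = (w_inertial * F.softmax(logits_2d, dim=1)
              + w_video * F.softmax(logits_3d, dim=1))
    return scores.argmax(dim=1), scores


if __name__ == "__main__":
    inertial_net, video_net = Inertial2DCNN(), Video3DCNN()
    inertial_img = torch.randn(4, 1, 64, 64)      # batch of inertial "images"
    video_clip = torch.randn(4, 3, 16, 112, 112)  # batch of 16-frame clips
    labels, scores = fuse_decisions(inertial_net(inertial_img), video_net(video_clip))
    print(labels.shape, scores.shape)  # torch.Size([4]) torch.Size([4, 6])
```

In a continuous stream, a sketch like this would be applied per sliding window, so that each window's fused scores can serve both to detect whether an action of interest is present and to recognize which one it is.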
Pages: 6055-6063
Number of pages: 9
Related papers (50 in total)
  • [31] Action boundaries detection in a video
    Hassan Wehbe
    Bassem Haidar
    Philippe Joly
    Multimedia Tools and Applications, 2016, 75 : 8239 - 8266
  • [32] Action boundaries detection in a video
    Wehbe, Hassan
    Haidar, Bassem
    Joly, Philippe
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (14) : 8239 - 8266
  • [33] TQVS: Temporal Queries over Video Streams in Action
    Chen, Yueting
    Yu, Xiaohui
    Koudas, Nick
    SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2020, : 2737 - 2740
  • [34] Data Augmentation in Deep Learning-Based Fusion of Depth and Inertial Sensing for Action Recognition
    Dawar, Neha
    Ostadabbas, Sarah
    Kehtarnavaz, Nasser
    IEEE SENSORS LETTERS, 2019, 3 (01)
  • [35] Automatic Action Segmentation and Continuous Recognition for Basic Indoor Actions Based on Kinect Pose Streams
    Han, Yun
    Chung, Sheng-Luen
    Su, Shun-Feng
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 966 - 971
  • [36] Action-Stage Emphasized Spatiotemporal VLAD for Video Action Recognition
    Tu, Zhigang
    Li, Hongyan
    Zhang, Dejun
    Dauwels, Justin
    Li, Baoxin
    Yuan, Junsong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (06) : 2799 - 2812
  • [37] Similar gait action recognition using an inertial sensor
    Trung Thanh Ngo
    Makihara, Yasushi
    Nagahara, Hajime
    Mukaigawa, Yasuhiro
    Yagi, Yasushi
    PATTERN RECOGNITION, 2015, 48 (04) : 1289 - 1301
  • [38] Sports Action Recognition and Analysis Relying on Inertial Sensors
    Liu, Yutong
    Zhou, Zunliang
    Qian, Xiaolong
    Chen, Jiaming
    JOURNAL OF SENSORS, 2022, 2022
  • [39] Fusion of spatial and dynamic CNN streams for action recognition
    Newlin Shebiah Russel
    Arivazhagan Selvaraj
    Multimedia Systems, 2021, 27 : 969 - 984
  • [40] Fusion of spatial and dynamic CNN streams for action recognition
    Russel, Newlin Shebiah
    Selvaraj, Arivazhagan
    MULTIMEDIA SYSTEMS, 2021, 27 (05) : 969 - 984