Simultaneous Utilization of Inertial and Video Sensing for Action Detection and Recognition in Continuous Action Streams

Cited by: 19
Authors
Wei, Haoran [1 ]
Kehtarnavaz, Nasser [1 ]
Affiliation
[1] Univ Texas Dallas, Dept Elect & Comp Engn, Richardson, TX 75080 USA
Keywords
Sports; Acceleration; Cameras; Image segmentation; Streaming media; Action detection and recognition in continuous action streams; simultaneous utilization of video and inertial sensing; deep learning-based continuous action detection and recognition; CLASSIFICATION; DEPTH; FUSION
DOI
10.1109/JSEN.2020.2973361
Chinese Library Classification: TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject classification codes: 0808; 0809
Abstract
This paper describes the simultaneous utilization of inertial and video sensing for achieving human action detection and recognition in continuous action streams. Continuous action streams mean that actions of interest are performed randomly, in a continuous manner, among actions of non-interest. The inertial and video data are captured simultaneously via a wearable inertial sensor and a video camera and are converted into 2D and 3D images, respectively. These images are then fed into a 2D and a 3D convolutional neural network whose decisions are fused in order to detect and recognize a specified set of actions of interest from continuous action streams. The developed fusion approach is applied to two sets of actions of interest consisting of smart TV gestures and sports actions. The results obtained indicate that the fusion approach is more effective than either sensing modality used individually. The average accuracy of the fusion approach is found to be 5.8% above inertial sensing alone and 14.3% above video sensing alone for the TV gesture actions of interest, and 23.2% above inertial and 1.9% above video for the sports actions of interest.
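The dual-network architecture summarized in the abstract lends itself to a score-level (decision) fusion sketch. The following is a minimal, hypothetical PyTorch example: a small 2D CNN stands in for the network applied to images formed from the inertial signals, a small 3D CNN stands in for the network applied to video volumes, and their softmax scores are averaged with tunable weights. Layer sizes, the number of classes, input shapes, and the fusion weights are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch of decision-level fusion of a 2D CNN (inertial-signal images)
# and a 3D CNN (video volumes). All sizes and weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 6  # hypothetical number of actions of interest


class Inertial2DCNN(nn.Module):
    """2D CNN operating on images formed from wearable inertial signals."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):  # x: (B, 1, H, W)
        return self.classifier(self.features(x).flatten(1))


class Video3DCNN(nn.Module):
    """3D CNN operating on short video clips (stacks of frames)."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):  # x: (B, 3, T, H, W)
        return self.classifier(self.features(x).flatten(1))


def fuse_decisions(logits_2d, logits_3d, w_inertial=0.5, w_video=0.5):
    """Score-level fusion: weighted average of per-modality softmax scores."""
    scores = (w_inertial * F.softmax(logits_2d, dim=1)
              + w_video * F.softmax(logits_3d, dim=1))
    return scores.argmax(dim=1), scores


if __name__ == "__main__":
    inertial_net, video_net = Inertial2DCNN(), Video3DCNN()
    inertial_img = torch.randn(4, 1, 64, 64)      # batch of inertial "images"
    video_clip = torch.randn(4, 3, 16, 112, 112)  # batch of 16-frame clips
    labels, scores = fuse_decisions(inertial_net(inertial_img), video_net(video_clip))
    print(labels.shape, scores.shape)  # torch.Size([4]) torch.Size([4, 6])
```

In a continuous stream, a sketch like this would be applied per sliding window, so that each window's fused scores can serve both to detect whether an action of interest is present and to recognize which one it is.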
Pages: 6055-6063
Number of pages: 9
Related papers (50 in total)
  • [31] Action boundaries detection in a video
    Hassan Wehbe
    Bassem Haidar
    Philippe Joly
    Multimedia Tools and Applications, 2016, 75 : 8239 - 8266
  • [32] Action boundaries detection in a video
    Wehbe, Hassan
    Haidar, Bassem
    Joly, Philippe
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (14) : 8239 - 8266
  • [33] TQVS: Temporal Queries over Video Streams in Action
    Chen, Yueting
    Yu, Xiaohui
    Koudas, Nick
    SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2020, : 2737 - 2740
  • [34] Data Augmentation in Deep Learning-Based Fusion of Depth and Inertial Sensing for Action Recognition
    Dawar, Neha
    Ostadabbas, Sarah
    Kehtarnavaz, Nasser
    IEEE SENSORS LETTERS, 2019, 3 (01)
  • [35] Automatic Action Segmentation and Continuous Recognition for Basic Indoor Actions Based on Kinect Pose Streams
    Han, Yun
    Chung, Sheng-Luen
    Su, Shun-Feng
    2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 966 - 971
  • [36] Action-Stage Emphasized Spatiotemporal VLAD for Video Action Recognition
    Tu, Zhigang
    Li, Hongyan
    Zhang, Dejun
    Dauwels, Justin
    Li, Baoxin
    Yuan, Junsong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (06) : 2799 - 2812
  • [37] Similar gait action recognition using an inertial sensor
    Trung Thanh Ngo
    Makihara, Yasushi
    Nagahara, Hajime
    Mukaigawa, Yasuhiro
    Yagi, Yasushi
    PATTERN RECOGNITION, 2015, 48 (04) : 1289 - 1301
  • [38] Sports Action Recognition and Analysis Relying on Inertial Sensors
    Liu, Yutong
    Zhou, Zunliang
    Qian, Xiaolong
    Chen, Jiaming
    JOURNAL OF SENSORS, 2022, 2022
  • [39] Fusion of spatial and dynamic CNN streams for action recognition
    Newlin Shebiah Russel
    Arivazhagan Selvaraj
    Multimedia Systems, 2021, 27 : 969 - 984
  • [40] Fusion of spatial and dynamic CNN streams for action recognition
    Russel, Newlin Shebiah
    Selvaraj, Arivazhagan
    MULTIMEDIA SYSTEMS, 2021, 27 (05) : 969 - 984