Deep learning and RGB-D based human action, human-human and human-object interaction recognition: A survey

Cited by: 0
Authors
Khaire, Pushpajit [1 ]
Kumar, Praveen [1 ]
Affiliations
[1] Visvesvaraya Natl Inst Technol, Dept Comp Sci & Engn, Nagpur, India
Keywords
Human action recognition; CNN; LSTM; Human-human interaction; Human-object interaction; Deep learning; RGB-D sensors; Multi-modality; Fusion; Skeleton; GCN; FLOW ESTIMATION; NEURAL-NETWORK; SEQUENCES; STREAMS
DOI
10.1016/j.jvcir.2022.103531
CLC classification number
TP [Automation technology, computer technology]
Subject classification number
0812
Abstract
Human activity recognition is one of the most studied topics in computer vision. In recent years, with the availability of RGB-D sensors and powerful deep learning techniques, research on human activity recognition has gained momentum. From simple atomic actions, the research has advanced towards recognizing more complex human activities using RGB-D data. This paper presents a comprehensive survey of advanced deep-learning-based recognition methods and categorizes them into human atomic action, human-human interaction, and human-object interaction recognition. The reviewed methods are further classified by the modality used for recognition, i.e., RGB-based, depth-based, skeleton-based, and hybrid. We also review and categorize recent challenging RGB-D datasets for these tasks, and briefly cover RGB-D datasets and methods for online activity recognition. The paper concludes with a discussion of limitations, challenges, and recent trends that point to promising future directions.
Pages: 25
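The abstract's "hybrid" category refers to methods that fuse several RGB-D modalities. Below is a minimal, self-contained sketch of score-level (late) fusion between two modality streams, e.g. an RGB CNN and a skeleton GCN/LSTM. It only illustrates the general fusion idea mentioned in the abstract, not an implementation from the surveyed paper; all names, shapes, and weights are assumptions.

```python
# Illustrative sketch of score-level ("late") fusion across RGB-D modalities,
# one of the fusion strategies used by hybrid methods the survey categorizes.
# All names, shapes, and weights below are assumptions for illustration only.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert per-class logits to a probability distribution."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def late_fusion(rgb_logits: np.ndarray,
                skeleton_logits: np.ndarray,
                w_rgb: float = 0.5,
                w_skel: float = 0.5) -> int:
    """Average the per-modality class probabilities with fixed weights
    and return the index of the predicted action class."""
    fused = w_rgb * softmax(rgb_logits) + w_skel * softmax(skeleton_logits)
    return int(np.argmax(fused))

# Hypothetical outputs of an RGB stream and a skeleton stream
# for a 5-class action recognition problem.
rgb_scores = np.array([1.2, 0.3, 2.5, 0.1, 0.9])
skel_scores = np.array([0.8, 0.2, 3.1, 0.4, 0.5])
print("Predicted class:", late_fusion(rgb_scores, skel_scores))
```

In practice, hybrid methods differ mainly in where they fuse (input, feature, or score level) and in how the per-modality streams are trained; the weighted score averaging above is only one simple variant.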