Enhancing Recognition of Human-Object Interaction from Visual Data Using Egocentric Wearable Camera

Cited by: 0
Authors
Hamid, Danish [1]
Ul Haq, Muhammad Ehatisham [1]
Yasin, Amanullah [1]
Murtaza, Fiza [1]
Azam, Muhammad Awais [2]
Affiliations
[1] Air Univ, Fac Comp & Artificial Intelligence FCAI, Dept Creat Technol, Islamabad 44000, Pakistan
[2] Whitecliffe, Technol & Innovat Res Grp, Sch Informat Technol, Wellington 6145, New Zealand
Keywords
egocentric; hand pose; human-object interaction; machine learning; object recognition; wearable camera
DOI
10.3390/fi16080269
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Object detection and human action recognition have great significance in many real-world applications. Understanding how a human being interacts with different objects, i.e., human-object interaction, is also crucial in this regard since it enables diverse applications related to security, surveillance, and immersive reality. Thus, this study explored the potential of using a wearable camera for object detection and human-object interaction recognition, which is a key technology for the future Internet and ubiquitous computing. We propose a system that uses an egocentric camera view to recognize objects and human-object interactions by analyzing the wearer's hand pose. Our novel idea leverages the user's hand joint data, extracted from the egocentric camera view, to recognize different objects and related interactions. Traditional methods for human-object interaction recognition rely on a third-person, i.e., exocentric, camera view and extract morphological and color/texture-related features, and thus often fall short when faced with occlusion, camera variations, and background clutter. Moreover, deep learning-based approaches require substantial training data, leading to significant computational overhead. Our proposed approach capitalizes on hand joint data captured from an egocentric perspective, offering a robust solution to the limitations of traditional methods. We propose an innovative machine learning-based technique for feature extraction and description from 3D hand joint data by presenting two distinct approaches: object-dependent and object-independent interaction recognition. The proposed method offered advantages in computational efficiency compared with deep learning methods and was validated using the publicly available HOI4D dataset, where it achieved a best-case average F1-score of 74%.
The proposed system paves the way for intuitive human-computer collaboration within the future Internet, enabling applications like seamless object manipulation and natural user interfaces for smart devices, human-robot interactions, virtual reality, and augmented reality.
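The abstract does not specify which features are computed from the 3D hand joints or how the F1-score is averaged, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a hypothetical 21-joint hand layout (wrist at index 0, fingertips at indices 4, 8, 12, 16, 20), uses simple distance features summarized over a clip, and treats the "average F1-score" as a macro average over classes. All function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

# Assumption: 21 joints per hand, each an (x, y, z) tuple.
# The index layout below is illustrative and not taken from the paper.
WRIST = 0
FINGERTIPS = (4, 8, 12, 16, 20)


def frame_features(joints):
    """Distance features for one frame of 3D hand joints.

    Returns 5 wrist-to-fingertip distances followed by the
    10 pairwise fingertip distances (15 values in total).
    """
    wrist = joints[WRIST]
    tips = [joints[i] for i in FINGERTIPS]
    feats = [math.dist(wrist, tip) for tip in tips]
    feats += [math.dist(a, b) for a, b in combinations(tips, 2)]
    return feats


def clip_features(frames):
    """Summarize a clip: mean and standard deviation of each frame feature."""
    per_frame = [frame_features(f) for f in frames]
    columns = list(zip(*per_frame))
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

The resulting fixed-length vector from `clip_features` could be passed to any classical classifier, which is the kind of lightweight pipeline the abstract contrasts with deep learning approaches. The same feature vector could serve either an object classifier or an interaction classifier, although how the paper separates its object-dependent and object-independent variants is not described here.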
Pages: 17