Human-object interaction detection with depth-augmented clues

Cited by: 2
Authors
Cheng, Yamin [1]
Duan, Hancong [1]
Wang, Chen [1]
Wang, Zhi [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Human-object interaction; Depth map; NETWORK; ATTENTION; GENERATION
DOI
10.1016/j.neucom.2022.05.014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human-object interaction (HOI) detection aims to localize and classify triplets of human, object, and relationship in a given image. Unlike previous methods that extract visual information only from RGB images, we propose a Depth-augmented Relationship Reasoning (DRR) method that attends to RGB images and their corresponding depth information simultaneously. Revisiting the principles of photography, we argue that RGB images discard spatial depth, which carries third-dimension information about the relative distance between instances. In light of this, we first estimate the depth information for each image, yielding a corresponding depth map. We then leverage multiple representations encoded from the depth information and the RGB images to enrich semantic interpretation. Subsequently, we explore a hierarchical attention strategy to fuse these semantic representations and generate depth-augmented features, which are used to reason about fine-grained human-object interactions. Extensive experiments on the benchmark datasets V-COCO, HICO-DET and HCVRD verify the effectiveness of our method and demonstrate the importance of spatial depth information for HOI.
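As a rough illustration of the "relative distance between instances" that the abstract says a depth map carries, the sketch below compares the median depth inside a human box and an object box. It is not the DRR method: the function names, the box format, and the toy depth values are all assumptions made for this example, and the abstract does not say how depth is estimated or pooled.

```python
from statistics import median


def box_depth(depth_map, box):
    """Median depth inside box = (x0, y0, x1, y1), end-exclusive pixel coords.

    Hypothetical helper: the median is one simple pooling choice,
    not something the abstract specifies.
    """
    x0, y0, x1, y1 = box
    values = [depth_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return median(values)


def relative_depth(depth_map, human_box, object_box):
    """Signed depth gap: positive when the object is farther than the human."""
    return box_depth(depth_map, object_box) - box_depth(depth_map, human_box)


# Toy 4x6 depth map with made-up values (larger = farther from the camera).
depth_map = [
    [1.0, 1.0, 1.0, 5.0, 5.0, 5.0],
    [1.0, 1.0, 1.0, 5.0, 5.0, 5.0],
    [1.0, 1.0, 1.0, 5.0, 5.0, 5.0],
    [1.0, 1.0, 1.0, 5.0, 5.0, 5.0],
]
human_box = (0, 0, 3, 4)   # left half of the image
object_box = (3, 0, 6, 4)  # right half of the image

gap = relative_depth(depth_map, human_box, object_box)
print(gap)  # 4.0
```

In the RGB image these two boxes are simply adjacent; the depth gap is the extra cue that separates "next to" from "far behind".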
Pages: 978-988
Number of pages: 11
Related papers
50 in total
  • [41] Rethinking vision transformer through human-object interaction detection
    Cheng, Yamin
    Zhao, Zitian
    Wang, Zhi
    Duan, Hancong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 122
  • [42] Egocentric Human-Object Interaction Detection Exploiting Synthetic Data
    Leonardi, Rosario
    Ragusa, Francesco
    Furnari, Antonino
    Farinella, Giovanni Maria
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT II, 2022, 13232 : 237 - 248
  • [43] Effective actor-centric human-object interaction detection
    Xu, Kunlun
    Li, Zhimin
    Zhang, Zhijun
    Dong, Leizhen
    Xu, Wenhui
    Yan, Luxin
    Zhong, Sheng
    Zou, Xu
    IMAGE AND VISION COMPUTING, 2022, 121
  • [44] Mask-Guided Transformer for Human-Object Interaction Detection
    Ying, Daocheng
    Yang, Hua
    Sun, Jun
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022
  • [45] Improved human-object interaction detection through skeleton-object relations
    Zhang, Hong-Bo
    Zhou, Yi-Zhong
    Du, Ji-Xiang
    Huang, Jin-Long
    Lei, Qing
    Yang, Lijie
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2022, 34 (01) : 41 - 52
  • [46] Learning Human-Object Interaction Detection via Deformable Transformer
    Cai, Shuang
    Ma, Shiwei
    Gu, Dongzhou
    2021 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2021, 12076
  • [47] Relation Parsing Neural Network for Human-Object Interaction Detection
    Zhou, Penghao
    Chi, Mingmin
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019 : 843 - 851
  • [48] Object Centric Body Part Attention Network for Human-Object Interaction Detection
    Liu, Zhuang
    Zhang, Xiaowei
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XII, 2024, 14436 : 378 - 391
  • [49] Chairs Can Be Stood On: Overcoming Object Bias in Human-Object Interaction Detection
    Wang, Guangzhi
    Guo, Yangyang
    Wong, Yongkang
    Kankanhalli, Mohan
    COMPUTER VISION, ECCV 2022, PT XXIV, 2022, 13684 : 654 - 672
  • [50] Pairwise Negative Sample Mining for Human-Object Interaction Detection
    Jia, Weizhe
    Ma, Shiwei
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431 : 425 - 437