Learning Self- and Cross-Triplet Context Clues for Human-Object Interaction Detection

Cited by: 0
Authors
Ren, Weihong [1, 2]
Luo, Jinguo [1 ]
Jiang, Weibo [1 ]
Qu, Liangqiong [3 ]
Han, Zhi [2 ]
Tian, Jiandong [2 ]
Liu, Honghai [1 ]
Affiliations
[1] Harbin Institute of Technology, State Key Laboratory of Robotics and Systems, School of Mechanical Engineering and Automation, Shenzhen, 518055, China
[2] Shenyang Institute of Automation, Chinese Academy of Sciences, State Key Laboratory of Robotics, Shenyang, 110016, China
[3] The University of Hong Kong, Department of Statistics and Actuarial Science, Hong Kong, Hong Kong
DOI
10.1109/TCSVT.2024.3402247
Abstract
Human-Object Interaction (HOI) detection aims to infer the interactions between humans and objects and is important for scene analysis and understanding. Existing methods usually explore instance-level (e.g., object appearance) or interaction-level (e.g., action semantics) features to predict interactions. However, most of these methods only aggregate features within a single triplet (self-triplet aggregation); without cross-triplet context exchange, this may lead to ambiguous learning. In this paper, we propose a novel method that jointly explores self- and cross-triplet interaction context clues for HOI detection from both visual and textual perspectives. First, we employ a graph neural network to perform self-triplet aggregation, where human and object features serve as graph nodes, and the visual interaction feature and textual prior knowledge act as two different types of edges. Furthermore, we explore cross-triplet context exchange by incorporating symbiotic and layout relationships among different HOI triplets. Extensive experiments on two benchmarks demonstrate that the proposed method outperforms state-of-the-art methods, achieving 40.32 mAP on HICO-DET and 69.1 mAP on V-COCO. © 1991-2012 IEEE.
Pages: 9760-9773
Related papers
  • [1] Relational Context Learning for Human-Object Interaction Detection
    Kim, Sanghyun
    Jung, Deunsol
    Cho, Minsu
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023: 2925-2934
  • [2] Segmenting Key Clues to Induce Human-Object Interaction Detection
    Xue, Mingliang
    Wang, Siwei
    Fu, Bing
    Zhao, Zhengyang
    Liu, Tao
    Lai, Lingfeng
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425: 60-71
  • [3] Human-object interaction detection with depth-augmented clues
    Cheng, Yamin
    Duan, Hancong
    Wang, Chen
    Wang, Zhi
    [J]. NEUROCOMPUTING, 2022, 500: 978-988
  • [4] Lifelong Learning for Human-Object Interaction Detection
    Sun, Bo
    Lu, Sixu
    He, Jun
    Yu, Lejun
    [J]. 2022 IEEE 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND NETWORKS (ICICN 2022), 2022: 582-587
  • [5] Learning Human-Object Interaction Detection using Interaction Points
    Wang, Tiancai
    Yang, Tong
    Danelljan, Martin
    Khan, Fahad Shahbaz
    Zhang, Xiangyu
    Sun, Jian
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020: 4115-4124
  • [6] Affordance Transfer Learning for Human-Object Interaction Detection
    Hou, Zhi
    Yu, Baosheng
    Qiao, Yu
    Peng, Xiaojiang
    Tao, Dacheng
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 495-504
  • [7] Learning Human-Object Interaction Detection via Deformable Transformer
    Cai, Shuang
    Ma, Shiwei
    Gu, Dongzhou
    [J]. 2021 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2021, 12076
  • [8] A Survey of Human-Object Interaction Detection
    Gong X.
    Zhang Z.
    Liu L.
    Ma B.
    Wu K.
    [J]. Xinan Jiaotong Daxue Xuebao/Journal of Southwest Jiaotong University, 2022, 57(04): 693-704
  • [9] Human-Object Interaction Detection: An Overview
    Wang, Jia
    Shuai, Hong-Han
    Li, Yung-Hui
    Cheng, Wen-Huang
    [J]. IEEE Consumer Electronics Magazine, 2024, 13(06): 56-72
  • [10] Compositional Learning in Transformer-Based Human-Object Interaction Detection
    Zhuang, Zikun
    Qian, Ruihao
    Xie, Chi
    Liang, Shuang
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 1038-1043