Learning Human-Object Interaction Detection using Interaction Points

Cited: 148
Authors
Wang, Tiancai [1 ]
Yang, Tong [1 ]
Danelljan, Martin [2 ]
Khan, Fahad Shahbaz [3 ,4 ]
Zhang, Xiangyu [1 ]
Sun, Jian [1 ]
Affiliations
[1] Megvii Technology, Beijing, China
[2] Swiss Federal Institute of Technology (ETH Zurich), Zurich, Switzerland
[3] Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, United Arab Emirates
[4] Linkoping University, Linkoping, Sweden
Keywords
DOI
10.1109/CVPR42600.2020.00417
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Understanding interactions between humans and objects is one of the fundamental problems in visual classification and an essential step towards detailed scene understanding. Human-object interaction (HOI) detection strives to localize both the human and the object, as well as to identify the complex interactions between them. Most existing HOI detection approaches are instance-centric, predicting interactions between all possible human-object pairs from appearance features and coarse spatial information. We argue that appearance features alone are insufficient to capture complex human-object interactions. In this paper, we therefore propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs. Our network predicts interaction points, which directly localize and classify the interaction. Paired with the densely predicted interaction vectors, the interactions are associated with human and object detections to obtain final predictions. To the best of our knowledge, we are the first to propose an approach where HOI detection is posed as a keypoint detection and grouping problem. Experiments are performed on two popular benchmarks: V-COCO and HICO-DET. Our approach sets a new state-of-the-art on both datasets. Code is available at https://github.com/vaes1/IP-Net.
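A minimal sketch of the interaction-point grouping idea described in the abstract, under stated assumptions: each predicted interaction point carries two interaction vectors pointing towards the expected human and object centers, and is matched to the nearest human and object detection to form an (human, object, action) triplet. The names used below (group_interactions, human_vec, object_vec, etc.) are hypothetical and illustrative only; this is an editorial reading of the abstract, not the authors' released implementation from the linked repository.

import numpy as np

# Hypothetical sketch of interaction-point grouping: names and data layout are
# illustrative assumptions, not the authors' released code.

def box_center(box):
    # Center (x, y) of a box given as (x1, y1, x2, y2).
    x1, y1, x2, y2 = box
    return np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])

def group_interactions(interaction_points, human_boxes, object_boxes):
    # interaction_points: list of dicts with keys
    #   'point'      -- (x, y) location of a predicted interaction point
    #   'human_vec'  -- vector from the point towards the human center
    #   'object_vec' -- vector from the point towards the object center
    #   'action'     -- predicted interaction class
    # human_boxes, object_boxes: lists of (x1, y1, x2, y2) detections.
    # Returns a list of (human_idx, object_idx, action) triplets.
    triplets = []
    for ip in interaction_points:
        point = np.asarray(ip['point'], dtype=float)
        # Human and object locations implied by the interaction vectors.
        expected_human = point + np.asarray(ip['human_vec'], dtype=float)
        expected_object = point + np.asarray(ip['object_vec'], dtype=float)
        # Greedy matching: pick the detection whose center lies closest to the
        # location implied by the corresponding interaction vector.
        h_idx = int(np.argmin([np.linalg.norm(box_center(b) - expected_human)
                               for b in human_boxes]))
        o_idx = int(np.argmin([np.linalg.norm(box_center(b) - expected_object)
                               for b in object_boxes]))
        triplets.append((h_idx, o_idx, ip['action']))
    return triplets

# Usage example (hypothetical "person rides horse" detection):
# humans = [(10, 10, 60, 120)]
# horses = [(40, 60, 160, 150)]
# pts = [{'point': (70, 80), 'human_vec': (-35, -15),
#         'object_vec': (30, 25), 'action': 'ride'}]
# group_interactions(pts, humans, horses)  # -> [(0, 0, 'ride')]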
Pages: 4115-4124 (10 pages)
Related Papers (50 in total)
  • [1] Sun, Bo; Lu, Sixu; He, Jun; Yu, Lejun. Lifelong Learning for Human-Object Interaction Detection. 2022 IEEE 10th International Conference on Information, Communication and Networks (ICICN 2022), 2022: 582-587.
  • [2] Kim, Sanghyun; Jung, Deunsol; Cho, Minsu. Relational Context Learning for Human-Object Interaction Detection. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 2925-2934.
  • [3] Hou, Zhi; Yu, Baosheng; Qiao, Yu; Peng, Xiaojiang; Tao, Dacheng. Affordance Transfer Learning for Human-Object Interaction Detection. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), 2021: 495-504.
  • [4] Wang, J.; Shuai, H.; Li, Y.; Cheng, W. Human-Object Interaction Detection: An Overview. IEEE Consumer Electronics Magazine, 2024, 13(6): 1-14.
  • [5] Gong, X.; Zhang, Z.; Liu, L.; Ma, B.; Wu, K. A Survey of Human-Object Interaction Detection. Xinan Jiaotong Daxue Xuebao/Journal of Southwest Jiaotong University, 2022, 57(4): 693-704.
  • [6] Cai, Shuang; Ma, Shiwei; Gu, Dongzhou. Learning Human-Object Interaction Detection via Deformable Transformer. 2021 International Conference on Image, Video Processing, and Artificial Intelligence, 2021, 12076.
  • [7] Gao, Song; Wang, Hongyu; Song, Jilai; Xu, Fang; Zou, Fengshan. An Improved Human-Object Interaction Detection Network. Proceedings of 2019 IEEE 13th International Conference on Anti-Counterfeiting, Security, and Identification (IEEE-ASID 2019), 2019: 192-196.
  • [8] Zhang, JinHui; Zhao, Yuxiao; Zhang, Xian; Wang, Xiang; Zhao, Yuxuan; Wang, Peng; Hu, Jian. Enhanced Transformer Interaction Components for Human-Object Interaction Detection. ACM Symposium on Spatial User Interaction (SUI 2023), 2023.
  • [9] Kogashi, Kaen; Wu, Yang; Nobuhara, Shohei; Nishino, Ko. Human-Object Interaction Detection with Missing Objects. Image and Vision Computing, 2021, 113.
  • [10] Wang, Guangzhi; Guo, Yangyang; Wong, Yongkang; Kankanhalli, Mohan. Distance Matters in Human-Object Interaction Detection. Proceedings of the 30th ACM International Conference on Multimedia (MM 2022), 2022: 4546-4554.