Semantic Inference Network for Human-Object Interaction Detection

Citations: 0
Authors
Liu, Hongyi [1]
Mo, Lisha [1]
Ma, Huimin [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
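Funding
National Natural Science Foundation of China
Keywords
Human-object interaction; Visual relationship detection; Word embedding
DOI
10.1007/978-3-030-34120-6_42
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract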
Recently, many efforts have been made to understand the scenes in images, and the interactions between humans and objects are usually of great significance to scene understanding. In this paper, we focus on the task of human-object interaction (HOI) detection, which is to detect <human, verb, object> triplets in challenging everyday images. We propose a novel model that introduces a semantic stream and a new form of loss function. Our intuition is that the semantic information of object classes benefits HOI detection. This semantic information is extracted by embedding the object category information with a pre-trained BERT model. In addition, we find that the HOI task suffers severely from an extreme imbalance between positive and negative samples, and we propose a weighted focal loss (WFL) to tackle this problem. The results show that our method achieves a gain of 5% compared with our baseline.
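The abstract does not give the exact form of the weighted focal loss. As a minimal sketch, assuming WFL resembles the standard binary focal loss with separate weights for positive and negative samples, the idea of down-weighting easy negatives might look like the following. The function name and the parameters `gamma`, `pos_weight`, and `neg_weight` are illustrative assumptions, not the authors' definition.

```python
import math


def weighted_focal_loss(probs, labels, gamma=2.0, pos_weight=1.0, neg_weight=1.0):
    """Mean binary focal loss with separate positive/negative weights.

    Hypothetical illustration only: the paper's actual WFL may differ.
    probs  -- predicted probabilities of the positive class, each in (0, 1)
    labels -- ground-truth labels, each 0 or 1
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equally long")
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        weight = pos_weight if y == 1 else neg_weight
        # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
        total += -weight * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(probs)
```

With `gamma=0` and both weights set to 1, this reduces to ordinary binary cross-entropy. Raising `gamma` or lowering `neg_weight` reduces the contribution of the many easy negative human-object pairs.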
Pages: 518-529
Number of pages: 12