Polysemy Deciphering Network for Robust Human-Object Interaction Detection

Cited by: 30
Authors
Zhong, Xubin [1 ]
Ding, Changxing [1 ,2 ]
Qu, Xian [1 ]
Tao, Dacheng [3 ]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510000, Peoples R China
[2] Pazhou Lab, Guangzhou 510330, Peoples R China
[3] JD Com, JD Explore Acad, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Human-object interaction; Verb polysemy; Language priors; Attention model;
DOI
10.1007/s11263-021-01458-8
CLC Classification Code
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human-Object Interaction (HOI) detection is important to human-centric scene understanding tasks. Existing works tend to assume that the same verb has similar visual characteristics in different HOI categories, an approach that ignores the diverse semantic meanings of the verb. To address this issue, in this paper, we propose a novel Polysemy Deciphering Network (PD-Net) that decodes the visual polysemy of verbs for HOI detection in three distinct ways. First, we refine features for HOI detection to be polysemy-aware through the use of two novel modules: namely, Language Prior-guided Channel Attention (LPCA) and Language Prior-based Feature Augmentation (LPFA). LPCA highlights important elements in human and object appearance features for each HOI category to be identified; moreover, LPFA augments human pose and spatial features for HOI detection using language priors, enabling the verb classifiers to receive language hints that reduce intra-class variation for the same verb. Second, we introduce a novel Polysemy-Aware Modal Fusion module, which guides PD-Net to make decisions based on feature types deemed more important according to the language priors. Third, we propose to relieve the verb polysemy problem through sharing verb classifiers for semantically similar HOI categories. Furthermore, to expedite research on the verb polysemy problem, we build a new benchmark dataset named HOI-VerbPolysemy (HOI-VP), which includes common verbs (predicates) that have diverse semantic meanings in the real world. Finally, through deciphering the visual polysemy of verbs, our approach is demonstrated to outperform state-of-the-art methods by significant margins on the HICO-DET, V-COCO, and HOI-VP databases. Code and data in this paper are available at .
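The abstract describes a Language Prior-guided Channel Attention (LPCA) module that highlights the important channels of human/object appearance features for each candidate HOI category. The paper's exact formulation is not given in this record, so the following is only a minimal, hypothetical PyTorch sketch of the general idea: a word-embedding prior for the candidate HOI category is mapped to sigmoid channel weights that re-scale the appearance feature. All names, layer choices, and dimensions (LanguagePriorChannelAttention, feat_dim=2048, prior_dim=300) are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of language-prior-guided channel attention.
# NOTE: module name, dimensions, and the exact gating formulation are
# assumptions; they do not reproduce the PD-Net implementation.
import torch
import torch.nn as nn


class LanguagePriorChannelAttention(nn.Module):
    """Re-weights appearance-feature channels using a language prior
    (e.g., a word embedding of the candidate verb-object pair)."""

    def __init__(self, feat_dim: int = 2048, prior_dim: int = 300, hidden: int = 512):
        super().__init__()
        # Project the language prior into per-channel attention weights in [0, 1].
        self.gate = nn.Sequential(
            nn.Linear(prior_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, feat_dim),
            nn.Sigmoid(),
        )

    def forward(self, visual_feat: torch.Tensor, language_prior: torch.Tensor) -> torch.Tensor:
        # visual_feat:    (B, feat_dim)  human or object appearance feature
        # language_prior: (B, prior_dim) embedding of the HOI category being scored
        weights = self.gate(language_prior)   # (B, feat_dim) channel weights
        return visual_feat * weights          # polysemy-aware, re-weighted feature


if __name__ == "__main__":
    lpca = LanguagePriorChannelAttention()
    feat = torch.randn(4, 2048)    # e.g., ROI-pooled appearance features
    prior = torch.randn(4, 300)    # e.g., GloVe embeddings of candidate categories
    print(lpca(feat, prior).shape)  # torch.Size([4, 2048])
```

A similar gating pattern could, in principle, also condition the pose and spatial streams on language priors, in the spirit of the LPFA module described above.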
Pages: 1910 - 1929
Number of pages: 20
Related Papers
50 records in total
  • [31] Human-Object Interaction Detection Based on Star Graph
    Cai, Shuang
    Ma, Shiwei
    Gu, Dongzhou
    Wang, Chang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (09)
  • [32] Affordance Transfer Learning for Human-Object Interaction Detection
    Hou, Zhi
    Yu, Baosheng
    Qiao, Yu
    Peng, Xiaojiang
    Tao, Dacheng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 495 - 504
  • [33] Structured LSTM for Human-Object Interaction Detection and Anticipation
    Anh Minh Truong
    Yoshitaka, Atsuo
    2017 14TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS), 2017,
  • [34] Spatial-Net for Human-Object Interaction Detection
    Mansour, Ahmed E.
    Mohammed, Ammar
    Elsayed, Hussein Abd El Atty
    Elramly, Salwa
    IEEE ACCESS, 2022, 10 : 88920 - 88931
  • [35] Deep Contextual Attention for Human-Object Interaction Detection
    Wang, Tiancai
    Anwer, Rao Muhammad
    Khan, Muhammad Haris
    Khan, Fahad Shahbaz
    Pang, Yanwei
    Shao, Ling
    Laaksonen, Jorma
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 5693 - 5701
  • [37] Human-Object Interaction Detection via Disentangled Transformer
    Zhou, Desen
    Liu, Zhichao
    Wang, Jian
    Wang, Leshan
    Hu, Tao
    Ding, Errui
    Wang, Jingdong
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 19546 - 19555
  • [38] Human-Object Interaction Detection with Ratio-Transformer
    Wang, Tianlang
    Lu, Tao
    Fang, Wenhua
    Zhang, Yanduo
    SYMMETRY-BASEL, 2022, 14 (08):
  • [39] Geometric Features Enhanced Human-Object Interaction Detection
    Zhu, Manli
    Ho, Edmond S. L.
    Chen, Shuang
    Yang, Longzhi
    Shum, Hubert P. H.
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 1
  • [40] Transferable Interactiveness Knowledge for Human-Object Interaction Detection
    Li, Yong-Lu
    Liu, Xinpeng
    Wu, Xiaoqian
    Huang, Xijie
    Xu, Liang
    Lu, Cewu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) : 3870 - 3882