Polysemy Deciphering Network for Robust Human-Object Interaction Detection

Cited by: 30
Authors
Zhong, Xubin [1]
Ding, Changxing [1, 2]
Qu, Xian [1]
Tao, Dacheng [3]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510000, Peoples R China
[2] Pazhou Lab, Guangzhou 510330, Peoples R China
[3] JD Com, JD Explore Acad, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Human-object interaction; Verb polysemy; Language priors; Attention model
DOI
10.1007/s11263-021-01458-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human-Object Interaction (HOI) detection is important to human-centric scene understanding tasks. Existing works tend to assume that the same verb has similar visual characteristics in different HOI categories, an approach that ignores the diverse semantic meanings of the verb. To address this issue, in this paper, we propose a novel Polysemy Deciphering Network (PD-Net) that decodes the visual polysemy of verbs for HOI detection in three distinct ways. First, we refine features for HOI detection to be polysemy-aware through the use of two novel modules: namely, Language Prior-guided Channel Attention (LPCA) and Language Prior-based Feature Augmentation (LPFA). LPCA highlights important elements in human and object appearance features for each HOI category to be identified; moreover, LPFA augments human pose and spatial features for HOI detection using language priors, enabling the verb classifiers to receive language hints that reduce intra-class variation for the same verb. Second, we introduce a novel Polysemy-Aware Modal Fusion module, which guides PD-Net to make decisions based on feature types deemed more important according to the language priors. Third, we propose to relieve the verb polysemy problem through sharing verb classifiers for semantically similar HOI categories. Furthermore, to expedite research on the verb polysemy problem, we build a new benchmark dataset named HOI-VerbPolysemy (HOI-VP), which includes common verbs (predicates) that have diverse semantic meanings in the real world. Finally, through deciphering the visual polysemy of verbs, our approach is demonstrated to outperform state-of-the-art methods by significant margins on the HICO-DET, V-COCO, and HOI-VP databases. Code and data in this paper are available at .
Pages: 1910-1929
Number of pages: 20