Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection

Cited by: 13
|
Authors
Liang, Zhijun [1 ]
Liu, Junfa [1 ]
Guan, Yisheng [1 ]
Rojas, Juan [2 ]
Affiliations
[1] Guangdong Univ Technol, Sch Electromech Engn, Biomimet & Intelligent Robot Lab BIRL, Guangzhou 510006, Peoples R China
[2] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Hong Kong, Peoples R China
Keywords
DOI
10.1109/ROBIO54168.2021.9739429
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In scene understanding, robots benefit not only from detecting individual scene instances but also from learning their possible interactions. Human-Object Interaction (HOI) Detection infers the action predicate on a <human, predicate, object> triplet. Contextual information has been found critical in inferring interactions. However, most works only use local features from single human-object pairs for inference. Few works have studied the disambiguating contribution of subsidiary relations made available via graph networks. Similarly, few have leveraged visual cues together with the intrinsic semantic regularities embedded in HOIs. We contribute Visual-Semantic Graph Attention Networks (VS-GATs): a dual-graph attention network that dynamically aggregates visual, spatial, and semantic contextual information from primary human-object relations as well as subsidiary relations through attention mechanisms for strong disambiguating power. We achieve competitive results on two benchmarks: V-COCO and HICO-DET. The code is available at https://github.com/birlrobotics/vs-gats.
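The attention-based aggregation the abstract describes can be sketched in simplified form as attention-weighted message passing over a human-object relation graph. This is a minimal illustrative sketch, not the authors' implementation: the toy graph, feature dimensions, and the single attention vector `w_att` are all assumptions made for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_attention_aggregate(node_feats, edges, w_att):
    """One round of attention-weighted message passing.

    node_feats: (N, D) per-node features (e.g. visual + spatial +
                semantic embeddings of humans and objects).
    edges: dict mapping each node index to its neighbour indices.
    w_att: (2*D,) attention vector (random here, learned in practice).
    Returns (N, D) features where each node aggregates its neighbours
    weighted by attention scores, so informative subsidiary relations
    can contribute more to disambiguation.
    """
    out = np.zeros_like(node_feats)
    for i, nbrs in edges.items():
        # Score each neighbour j against node i via concatenated features.
        scores = np.array([
            np.tanh(np.concatenate([node_feats[i], node_feats[j]]) @ w_att)
            for j in nbrs
        ])
        alpha = softmax(scores)  # attention weights sum to 1
        # Aggregate neighbour messages by attention weight.
        out[i] = sum(a * node_feats[j] for a, j in zip(alpha, nbrs))
    return out

# Toy graph: node 0 is a human, nodes 1-2 are objects; edges connect the
# human to both objects (a primary and a subsidiary relation).
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 8))
edges = {0: [1, 2], 1: [0], 2: [0]}
w = rng.normal(size=16)
updated = graph_attention_aggregate(feats, edges, w)
print(updated.shape)  # (3, 8)
```

In the full VS-GAT model this style of aggregation runs over separate visual and semantic graphs whose outputs are fused before predicate inference; the sketch above shows only the single-graph attention step.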
Pages: 1441-1447
Page count: 7
Related Papers
50 items in total
  • [41] Relational Context Learning for Human-Object Interaction Detection
    Kim, Sanghyun
    Jung, Deunsol
    Cho, Minsu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2925 - 2934
  • [42] DGIG-Net: Dynamic Graph-in-Graph Networks for Few-Shot Human-Object Interaction
    Liu, Xiyao
    Ji, Zhong
    Pang, Yanwei
    Han, Jungong
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (08) : 7852 - 7864
  • [43] DSSF: Dynamic Semantic Sampling and Fusion for One-Stage Human-Object Interaction Detection
    Gu, Dongzhou
    Ma, Shiwei
    Cai, Shuang
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [44] Graph-based method for human-object interactions detection
    Xia, Li-min
    Wu, Wei
    JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2021, 28 (01) : 205 - 218
  • [45] Neural-Logic Human-Object Interaction Detection
    Li, Liulei
    Wei, Jianan
    Wang, Wenguan
    Yang, Yi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [46] Double Graph Attention Networks for Visual Semantic Navigation
    Lyu, Yunlian
    Talebi, Mohammad Sadegh
    NEURAL PROCESSING LETTERS, 2023, 55 (07) : 9019 - 9040
  • [48] Parallel disentangling network for human-object interaction detection
    Cheng, Yamin
    Duan, Hancong
    Wang, Chen
    Chen, Zhijun
    PATTERN RECOGNITION, 2024, 146
  • [49] Transferable Interactiveness Knowledge for Human-Object Interaction Detection
    Li, Yong-Lu
    Zhou, Siyuan
    Huang, Xijie
    Xu, Liang
    Ma, Ze
    Fang, Hao-Shu
    Wang, Yan-Feng
    Lu, Cewu
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3580 - 3589
  • [50] Affordance Transfer Learning for Human-Object Interaction Detection
    Hou, Zhi
    Yu, Baosheng
    Qiao, Yu
    Peng, Xiaojiang
    Tao, Dacheng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 495 - 504