Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection

Cited by: 13

Authors
Liang, Zhijun [1 ]
Liu, Junfa [1 ]
Guan, Yisheng [1 ]
Rojas, Juan [2 ]
Affiliations
[1] Guangdong Univ Technol, Sch Electromech Engn, Biomimet & Intelligent Robot Lab BIRL, Guangzhou 510006, Peoples R China
[2] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Hong Kong, Peoples R China
DOI
10.1109/ROBIO54168.2021.9739429
CLC number
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
In scene understanding, robots benefit from not only detecting individual scene instances but also from learning their possible interactions. Human-Object Interaction (HOI) Detection infers the action predicate on a <human, predicate, object> triplet. Contextual information has been found critical in inferring interactions. However, most works only use local features from single human-object pairs for inference. Few works have studied the disambiguating contribution of subsidiary relations made available via graph networks. Similarly, few have leveraged visual cues with the intrinsic semantic regularities embedded in HOIs. We contribute Visual-Semantic Graph Attention Networks (VS-GATs): a dual-graph attention network that effectively aggregates visual, spatial, and semantic contextual information dynamically from primary human-object relations as well as subsidiary relations through attention mechanisms for strong disambiguating power. We achieve competitive results on two benchmarks: V-COCO and HICO-DET. The code is available at https://github.com/birlrobotics/vs-gats.
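The core operation the abstract describes, aggregating neighbor features over a graph with attention weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the attention parameter `w_a`, the edge list, and the single-head additive scoring are hypothetical simplifications of a generic graph-attention layer.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of logits.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_aggregate(node_feats, edges, w_a):
    """For each node, aggregate neighbor features weighted by attention.

    node_feats: (n, d) array of node feature vectors.
    edges: list of directed (src, dst) pairs; dst nodes are src's neighbors.
    w_a: (2*d,) hypothetical attention parameter scoring each edge via
         a dot product with the concatenated [h_src ; h_dst] features.
    """
    n, _ = node_feats.shape
    out = np.zeros_like(node_feats)
    for i in range(n):
        nbrs = [j for (src, j) in edges if src == i]
        if not nbrs:
            out[i] = node_feats[i]  # isolated node: keep its own features
            continue
        # One attention logit per edge (i, j).
        logits = np.array(
            [w_a @ np.concatenate([node_feats[i], node_feats[j]]) for j in nbrs]
        )
        alpha = softmax(logits)  # attention weights sum to 1 over neighbors
        out[i] = sum(a * node_feats[j] for a, j in zip(alpha, nbrs))
    return out
```

With `w_a` set to zeros the attention is uniform, so a node's output is simply the mean of its neighbors' features; a learned `w_a` would let the network emphasize the subsidiary relations that best disambiguate an interaction.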
Pages: 1441-1447 (7 pages)
Related papers (50 total)
  • [21] Language-guided graph parsing attention network for human-object interaction recognition
    Li, Qiyue
    Xie, Xuemei
    Zhang, Jin
    Shi, Guangming
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 89
  • [22] Hierarchical Graph Attention Network for Few-shot Visual-Semantic Learning
    Yin, Chengxiang
    Wu, Kun
    Che, Zhengping
    Jiang, Bo
    Xu, Zhiyuan
    Tang, Jian
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 2157 - 2166
  • [23] An Improved Human-Object Interaction Detection Network
    Gao, Song
    Wang, Hongyu
    Song, Jilai
    Xu, Fang
    Zou, Fengshan
    PROCEEDINGS OF 2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION (IEEE-ASID'2019), 2019, : 192 - 196
  • [24] Improved Visual-Semantic Alignment for Zero-Shot Object Detection
    Rahman, Shafin
    Khan, Salman
    Barnes, Nick
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11932 - 11939
  • [25] Human-object interaction detection with missing objects
    Kogashi, Kaen
    Wu, Yang
    Nobuhara, Shohei
    Nishino, Ko
    IMAGE AND VISION COMPUTING, 2021, 113
  • [26] Distance Matters in Human-Object Interaction Detection
    Wang, Guangzhi
    Guo, Yangyang
    Wong, Yongkang
    Kankanhalli, Mohan
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4546 - 4554
  • [27] Agglomerative Transformer for Human-Object Interaction Detection
    Tu, Danyang
    Sun, Wei
    Zhai, Guangtao
    Shen, Wei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 21557 - 21567
  • [28] Diagnosing Rarity in Human-object Interaction Detection
    Kilickaya, Mert
    Smeulders, Arnold
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3956 - 3960
  • [29] Human-Object Interaction Detection with Missing Objects
    Kogashi, Kaen
    Wu, Yang
    Nobuhara, Shohei
    Nishino, Ko
    PROCEEDINGS OF 17TH INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA 2021), 2021,
  • [30] Parallel Queries for Human-Object Interaction Detection
    Chen, Junwen
    Yanai, Keiji
    PROCEEDINGS OF THE 4TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA IN ASIA, MMASIA 2022, 2022,