Parallel Queries for Human-Object Interaction Detection

Cited by: 1
Authors
Chen, Junwen [1]
Yanai, Keiji [1]
Affiliation
[1] Univ Elect Commun, Tokyo, Japan
Keywords
human-object interaction detection; object detection; transformer
DOI
10.1145/3551626.3564944
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Human-Object Interaction (HOI) detection requires localizing pairs of humans and objects. Recent transformer-based methods use query embeddings to represent entire HOI instances, so each decoded embedding must represent the object and the human characteristics at the same time. Using such highly integrated embeddings to localize the human and the object simultaneously is ambiguous. To address this problem, we split detection decoding into subject decoding and object decoding so that humans and objects are detected in parallel. Our proposed method, Parallel Query Network (PQNet), uses two transformer decoders to decode the subject embeddings and object embeddings in parallel, and a novel verb decoder fuses the representations from detection decoding to predict the interaction. The verb decoder applies two attention mechanisms: attention between the human and object embeddings, and attention between the fused embeddings and global semantic features. Because the transformer architecture preserves the order of the input query embeddings, feed-forward networks predict the paired human and object boxes directly. By making full use of the object detection part, our proposed architecture outperforms the state-of-the-art baseline method with half of the training epochs.
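The index-based pairing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `cross_attention` and `decode_parallel` are hypothetical names, the attention is single-head with no learned projections, and self-attention, feed-forward heads, and the verb decoder are omitted. It only shows that when two decoders process query lists of the same length and each output depends on its own query and the shared image features, output i of the subject branch can be paired with output i of the object branch, and reordering the queries reorders the pairs in the same way.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, memory):
    """Scaled dot-product attention of each query over the memory vectors.

    Assumption: no learned projections and a single head, purely for
    illustration. Each output row depends only on its own query.
    """
    dim = len(memory[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in memory
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * k[j] for w, k in zip(weights, memory)) for j in range(dim)]
        )
    return outputs


def decode_parallel(subject_queries, object_queries, memory):
    """Run two independent branches and pair their outputs by query index."""
    subject_out = cross_attention(subject_queries, memory)
    object_out = cross_attention(object_queries, memory)
    return list(zip(subject_out, object_out))


def random_vectors(count, dim, rng):
    """Hypothetical helper that stands in for learned embeddings."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
memory = random_vectors(6, 4, rng)           # stand-in for image features
subject_queries = random_vectors(3, 4, rng)  # one query per candidate human
object_queries = random_vectors(3, 4, rng)   # one query per candidate object

pairs = decode_parallel(subject_queries, object_queries, memory)

# Reordering both query lists the same way reorders the pairs the same way.
order = [2, 0, 1]
shuffled_pairs = decode_parallel(
    [subject_queries[i] for i in order],
    [object_queries[i] for i in order],
    memory,
)
```

In this sketch `shuffled_pairs[k]` equals `pairs[order[k]]`, which is the property that lets index-aligned outputs be read as human-object pairs without a separate matching step between the two branches.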
Pages: 8