Effective actor-centric human-object interaction detection

Cited by: 3
Authors
Xu, Kunlun [1]
Li, Zhimin [1]
Zhang, Zhijun [1]
Dong, Leizhen [1]
Xu, Wenhui [1]
Yan, Luxin [1]
Zhong, Sheng [1]
Zou, Xu [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Human-object interaction detection; Global context utilizing; Pixel-wise prediction; Deep learning;
DOI
10.1016/j.imavis.2022.104422
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
While Human-Object Interaction (HOI) detection has advanced tremendously in recent years, it remains challenging because complex interactions among multiple humans and objects in an image inevitably lead to ambiguities. Most existing methods either generate all human-object pair candidates and then infer their relationships from cropped local features in a two-stage manner, or directly predict interaction points in a one-stage procedure. However, two-stage methods lack spatial configurations and one-stage methods lack reasoning steps, which limits their performance in such complex scenes. To avoid this ambiguity, we propose a novel actor-centric framework. The main ideas are that, when inferring interactions: 1) the non-local features of the entire image, guided by the actor position, are obtained to model the relationship between the actor and the context, and then 2) an object branch generates a pixel-wise interaction area prediction, where the interaction area denotes the object central area. Moreover, we use an actor branch to obtain the interaction prediction of the actor and propose a novel composition strategy based on center-point indexing to generate the final HOI prediction. Thanks to the non-local features and the partly-coupled property of the human-object composition strategy, the proposed framework detects HOIs more accurately, especially in complex images. Extensive experimental results show that our method achieves state-of-the-art results on the challenging V-COCO and HICO-DET benchmarks and is more robust, especially in scenes with multiple persons and/or objects. (c) 2022 Elsevier B.V. All rights reserved.
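The abstract does not spell out how center-point indexing combines the two branches, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that, for one actor, the actor branch yields a per-verb score vector and the object branch yields one pixel-wise interaction-area map per verb; each detected object then reads its score from the map at its box center. The function name `compose_hoi`, the array layouts, and the choice to multiply the three scores are all hypothetical.

```python
import numpy as np


def compose_hoi(actor_scores, area_maps, object_boxes, object_scores):
    """Hypothetical center-point indexing for a single actor.

    actor_scores:  (V,) per-verb scores from the actor branch (assumed layout).
    area_maps:     (V, H, W) per-verb interaction-area maps from the object
                   branch for this actor (assumed layout).
    object_boxes:  (N, 4) boxes as [x1, y1, x2, y2] in map pixel coordinates.
    object_scores: (N,) detection confidences.
    Returns an (N, V) array of actor-object-verb scores.
    """
    _, height, width = area_maps.shape

    # Box centers, floored to integer pixel indices and clamped to the map.
    cx = np.floor((object_boxes[:, 0] + object_boxes[:, 2]) / 2).astype(int)
    cy = np.floor((object_boxes[:, 1] + object_boxes[:, 3]) / 2).astype(int)
    cx = np.clip(cx, 0, width - 1)
    cy = np.clip(cy, 0, height - 1)

    # Read every verb map at each object's center: (V, N), then (N, V).
    area_at_center = area_maps[:, cy, cx].T

    # Assumption: the final score is the product of the three confidences.
    return actor_scores[None, :] * area_at_center * object_scores[:, None]


# Toy example: two verbs on a 4x4 map, two candidate objects.
area_maps = np.zeros((2, 4, 4))
area_maps[0, 1, 2] = 0.8  # verb 0 peaks at pixel (y=1, x=2)
actor_scores = np.array([0.5, 0.9])
object_boxes = np.array([[1.0, 0.0, 3.0, 2.0],   # center (x=2, y=1)
                         [0.0, 2.0, 1.0, 3.0]])  # center floors to (x=0, y=2)
object_scores = np.array([1.0, 1.0])
hoi_scores = compose_hoi(actor_scores, area_maps, object_boxes, object_scores)
```

In this toy case only the first object sits on the predicted interaction area, so it is the only one that receives a non-zero score for verb 0. Under this reading, actor and object predictions are made separately and are tied together only through the indexed map value.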
Pages: 9
Related papers
50 in total
  • [1] Human-Centric Parsing Network for Human-Object Interaction Detection
    Chen, Guanyu
    Chen, Chong
    Zhao, Zhicheng
    Su, Fei
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 5488 - 5494
  • [2] Object Centric Body Part Attention Network for Human-Object Interaction Detection
    Liu, Zhuang
    Zhang, Xiaowei
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XII, 2024, 14436 : 378 - 391
  • [3] iCGPN: Interaction-centric graph parsing network for human-object interaction detection
    Yang, Wenhao
    Chen, Guanyu
    Zhao, Zhicheng
    Su, Fei
    Meng, Hongying
    [J]. NEUROCOMPUTING, 2022, 502 : 98 - 109
  • [4] Actor-Centric Relation Network
    Sun, Chen
    Shrivastava, Abhinav
    Vondrick, Carl
    Murphy, Kevin
    Sukthankar, Rahul
    Schmid, Cordelia
    [J]. COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215 : 335 - 351
  • [5] A Survey of Human-Object Interaction Detection
    Gong, Xun
    Zhang, Zhiying
    Liu, Lu
    Ma, Bing
    Wu, Kunlun
    [J]. Xinan Jiaotong Daxue Xuebao/Journal of Southwest Jiaotong University, 2022, 57 (04): : 693 - 704
  • [6] Actor-centric modeling of user rights
    Breu, R
    Popp, G
    [J]. FUNDAMENTAL APPROACHES TO SOFTWARE ENGINEERING, PROCEEDINGS, 2004, 2984 : 165 - 179
  • [7] An Improved Human-Object Interaction Detection Network
    Gao, Song
    Wang, Hongyu
    Song, Jilai
    Xu, Fang
    Zou, Fengshan
    [J]. PROCEEDINGS OF 2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION (IEEE-ASID'2019), 2019, : 192 - 196
  • [8] Distance Matters in Human-Object Interaction Detection
    Wang, Guangzhi
    Guo, Yangyang
    Wong, Yongkang
    Kankanhalli, Mohan
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4546 - 4554
  • [9] Human-object interaction detection with missing objects
    Kogashi, Kaen
    Wu, Yang
    Nobuhara, Shohei
    Nishino, Ko
    [J]. IMAGE AND VISION COMPUTING, 2021, 113
  • [10] YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset
    Wei, Donglai
    Kharbanda, Siddhant
    Arora, Sarthak
    Roy, Roshan
    Jain, Nishant
    Palrecha, Akash
    Shah, Tanav
    Mathur, Shray
    Mathur, Ritik
    Kemkar, Abhijay
    Chakravarthy, Anirudh
    Lin, Zudi
    Jang, Won-Dong
    Tang, Yansong
    Bai, Song
    Tompkin, James
    Torr, Philip H. S.
    Pfister, Hanspeter
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 21012 - 21021