PPDM++: Parallel Point Detection and Matching for Fast and Accurate HOI Detection

Cited by: 3
Authors
Liao, Yue [1]
Liu, Si [1]
Gao, Yulu [1]
Zhang, Aixi [1]
Li, Zhimin [2]
Wang, Fei [3]
Li, Bo [1]
Affiliations
[1] Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[3] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230052, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Proposals; Feature extraction; Task analysis; Detectors; Real-time systems; Matched filters; Bicycles; Human-object interaction detection; visual relationship detection; one-stage detector; dataset;
DOI
10.1109/TPAMI.2024.3386891
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human-Object Interaction (HOI) detection aims to understand human activities by detecting interaction triplets. Previous HOI detection methods adopt a two-stage instance-driven paradigm. Unfortunately, the many non-interactive human-object pairs generated by the first stage are the main obstacle to high efficiency and strong performance in HOI detectors. To remedy this, we propose a novel top-down interaction-driven paradigm that detects interactions first and bridges interactive human-object pairs through interactions. We formulate HOI as a point triplet <human point, interaction point, object point> and design a Parallel Point Detection and Matching (PPDM) framework. We further take advantage of two-stage methods and propose a novel framework, PPDM++, that detects the interactive human-object pairs by PPDM, then extracts region features for each pair to predict actions. The core of PPDM/PPDM++ is to convert the instance-driven bottom-up paradigm to an interaction-driven top-down paradigm, thus avoiding the additional computation cost of traversing a tremendous number of non-interactive pairs. Benefiting from this paradigm, PPDM/PPDM++ achieves significant performance gains with high efficiency. PPDM-DLA-34 achieves 19.94 mAP at 42 FPS as the first real-time HOI detector, and PPDM++-SwinB achieves 30.1 mAP at 17 FPS on the HICO-DET dataset. We also built an application-oriented database named HOI-A, a supplement to the existing datasets.
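The interaction-driven matching idea in the abstract can be illustrated with a small sketch. The abstract does not give the matching rule, so everything specific here is an assumption made for illustration: the hypothetical `InteractionPoint` class, the idea that each interaction point carries regressed offsets toward its human and object points, and the nearest-neighbour assignment. It is not the authors' implementation.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class InteractionPoint:
    """Hypothetical container for one detected interaction point."""
    location: Point    # where the interaction was detected
    to_human: Point    # assumed regressed offset toward the human point
    to_object: Point   # assumed regressed offset toward the object point
    action: str


def nearest(target: Point, candidates: List[Point]) -> int:
    """Index of the candidate point closest to target (Euclidean distance)."""
    return min(
        range(len(candidates)),
        key=lambda i: hypot(candidates[i][0] - target[0],
                            candidates[i][1] - target[1]),
    )


def match_triplets(interactions: List[InteractionPoint],
                   human_points: List[Point],
                   object_points: List[Point]) -> List[Tuple[int, str, int]]:
    """Bridge human and object points through each interaction point.

    Work scales with the number of detected interactions, not with the
    number of all possible human-object pairs.
    """
    triplets = []
    for ip in interactions:
        human_target = (ip.location[0] + ip.to_human[0],
                        ip.location[1] + ip.to_human[1])
        object_target = (ip.location[0] + ip.to_object[0],
                         ip.location[1] + ip.to_object[1])
        triplets.append((nearest(human_target, human_points),
                         ip.action,
                         nearest(object_target, object_points)))
    return triplets


# Toy data: two humans, two objects, two detected interactions.
humans = [(10.0, 10.0), (80.0, 20.0)]
objects = [(30.0, 10.0), (90.0, 60.0)]
interactions = [
    InteractionPoint((20.0, 10.0), (-9.0, 0.5), (11.0, -1.0), "ride"),
    InteractionPoint((85.0, 40.0), (-4.0, -21.0), (6.0, 19.0), "hold"),
]
triplets = match_triplets(interactions, humans, objects)
print(triplets)
```

In this toy setup only the two detected interactions are processed, whereas an instance-driven pipeline would score all four human-object pairs. That gap widens as the number of detected instances grows.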
Pages: 6826-6841
Page count: 16