Detecting Human-Object Contact in Images

被引:3
|
作者
Chen, Yixin [1 ]
Dwivedi, Sai Kumar [2 ]
Black, Michael J. [2 ]
Tzionas, Dimitrios [3 ]
机构
[1] Beijing Inst Gen Artificial Intelligence, Beijing, Peoples R China
[2] Max Planck Inst Intelligent Syst, Tubingen, Germany
[3] Univ Amsterdam, Amsterdam, Netherlands
关键词
D O I
10.1109/CVPR52729.2023.01640
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Humans constantly contact objects to move and perform tasks. Thus, detecting human-object contact is important for building human-centered artificial intelligence. However, there exists no robust method to detect contact between the body and the scene from an image, and there exists no dataset to learn such a detector. We fill this gap with HOT ("Human-Object conTact"), a new dataset of human-object contacts in images. To build HOT, we use two data sources: (1) We use the PROX dataset of 3D human meshes moving in 3D scenes, and automatically annotate 2D image areas for contact via 3D mesh proximity and projection. (2) We use the V-COCO, HAKE and Watch-n-Patch datasets, and ask trained annotators to draw polygons around the 2D image areas where contact takes place. We also annotate the involved body part of the human body. We use our HOT dataset to train a new contact detector, which takes a single color image as input, and outputs 2D contact heatmaps as well as the body-part labels that are in contact. This is a new and challenging task, that extends current foot-ground or hand-object contact detectors to the full generality of the whole body. The detector uses a part-attention branch to guide contact estimation through the context of the surrounding body parts and scene. We evaluate our detector extensively, and quantitative results show that our model outperforms baselines, and that all components contribute to better performance. Results on images from an online repository show reasonable detections and generalizability. Our HOT data and model are available for research at https://hot.is.tue.mpg.de.
引用
收藏
页码:17100 / 17110
页数:11
相关论文
共 50 条
  • [1] Detecting and Recognizing Human-Object Interactions
    Gkioxari, Georgia
    Girshick, Ross
    Dollar, Piotr
    He, Kaiming
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 8359 - 8367
  • [2] Detecting Human-Object Relationships in Videos
    Ji, Jingwei
    Desai, Rishi
    Niebles, Juan Carlos
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 8086 - 8096
  • [3] Detecting Human-Object Interaction with Mixed Supervision
    Kumaraswamy, Suresh Kirthi
    Shi, Miaojing
    Kijak, Ewa
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1227 - 1236
  • [4] Detecting Subtle Human-Object Interactions Using Kinect
    Ubalde, Sebastian
    Liu, Zicheng
    Mejail, Marta
    [J]. PROGRESS IN PATTERN RECOGNITION IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2014, 2014, 8827 : 770 - 777
  • [5] Detecting Human-Object Interactions via Functional Generalization
    Bansal, Ankan
    Rambhatla, Sai Saketh
    Shrivastava, Abhinav
    Chellappa, Rama
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 10460 - 10469
  • [6] Spatially Conditioned Graphs for Detecting Human-Object Interactions
    Zhang, Frederic Z.
    Campbell, Dylan
    Gould, Stephen
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 13299 - 13307
  • [7] Spatially Conditioned Graphs for Detecting Human-Object Interactions
    Zhang, Frederic Z.
    Campbell, Dylan
    Gould, Stephen
    [J]. Proceedings of the IEEE International Conference on Computer Vision, 2021, : 13299 - 13307
  • [8] HICO: A Benchmark for Recognizing Human-Object Interactions in Images
    Chao, Yu-Wei
    Wang, Zhan
    He, Yugeng
    Wang, Jiaxuan
    Deng, Jia
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 1017 - 1025
  • [9] Exploring Predicate Visual Context in Detecting of Human-Object Interactions
    Zhang, Frederic Z.
    Yuan, Yuhui
    Campbell, Dylan
    Zhong, Zhuoyao
    Gould, Stephen
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 10377 - 10387
  • [10] Detecting Human-Object Interaction via Fabricated Compositional Learning
    Hou, Zhi
    Yu, Baosheng
    Qiao, Yu
    Peng, Xiaojiang
    Tao, Dacheng
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 14641 - 14650