Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses

Cited by: 148
Authors
Yao, Bangpeng [1]
Fei-Fei, Li [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
Funding
National Science Foundation (USA)
Keywords
Mutual context; action recognition; human pose estimation; object detection; conditional random field;
DOI
10.1109/TPAMI.2012.67
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Detecting objects in cluttered scenes and estimating articulated human body parts from 2D images are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g., playing tennis), where the relevant objects tend to be small or only partially visible and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context for each other: recognizing one facilitates the recognition of the other. In this paper, we propose a mutual context model to jointly model objects and human poses in human-object interaction activities. In our approach, object detection provides a strong prior for better human pose estimation, while human pose estimation improves the accuracy of detecting the objects that interact with the human. On a six-class sports data set and a 24-class data set of people interacting with musical instruments, we show that our mutual context model outperforms the state of the art in detecting very difficult objects and estimating human poses, as well as in classifying human-object interaction activities.
Pages: 1691-1703
Number of pages: 13
Related papers
50 in total
  • [21] Learning Human-Object Interactions by Attention Aggregation
    Gu, Dongzhou
    Cai, Shuang
    Ma, Shiwei
    [J]. SIXTH INTERNATIONAL WORKSHOP ON PATTERN RECOGNITION, 2021, 11913
  • [22] Predicting Human-Object Interactions in Egocentric Videos
    Benavent-Lledo, Manuel
    Oprea, Sergiu
    Alejandro Castro-Vargas, John
    Mulero-Perez, David
    Garcia-Rodriguez, Jose
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [23] Language for Learning Complex Human-Object Interactions
    Patel, Mitesh
    Ek, Carl Henrik
    Kyriazis, Nikolaos
    Argyros, Antonis
    Miro, Jaime Valls
    Kragic, Danica
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2013, : 4997 - 5002
  • [24] Learning to Detect Human-Object Interactions with Knowledge
    Xu, Bingjie
    Wong, Yongkang
    Li, Junnan
    Zhao, Qi
    Kankanhalli, Mohan S.
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 2019 - 2028
  • [25] NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions
    Zhang, Juze
    Luo, Haimin
    Yang, Hongdi
    Xu, Xinru
    Wu, Qianyang
    Shi, Ye
    Yu, Jingyi
    Xu, Lan
    Wang, Jingya
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 8834 - 8845
  • [26] Relational Context Learning for Human-Object Interaction Detection
    Kim, Sanghyun
    Jung, Deunsol
    Cho, Minsu
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2925 - 2934
  • [27] Recognizing Human Actions From Still Images
    Kilickaya, Mert
    Telatar, Ziya
    [J]. 2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013
  • [28] Predicting the Location of "interactees" in Novel Human-Object Interactions
    Chen, Chao-Yeh
    Grauman, Kristen
    [J]. COMPUTER VISION - ACCV 2014, PT I, 2015, 9003 : 351 - 367
  • [29] Detecting Subtle Human-Object Interactions Using Kinect
    Ubalde, Sebastian
    Liu, Zicheng
    Mejail, Marta
    [J]. PROGRESS IN PATTERN RECOGNITION IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2014, 2014, 8827 : 770 - 777
  • [30] Exemplar-Based Recognition of Human-Object Interactions
    Hu, Jian-Fang
    Zheng, Wei-Shi
    Lai, Jianhuang
    Gong, Shaogang
    Xiang, Tao
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2016, 26 (04) : 647 - 660