Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses

Cited by: 148
Authors
Yao, Bangpeng [1 ]
Fei-Fei, Li [1 ]
Institution
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation
Keywords
Mutual context; action recognition; human pose estimation; object detection; conditional random field
DOI
10.1109/TPAMI.2012.67
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Detecting objects in cluttered scenes and estimating articulated human body parts from 2D images are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g., playing tennis), where the relevant objects tend to be small or only partially visible and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context to each other: recognizing one facilitates the recognition of the other. In this paper, we propose a mutual context model to jointly model objects and human poses in human-object interaction activities. In our approach, object detection provides a strong prior for better human pose estimation, while human pose estimation improves the accuracy of detecting the objects that interact with the human. On a six-class sports data set and a 24-class data set of people interacting with musical instruments, we show that our mutual context model outperforms the state of the art in detecting very difficult objects and estimating human poses, as well as in classifying human-object interaction activities.
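The abstract describes a conditional random field that jointly scores object detections and human pose estimates so that each re-ranks the other. The toy sketch below is a minimal illustration of that idea, assuming made-up labels, unary scores, and compatibility weights; it is not the authors' model or their learned potentials. With a strong pairwise context term, a confident pose hypothesis can flip a weak object detection.

```python
# Minimal sketch of mutual-context scoring: an object hypothesis and a pose
# hypothesis are scored jointly, so each one can re-rank the other.
# All scores, labels, and weights are illustrative placeholders.
import itertools

# Hypothetical unary scores from independent detectors/estimators.
object_scores = {"tennis_racket": 0.45, "croquet_mallet": 0.55}   # object detector
pose_scores   = {"tennis_forehand": 0.80, "croquet_shot": 0.20}   # pose estimator

# Hypothetical pairwise compatibility (the mutual-context term): how plausibly
# an object co-occurs with a pose in a human-object interaction activity.
compatibility = {
    ("tennis_racket",  "tennis_forehand"): 1.0,
    ("tennis_racket",  "croquet_shot"):    0.1,
    ("croquet_mallet", "tennis_forehand"): 0.1,
    ("croquet_mallet", "croquet_shot"):    1.0,
}

def joint_score(obj, pose, w_obj=1.0, w_pose=1.0, w_ctx=2.0):
    """Score one joint (object, pose) configuration: unary terms plus a context term."""
    return (w_obj * object_scores[obj]
            + w_pose * pose_scores[pose]
            + w_ctx * compatibility[(obj, pose)])

# Exhaustive joint inference over the tiny hypothesis sets.
best_obj, best_pose = max(itertools.product(object_scores, pose_scores),
                          key=lambda pair: joint_score(*pair))

print("object winner (detector alone):", max(object_scores, key=object_scores.get))
print("object winner (joint model):   ", best_obj, "with pose", best_pose)
```

In this toy setting the detector alone prefers the wrong object, but the joint score picks the object that is compatible with the confidently estimated pose, which is the behavior the mutual context model is designed to capture.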
Pages: 1691 - 1703
Page count: 13
Related Papers
50 in total
  • [31] Skew-Robust Human-Object Interactions in Videos
    Agarwal, Apoorva
    Dabral, Rishabh
    Jain, Arjun
    Ramakrishnan, Ganesh
    [J]. 2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 5087 - 5096
  • [32] Spatially Conditioned Graphs for Detecting Human-Object Interactions
    Zhang, Frederic Z.
    Campbell, Dylan
    Gould, Stephen
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 13299 - 13307
  • [33] Human-Object Interactions Are More than the Sum of Their Parts
    Baldassano, Christopher
    Beck, Diane M.
    Fei-Fei, Li
    [J]. CEREBRAL CORTEX, 2017, 27 (03) : 2276 - 2288
  • [34] Detecting Human-Object Interactions via Functional Generalization
    Bansal, Ankan
    Rambhatla, Sai Saketh
    Shrivastava, Abhinav
    Chellappa, Rama
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 10460 - 10469
  • [36] Detection of Generic Human-Object Interactions in Video Streams
    Bruckschen, Lilli
    Amft, Sabrina
    Tanke, Julian
    Gall, Juergen
    Bennewitz, Maren
    [J]. SOCIAL ROBOTICS, ICSR 2019, 2019, 11876 : 108 - 118
  • [37] Modeling 4D Human-Object Interactions for Joint Event Segmentation, Recognition, and Object Localization
    Wei, Ping
    Zhao, Yibiao
    Zheng, Nanning
    Zhu, Song-Chun
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (06) : 1165 - 1179
  • [38] A SIMILARITY MEASURE FOR ANALYZING HUMAN ACTIVITIES USING HUMAN-OBJECT INTERACTION CONTEXT
    Amiri, S. Mohsen
    Pourazad, Mahsa T.
    Nasiopoulos, Panos
    Leung, Victor C. M.
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 2368 - 2372
  • [39] Recognizing human gestures in videos by modeling the mutual context of body position and hands movement
    Gavrilescu, Mihai
    [J]. MULTIMEDIA SYSTEMS, 2017, 23 (03) : 381 - 393