Toward coherent object detection and scene layout understanding

Cited by: 16
Authors
Bao, Sid Yingze [1 ]
Sun, Min [1 ]
Savarese, Silvio [1 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48105 USA
Keywords
Object detection; Scene layout; Focal length estimation; Supporting surface estimation
DOI
10.1016/j.imavis.2011.08.001
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Detecting objects in complex scenes while recovering the scene layout is a critical functionality in many vision-based applications. In this work, we advocate the importance of geometric contextual reasoning for object recognition. We start from the intuition that objects' location and pose in 3D space are not arbitrarily distributed but rather constrained by the fact that objects must lie on one or multiple supporting surfaces. We model such supporting surfaces by means of hidden parameters (i.e. not explicitly observed) and formulate the problem of joint scene reconstruction and object recognition as that of finding the set of parameters that maximizes the joint probability of having a number of detected objects on K supporting planes given the observations. As a key ingredient for solving this optimization problem, we demonstrate a novel relationship between object location and pose in the image, and the scene layout parameters (i.e. normal of one or more supporting planes in 3D and camera pose, location and focal length). Using a novel probabilistic formulation and the above relationship, our method has the unique ability to jointly: i) reduce the false alarm and false negative rates of object detection; ii) recover object location and supporting planes within the 3D camera reference system; iii) infer camera parameters (viewpoint and focal length) from a single uncalibrated image. Quantitative and qualitative experimental evaluation on two datasets (desk-top dataset [1] and LabelMe [2]) demonstrates our theoretical claims. © 2011 Elsevier B.V. All rights reserved.
Pages: 569-579
Number of pages: 11