Monocular Visual Scene Understanding: Understanding Multi-Object Traffic Scenes

Cited by: 54
Authors
Wojek, Christian [1]
Walk, Stefan [2]
Roth, Stefan [3]
Schindler, Konrad [2]
Schiele, Bernt [1]
Affiliations
[1] Max Planck Inst Informat, D-66123 Saarbrucken, Germany
[2] ETH, Photogrammetry & Remote Sensing Grp, CH-8093 Zurich, Switzerland
[3] Tech Univ Darmstadt, GRIS, D-64283 Darmstadt, Germany
Keywords
Scene understanding; tracking; scene tracklets; tracking-by-detection; MCMC; segmentation
DOI
10.1109/TPAMI.2012.174
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Following recent advances in detection, context modeling, and tracking, scene understanding has been the focus of renewed interest in computer vision research. This paper presents a novel probabilistic 3D scene model that integrates state-of-the-art multiclass object detection, object tracking and scene labeling together with geometric 3D reasoning. Our model is able to represent complex object interactions such as inter-object occlusion, physical exclusion between objects, and geometric context. Inference in this model allows us to jointly recover the 3D scene context and perform 3D multi-object tracking from a mobile observer, for objects of multiple categories, using only monocular video as input. Contrary to many other approaches, our system performs explicit occlusion reasoning and is therefore capable of tracking objects that are partially occluded for extended periods of time, or objects that have never been observed to their full extent. In addition, we show that a joint scene tracklet model for the evidence collected over multiple frames substantially improves performance. The approach is evaluated for different types of challenging onboard sequences. We first show a substantial improvement to the state of the art in 3D multipeople tracking. Moreover, a similar performance gain is achieved for multiclass 3D tracking of cars and trucks on a challenging dataset.
Pages: 882-897
Page count: 16
Related papers
  • [1] Monocular 3D Scene Modeling and Inference: Understanding Multi-Object Traffic Scenes
    Wojek, Christian
    Roth, Stefan
    Schindler, Konrad
    Schiele, Bernt
    [J]. COMPUTER VISION-ECCV 2010, PT IV, 2010, 6314 : 467 - 481
  • [2] UNDERSTANDING OBJECT RELATIONS IN TRAFFIC SCENES
    Hensel, Irina
    Bachmann, Alexander
    Hummel, Britta
    Quan Tran
    [J]. VISAPP 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 2, 2010, : 389 - 395
  • [3] Multi-Object Detection in Traffic Scenes Based on Improved SSD
    Wang, Xinqing
    Hua, Xia
    Xiao, Feng
    Li, Yuyang
    Hu, Xiaodong
    Sun, Pengyu
    [J]. ELECTRONICS, 2018, 7 (11)
  • [4] A NOVEL CLASS ACTIVATION MAP FOR VISUAL EXPLANATIONS IN MULTI-OBJECT SCENES
    Wang, Yifan
    Deng, Siyuan
    Yuan, Kunhao
    Schaefer, Gerald
    Liu, Xiyao
    Fang, Hui
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2615 - 2619
  • [5] Understanding visual scenes
    Silberer, Carina
    Uijlings, Jasper
    Lapata, Mirella
    [J]. NATURAL LANGUAGE ENGINEERING, 2018, 24 (03) : 441 - 465
  • [6] BANet: Small and multi-object detection with a bidirectional attention network for traffic scenes
    Wang, Sheng-ye
    Qu, Zhong
    Li, Cui-jin
    Gao, Le-yuan
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [7] Multi-object Monocular SLAM for Dynamic Environments
    Nair, Gokul B.
    Daga, Swapnil
    Sajnani, Rahul
    Ramesh, Anirudha
    Ansari, Junaid Ahmed
    Jatavallabhula, Krishna Murthy
    Krishna, K. Madhava
    [J]. 2020 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2020, : 651 - 657
  • [8] Atomic Scenes for Scalable Traffic Scene Recognition in Monocular Videos
    Chen, Chao-Yeh
    Choi, Wongun
    Chandraker, Manmohan
    [J]. 2016 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2016), 2016,
  • [9] Fast Joint Object Detection and Viewpoint Estimation for Traffic Scene Understanding
    Guindel, Carlos
    Martin, David
    Maria Armingol, Jose
    [J]. IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE, 2018, 10 (04) : 74 - 86
  • [10] SmartMOT: Exploiting the fusion of HDMaps and Multi-Object Tracking for Real-Time scene understanding in Intelligent Vehicles applications
    Gomez-Huelamo, Carlos
    Bergasa, Luis M.
    Gutierrez, Rodrigo
    Felipe Arango, J.
    Diaz, Alejandro
    [J]. 2021 32ND IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2021, : 710 - 715