SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite

Authors
Song, Shuran [1]
Lichtenberg, Samuel P. [1]
Xiao, Jianxiong [1]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although RGB-D sensors have enabled major breakthroughs for several vision tasks, such as 3D reconstruction, we have not attained the same level of success in high-level scene understanding. Perhaps one of the main reasons is the lack of a large-scale benchmark with 3D annotations and 3D evaluation metrics. In this paper, we introduce an RGB-D benchmark suite with the goal of advancing the state of the art in all major scene understanding tasks. Our dataset is captured by four different sensors and contains 10,335 RGB-D images, a scale similar to that of PASCAL VOC. The whole dataset is densely annotated and includes 146,617 2D polygons and 64,595 3D bounding boxes with accurate object orientations, as well as a 3D room layout and scene category for each image. This dataset enables us to train data-hungry algorithms for scene-understanding tasks, evaluate them using meaningful 3D metrics, avoid overfitting to a small testing set, and study cross-sensor bias.
Pages: 567-576
Number of pages: 10