Navigating to objects in the real world

Cited by: 18
Authors
Gervet, Theophile [1 ]
Chintala, Soumith [2 ]
Batra, Dhruv [2 ,3 ]
Malik, Jitendra [2 ,4 ]
Chaplot, Devendra Singh [2 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Meta AI Res, Menlo Pk, CA USA
[3] Georgia Inst Technol, Atlanta, GA USA
[4] Univ Calif Berkeley, Berkeley, CA USA
Keywords
SIM2REAL; VISION; ROBOTICS; SYSTEM
DOI
10.1126/scirobotics.adf6991
Chinese Library Classification
TP24 [Robotics]
Subject Classification Code
080202; 1405
Abstract
Semantic navigation is necessary to deploy mobile robots in uncontrolled environments such as homes or hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation, which builds a geometric map using depth sensors and plans to reach point goals. Broadly, end-to-end learning approaches reactively map sensor inputs to actions with deep neural networks, whereas modular learning approaches enrich the classical pipeline with learning-based semantic sensing and exploration. However, learned visual navigation policies have predominantly been evaluated in simulation, with little known about what works on a real robot. We present a large-scale empirical study of semantic visual navigation methods, comparing representative classical, modular, and end-to-end learning approaches across six homes with no prior experience, maps, or instrumentation. We found that modular learning works well in the real world, attaining a 90% success rate. In contrast, end-to-end learning does not, dropping from a 77% success rate in simulation to 23% in the real world because of a large image domain gap between simulation and reality. For practitioners, we show that modular learning is a reliable approach to navigating to objects: modularity and abstraction in policy design enable sim-to-real transfer. For researchers, we identify two key issues that prevent today's simulators from being reliable evaluation benchmarks: a large sim-to-real gap in images and a disconnect between simulated and real-world error modes. We propose concrete steps forward.
Pages: 14
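The modular pipeline the abstract contrasts with end-to-end learning (learned semantic sensing feeding a map, an exploration policy choosing goals, and a classical point-goal planner reaching them) can be sketched in toy form. All class and function names below are illustrative assumptions on a tiny grid world, not the paper's released code; the real system uses a depth camera, semantic segmentation, and a learned exploration policy.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """Toy stand-in for one sensor frame."""
    free_cells: list   # cells observed as traversable (stand-in for depth sensing)
    detections: list   # (cell, label) pairs (stand-in for semantic segmentation)

class ModularAgent:
    """Toy modular policy: update map -> select goal -> take one classical step."""

    def __init__(self, grid_size):
        self.grid_size = grid_size
        self.explored = set()   # cells seen so far (geometric map)
        self.objects = {}       # label -> cell (semantic map)
        self.pos = (0, 0)

    def update_map(self, obs):
        # "Semantic sensing": fold the frame into geometric and semantic maps.
        for cell, label in obs.detections:
            self.objects[label] = cell
        self.explored.update(obs.free_cells)

    def select_goal(self, target_label):
        # If the target object has been seen, navigate to it; otherwise pick
        # the nearest unexplored cell (a crude frontier-exploration stand-in
        # for the paper's learned exploration policy).
        if target_label in self.objects:
            return self.objects[target_label]
        frontier = [(x, y)
                    for x in range(self.grid_size)
                    for y in range(self.grid_size)
                    if (x, y) not in self.explored]
        return min(frontier,
                   key=lambda c: abs(c[0] - self.pos[0]) + abs(c[1] - self.pos[1]))

    def step_toward(self, goal):
        # Classical point-goal control: move one cell toward the goal,
        # first closing the x gap, then the y gap.
        x, y = self.pos
        gx, gy = goal
        if x != gx:
            x += 1 if gx > x else -1
        elif y != gy:
            y += 1 if gy > y else -1
        self.pos = (x, y)
        return self.pos
```

The abstraction boundary is the point: the planner consumes only the map, never raw pixels, which is the property the paper credits for sim-to-real transfer.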