Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

Cited by: 1
Authors
Heravi, Negin [1 ,2 ]
Wahid, Ayzaan [3 ]
Lynch, Corey [3 ]
Florence, Pete [3 ]
Armstrong, Travis [3 ]
Tompson, Jonathan [3 ]
Sermanet, Pierre [3 ]
Bohg, Jeannette [2 ]
Dwibedi, Debidatta [3 ]
Affiliations
[1] Google, Mountain View, CA 94043 USA
[2] Stanford Univ, Stanford, CA 94305 USA
[3] Google, Robot, Mountain View, CA 94043 USA
DOI
10.1109/ICRA48891.2023.10160888
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Perceptual understanding of the scene and the relationships between its components is important for the successful completion of robotic tasks. Representation learning has been shown to be a powerful technique for this, but most current methods learn task-specific representations that do not necessarily transfer well to other tasks. Furthermore, representations learned by supervised methods require large, labeled datasets for each task, which are expensive to collect in the real world. Using self-supervised learning to obtain representations from unlabeled data can mitigate this problem. However, current self-supervised representation learning methods are mostly object-agnostic, and we demonstrate that the resulting representations are insufficient for general-purpose robotics tasks because they fail to capture the complexity of scenes with many components. In this paper, we show the effectiveness of using object-aware representation learning techniques for robotic tasks. Our self-supervised representations are learned by observing the agent freely interacting with different parts of the environment and are queried in two different settings: (i) policy learning and (ii) object location prediction. We show that our model learns control policies in a sample-efficient manner and outperforms state-of-the-art object-agnostic techniques as well as methods trained on raw RGB images. Our results show a 20% increase in performance in low-data regimes (1000 trajectories) in policy training using implicit behavioral cloning (IBC). Furthermore, our method outperforms the baselines on the task of object localization in multi-object scenes. Further qualitative results are available at https://sites.google.com/view/slots4robots.
Pages: 9515-9522
Page count: 8