Visual Affordance Prediction for Guiding Robot Exploration

Cited by: 2
Authors
Bharadhwaj, Homanga [1 ]
Gupta, Abhinav [1 ]
Tulsiani, Shubham [1 ]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Source
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2023
DOI
10.1109/ICRA48891.2023.10161288
CLC classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning 'visual affordances'. Given an input image of a scene, we infer a distribution over plausible future states that can be achieved via interactions with it. To allow predicting diverse plausible futures, we discretize the space of continuous images with a VQ-VAE and use a Transformer-based model to learn a conditional distribution in the latent embedding space. We show that these models can be trained using large-scale and diverse passive data, and that the learned models exhibit compositional generalization to diverse objects beyond the training distribution. We evaluate the quality and diversity of the generations, and demonstrate how the trained affordance model can be used for guiding exploration during visual goal-conditioned policy learning in robotic manipulation.
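The recipe outlined in the abstract (discretize images into codes with a VQ-VAE, then fit a Transformer to the conditional distribution over achievable future states given the current image) can be illustrated with a small sketch. The PyTorch code below is a hypothetical rendering under assumed shapes and hyperparameters (64x64 images, an 8x8 token grid, a 512-entry codebook), not the authors' implementation: a toy tokenizer maps each image to discrete codebook indices, and a causal Transformer is trained to predict goal-image tokens from current-image tokens; sampling from its output distribution would yield the diverse plausible futures the abstract describes.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VQTokenizer(nn.Module):
    """Toy VQ-VAE-style encoder: image -> grid of discrete codebook indices."""
    def __init__(self, codebook_size=512, dim=64):
        super().__init__()
        self.enc = nn.Sequential(                      # 64x64 image -> 8x8 feature grid
            nn.Conv2d(3, dim, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(dim, dim, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(dim, dim, 4, 2, 1),
        )
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, img):                            # img: (B, 3, 64, 64)
        z = self.enc(img).flatten(2).transpose(1, 2)   # (B, 64, dim)
        # nearest-neighbour quantization against the codebook
        dists = (z.unsqueeze(2) - self.codebook.weight).pow(2).sum(-1)
        return dists.argmin(-1)                        # (B, 64) discrete token ids

class AffordanceTransformer(nn.Module):
    """Causal Transformer modeling p(goal tokens | current-image tokens)."""
    def __init__(self, codebook_size=512, dim=128, n_tokens=64):
        super().__init__()
        self.n_tokens = n_tokens
        self.tok_emb = nn.Embedding(codebook_size, dim)
        self.pos_emb = nn.Parameter(torch.zeros(1, 2 * n_tokens, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, codebook_size)

    def forward(self, ctx_tokens, tgt_tokens):
        # [context ; target] sequence with a causal mask: each target position
        # attends to the context prefix plus earlier target tokens only
        seq = torch.cat([ctx_tokens, tgt_tokens], dim=1)
        x = self.tok_emb(seq) + self.pos_emb[:, : seq.size(1)]
        mask = nn.Transformer.generate_square_subsequent_mask(seq.size(1))
        h = self.blocks(x, mask=mask)
        # keep only the positions that predict target tokens 0..n_tokens-1
        return self.head(h[:, self.n_tokens - 1 : -1])  # (B, n_tokens, codebook_size)

# Toy usage: one maximum-likelihood step on random "current image" / "achieved goal" pairs.
tokenizer, model = VQTokenizer(), AffordanceTransformer()
current, goal = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
with torch.no_grad():                                  # tokenizer assumed pre-trained here
    ctx, tgt = tokenizer(current), tokenizer(goal)
loss = F.cross_entropy(model(ctx, tgt).reshape(-1, 512), tgt.reshape(-1))
loss.backward()  # sampling from the trained model yields diverse candidate goal images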
Pages: 3029 - 3036
Number of pages: 8
Related Papers
50 records in total
  • [1] Learning Visual Object Categories for Robot Affordance Prediction
    Sun, Jie
    Moore, Joshua L.
    Bobick, Aaron
    Rehg, James M.
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2010, 29 (2-3) : 174 - 197
  • [2] Active Affordance Exploration for Robot Grasping
    Liu, Huaping
    Yuan, Yuan
    Deng, Yuhong
    Guo, Xiaofeng
    Wei, Yixuan
    Lu, Kai
    Fang, Bin
    Guo, Di
    Sun, Fuchun
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT V, 2019, 11744 : 426 - 438
  • [3] Robot's affordance prediction based on the subtask
    School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
    Huazhong Ligong Daxue Xuebao : 412 - 415 and 419
  • [4] GUIDING A ROBOT BY VISUAL FEEDBACK IN ASSEMBLING TASKS
    SHIRAI, Y
    INOUE, H
    PATTERN RECOGNITION, 1973, 5 (02) : 99 - &
  • [5] An affordance field for guiding movement and cognition
    Glenberg, AM
    Cowart, MR
    Kaschak, MP
    BEHAVIORAL AND BRAIN SCIENCES, 2001, 24 (01) : 43 - +
  • [6] Towards affordance detection for robot manipulation using affordance for parts and parts for affordance
    Lakani, Safoura Rezapour
    Rodriguez-Sanchez, Antonio J.
    Piater, Justus
    AUTONOMOUS ROBOTS, 2019, 43 (05) : 1155 - 1172
  • [7] Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching
    Naoki Wake
    Daichi Saito
    Kazuhiro Sasabuchi
    Hideki Koike
    Katsushi Ikeuchi
    Machine Vision and Applications, 2023, 34
  • [8] Towards affordance detection for robot manipulation using affordance for parts and parts for affordance
    Safoura Rezapour Lakani
    Antonio J. Rodríguez-Sánchez
    Justus Piater
    Autonomous Robots, 2019, 43 : 1155 - 1172
  • [9] Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching
    Wake, Naoki
    Saito, Daichi
    Sasabuchi, Kazuhiro
    Koike, Hideki
    Ikeuchi, Katsushi
    MACHINE VISION AND APPLICATIONS, 2023, 34 (04)
  • [10] Affordance-based modeling of a human-robot cooperative system for area exploration
    Jeongsik Kim
    Jungmok Ma
    Namhun Kim
    Journal of Mechanical Science and Technology, 2020, 34 : 877 - 887