Vision-Based Efficient Robotic Manipulation with a Dual-Streaming Compact Convolutional Transformer

Cited by: 2
Authors
Guo, Hao [1]
Song, Meichao [1]
Ding, Zhen [1]
Yi, Chunzhi [2]
Jiang, Feng [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Harbin Inst Technol, Sch Med & Hlth, Harbin 150001, Peoples R China
Keywords
bio-inspired design and control of robots; robotics; reinforcement learning; vision transformer; LEVEL;
DOI
10.3390/s23010515
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302; 081704;
Abstract
Learning from visual observation for efficient robotic manipulation remains a significant challenge in Reinforcement Learning (RL). Although combining RL policies with a convolutional neural network (CNN) visual encoder achieves high efficiency and a high success rate, the method's general performance across multiple tasks is still limited by the efficacy of the encoder. Meanwhile, the increasing cost of optimizing the encoder for general performance could erode the efficiency advantage of the original policy. Building on the attention mechanism, we design a robotic manipulation method that significantly improves the policy's general performance across multiple tasks by using a lightweight Transformer-based visual encoder, unsupervised learning, and data augmentation. The encoder of our method can match the performance of the original Transformer with much less data, which keeps training efficient and strengthens general multi-task performance. Furthermore, we experimentally demonstrate that the master view outperforms the alternative third-person views in general robotic manipulation tasks when the third-person and egocentric views are combined to capture global and local visual information. In extensive experiments on tasks from the OpenAI Gym Fetch environment, especially the Push task, our method achieves a 92% success rate, compared with 65% for the baselines, 78% for the CNN encoder, and 81% for the ViT encoder, and it does so with fewer training steps.
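The idea of combining a third-person view (global information) with an egocentric view (local information) can be illustrated with a minimal sketch. This is not the authors' encoder: it uses plain non-overlapping patches where a compact convolutional Transformer would use a convolutional tokenizer, and the function names `patchify` and `dual_stream_tokens`, the stream-tag scheme, and the toy 4x4 images are all assumptions made for illustration.

```python
def patchify(image, patch):
    """Split a 2D grayscale image (list of rows) into flattened,
    non-overlapping patch x patch tokens, in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


def dual_stream_tokens(third_person, egocentric, patch):
    """Tokenize each view separately, prepend a stream tag
    (0.0 = third-person, 1.0 = egocentric; a hypothetical scheme),
    and concatenate into one sequence for an attention-based encoder."""
    global_tokens = [[0.0] + t for t in patchify(third_person, patch)]
    local_tokens = [[1.0] + t for t in patchify(egocentric, patch)]
    return global_tokens + local_tokens


# Toy 4x4 "images" standing in for the two camera views.
third = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
ego = [[0.5] * 4 for _ in range(4)]
tokens = dual_stream_tokens(third, ego, patch=2)
print(len(tokens), len(tokens[0]))
```

Concatenating the two token sequences lets every attention layer relate global scene context to local gripper-level detail; the stream tag is one simple way to let the model distinguish the two sources.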
Pages: 17