Neural Dynamic Policies for End-to-End Sensorimotor Learning

Cited by: 0
Authors
Bahl, Shikhar [1 ]
Mukadam, Mustafa [2 ]
Gupta, Abhinav [1 ]
Pathak, Deepak [1 ]
Affiliations
[1] CMU, Pittsburgh, PA 15213 USA
[2] FAIR, Seattle, WA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make a decision at each point in training and hence limits scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has long exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations. These techniques, however, lack the flexibility and generalizability provided by deep learning or deep reinforcement learning and have remained under-explored in such settings. In this work, we begin to close this gap and embed dynamics structure into deep neural network-based policies by reparameterizing action spaces with differential equations. We propose Neural Dynamic Policies (NDPs), which make predictions in trajectory distribution space, as opposed to prior policy learning methods in which actions represent the raw control space. The embedded structure allows us to perform end-to-end policy learning under both reinforcement and imitation learning setups. We show that NDPs achieve better or comparable performance to state-of-the-art approaches on many robotic control tasks using both reward-based training and demonstrations. Project video and code are available at: https://shikharbahl.github.io/neural-dynamic-policies/.
Pages: 12
Related papers
50 in total
  • [21] STATISTICAL LEARNING FOR END-TO-END SIMULATIONS
    Vicent, J.
    Verrelst, J.
    Rivera-Caicedo, J. P.
    Sabater, N.
    Munoz-Mari, J.
    Camps-Valls, G.
    Moreno, J.
    IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 1699 - 1702
  • [22] END-TO-END LEARNING FOR MUSIC AUDIO
    Dieleman, Sander
    Schrauwen, Benjamin
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [23] Managing end-to-end lifecycle of global service policies
    Rosu, D
    Dan, A
    SERVICE-ORIENTED COMPUTING - ICSOC 2005, PROCEEDINGS, 2005, 3826 : 570 - 575
  • [24] Neural End-to-End Self-learning of Visuomotor Skills by Environment Interaction
    Kerzel, Matthias
    Wermter, Stefan
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2017, PT I, 2017, 10613 : 27 - 34
  • [25] End-to-end Learning of Semantic Role Labeling Using Recurrent Neural Networks
    Zhou, Jie
    Xu, Wei
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1, 2015, : 1127 - 1137
  • [26] Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks
    Junho Jo
    Hyung Il Koo
    Jae Woong Soh
    Nam Ik Cho
    Multimedia Tools and Applications, 2020, 79 : 32137 - 32150
  • [27] Learning a Deep Neural Net Policy for End-to-End Control of Autonomous Vehicles
    Rausch, Viktor
    Hansen, Andreas
    Solowjow, Eugen
    Liu, Chang
    Kreuzer, Edwin
    Hedrick, J. Karl
    2017 AMERICAN CONTROL CONFERENCE (ACC), 2017, : 4914 - 4919
  • [28] The Predictron: End-To-End Learning and Planning
    Silver, David
    van Hasselt, Hado
    Hessel, Matteo
    Schaul, Tom
    Guez, Arthur
    Harley, Tim
    Dulac-Arnold, Gabriel
    Reichert, David
    Rabinowitz, Neil
    Barreto, Andre
    Degris, Thomas
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [29] End-to-end Learning for Graph Decomposition
    Song, Jie
    Andres, Bjoern
    Black, Michael J.
    Hilliges, Otmar
    Tang, Siyu
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 10092 - 10101
  • [30] Amharic OCR: An End-to-End Learning
    Belay, Birhanu
    Habtegebrial, Tewodros
    Meshesha, Million
    Liwicki, Marcus
    Belay, Gebeyehu
    Stricker, Didier
    APPLIED SCIENCES-BASEL, 2020, 10 (03):