Neural Dynamic Policies for End-to-End Sensorimotor Learning

Cited by: 0

Authors:

Bahl, Shikhar [1]

Mukadam, Mustafa [2]

Gupta, Abhinav [1]

Pathak, Deepak [1]

Affiliations:

[1] CMU, Pittsburgh, PA 15213 USA

[2] FAIR, Seattle, WA USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Vol. 33

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make a decision at each point in training and hence limits scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has long exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations. These techniques, however, lack the flexibility and generalizability provided by deep learning or deep reinforcement learning and have remained under-explored in such settings. In this work, we begin to close this gap and embed dynamics structure into deep neural network-based policies by reparameterizing action spaces with differential equations. We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space, as opposed to prior policy learning methods where the action represents the raw control space. The embedded structure allows us to perform end-to-end policy learning under both reinforcement and imitation learning setups. We show that NDPs achieve better or comparable performance to state-of-the-art approaches on many robotic control tasks using both reward-based training and demonstrations. Project video and code are available at: https://shikharbahl.github.io/neural-dynamic-policies/.


Pages: 12

