Learning Generalized Reactive Policies Using Deep Neural Networks

Cited by: 0

Authors:

Groshev, Edward [1]

Goldstein, Maxwell [2]

Tamar, Aviv [1]

Srivastava, Siddharth [3,4]

Abbeel, Pieter [1]

Affiliations:

[1] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA

[2] Princeton Univ, Dept Math, Princeton, NJ 08544 USA

[3] Arizona State Univ, Sch Comp Informat & Decis Syst Engn, Tempe, AZ 85281 USA

[4] United Technol Res Ctr, E Hartford, CT 06108 USA

Source:

TWENTY-EIGHTH INTERNATIONAL CONFERENCE ON AUTOMATED PLANNING AND SCHEDULING (ICAPS 2018) | 2018

Keywords:

PDDL;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a new approach to learning for planning, where knowledge acquired while solving a given set of planning problems is used to plan faster in related, but new problem instances. We show that a deep neural network can be used to learn and represent a generalized reactive policy (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances. In contrast to prior efforts in this direction, our approach significantly reduces the dependence of learning on handcrafted domain knowledge or feature selection. Instead, the GRP is trained from scratch using a set of successful execution traces. We show that our approach can also be used to automatically learn a heuristic function that can be used in directed search algorithms. We evaluate our approach using an extensive suite of experiments on two challenging planning problem domains and show that our approach facilitates learning complex decision making policies and powerful heuristic functions with minimal human input. Videos of our results are available at goo.gl/Hpy4e3.
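The abstract describes a GRP as a mapping from a problem instance and a state to an action. A minimal sketch of how such a policy could be executed reactively is given below. It is not the authors' implementation: the names `run_reactive_policy`, `toy_policy`, `toy_step`, and `toy_is_goal` are hypothetical, the one-dimensional corridor domain is invented for illustration, and the hand-written rule stands in for a trained neural network.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration: a problem is a goal position,
# a state is the current position, an action is a string label.
Problem = int
State = int
Action = str


def run_reactive_policy(
    policy: Callable[[Problem, State], Action],
    step: Callable[[Problem, State, Action], State],
    is_goal: Callable[[Problem, State], bool],
    problem: Problem,
    start: State,
    max_steps: int = 100,
) -> Tuple[List[Action], bool]:
    """Roll out a reactive policy: at each step, query the policy with
    (problem, state), apply the returned action, and stop at the goal."""
    state = start
    trace: List[Action] = []
    for _ in range(max_steps):
        if is_goal(problem, state):
            return trace, True
        action = policy(problem, state)
        trace.append(action)
        state = step(problem, state, action)
    # Step budget exhausted; report whether the final state is a goal.
    return trace, is_goal(problem, state)


# Toy stand-in for a learned policy (assumption: a trained network
# would replace this hand-written rule).
def toy_policy(problem: Problem, state: State) -> Action:
    return "right" if state < problem else "left"


def toy_step(problem: Problem, state: State, action: Action) -> State:
    return state + 1 if action == "right" else state - 1


def toy_is_goal(problem: Problem, state: State) -> bool:
    return state == problem


trace, solved = run_reactive_policy(
    toy_policy, toy_step, toy_is_goal, problem=3, start=0
)
```

Because the policy takes the problem instance as an input, the same function can be reused on a different goal position without any change, which is the sense of "generalized" in this sketch.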


Pages: 408-416

Number of pages: 9

50 items in total

[1] Selecting and Composing Learning Rate Policies for Deep Neural Networks
Wu, Yanzhao
Liu, Ling
[J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (02)
[2] Learning Deep Architectures via Generalized Whitened Neural Networks
Luo, Ping
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[3] Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks
Perez-Dattari, Rodrigo
Celemin, Carlos
Ruiz-del-Solar, Javier
Kober, Jens
[J]. PROCEEDINGS OF THE 2018 INTERNATIONAL SYMPOSIUM ON EXPERIMENTAL ROBOTICS, 2020, 11: 353-363
[4] Using Deep Neural Networks to Simulate Heart Allocation Policies
Medved, D.
Nugues, P.
Ohlsson, M.
Hoglund, P.
Andersson, B.
Nilsson, J.
[J]. JOURNAL OF HEART AND LUNG TRANSPLANTATION, 2018, 37 (04): S171-S172
[5] Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks
Wu, Yanzhao
Liu, Ling
Bae, Juhyun
Chow, Ka-Ho
Iyengar, Arun
Pu, Calton
Wei, Wenqi
Yu, Lei
Zhang, Qi
[J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019: 1971-1980
[6] Learning Graph Dynamics using Deep Neural Networks
Narayan, Apurva
Roe, Peter H. O'N
[J]. IFAC PAPERSONLINE, 2018, 51 (02): 433-438
[7] Biosignals learning and synthesis using deep neural networks
Belo, David
Rodrigues, Joao
Vaz, Joao R.
Pezarat-Correia, Pedro
Gamboa, Hugo
[J]. BIOMEDICAL ENGINEERING ONLINE, 2017, 16
[8] Learning from LDA Using Deep Neural Networks
Zhang, Dongxu
Luo, Tianyi
Wang, Dong
[J]. NATURAL LANGUAGE UNDERSTANDING AND INTELLIGENT APPLICATIONS (NLPCC 2016), 2016, 10102: 657-664
[9] Biosignals learning and synthesis using deep neural networks
David Belo
João Rodrigues
João R. Vaz
Pedro Pezarat-Correia
Hugo Gamboa
[J]. BioMedical Engineering OnLine, 16
[10] Simulating the Outcome of Heart Allocation Policies Using Deep Neural Networks
Medved, Dennis
Nugues, Pierre
Nilsson, Johan
[J]. 2018 40TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2018: 6141-6144
