Learning Generalized Reactive Policies Using Deep Neural Networks

Times cited: 0
Authors
Groshev, Edward [1]
Goldstein, Maxwell [2]
Tamar, Aviv [1]
Srivastava, Siddharth [3, 4]
Abbeel, Pieter [1]
Affiliations
[1] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA
[2] Princeton Univ, Dept Math, Princeton, NJ 08544 USA
[3] Arizona State Univ, Sch Comp Informat & Decis Syst Engn, Tempe, AZ 85281 USA
[4] United Technol Res Ctr, E Hartford, CT 06108 USA
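Keywords
PDDL
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract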
We present a new approach to learning for planning, where knowledge acquired while solving a given set of planning problems is used to plan faster in related but new problem instances. We show that a deep neural network can be used to learn and represent a generalized reactive policy (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances. In contrast to prior efforts in this direction, our approach significantly reduces the dependence of learning on handcrafted domain knowledge or feature selection. Instead, the GRP is trained from scratch using a set of successful execution traces. We show that our approach can also be used to automatically learn a heuristic function for use in directed search algorithms. We evaluate our approach using an extensive suite of experiments on two challenging planning problem domains and show that it facilitates learning complex decision-making policies and powerful heuristic functions with minimal human input. Videos of our results are available at goo.gl/Hpy4e3.
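The two uses of a learned model described in the abstract (acting reactively, and guiding a directed search) can be sketched as follows. This is an illustration only, not the authors' implementation: the corridor domain, `toy_policy`, `toy_heuristic`, and every function name below are hypothetical stand-ins, and the hand-written policy and heuristic take the place of what would be a trained neural network.

```python
import heapq

# Hypothetical toy domain: a 1-D corridor.
# instance = (length, goal_position); state = current position.

def step(instance, state, action):
    """Apply an action and clamp the position to the corridor."""
    length, _ = instance
    nxt = state + (1 if action == "right" else -1)
    return min(max(nxt, 0), length - 1)

def is_goal(instance, state):
    return state == instance[1]

def successors(instance, state):
    return [(a, step(instance, state, a)) for a in ("left", "right")]

def toy_policy(instance, state):
    """Stand-in for a learned GRP: maps (instance, state) to an action."""
    return "right" if state < instance[1] else "left"

def toy_heuristic(instance, state):
    """Stand-in for a learned heuristic: estimated steps to the goal."""
    return abs(instance[1] - state)

def rollout(policy, instance, start, max_steps=100):
    """Execute a reactive policy: one action per step, no search."""
    state, plan = start, []
    for _ in range(max_steps):
        if is_goal(instance, state):
            return plan
        action = policy(instance, state)
        plan.append(action)
        state = step(instance, state, action)
    return plan if is_goal(instance, state) else None

def greedy_best_first(heuristic, instance, start):
    """Directed search: expand the state with the lowest heuristic value."""
    frontier = [(heuristic(instance, start), 0, start, [])]
    seen, counter = {start}, 1  # counter breaks ties between equal values
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(instance, state):
            return plan
        for action, nxt in successors(instance, state):
            if nxt not in seen:
                seen.add(nxt)
                entry = (heuristic(instance, nxt), counter, nxt, plan + [action])
                heapq.heappush(frontier, entry)
                counter += 1
    return None
```

Because the policy takes the problem instance as an input, the same function can be called on instances of different sizes, for example `rollout(toy_policy, (10, 7), 2)` and `rollout(toy_policy, (20, 0), 3)`.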
Pages: 408-416
Page count: 9