A learning search algorithm with propagational reinforcement learning

Cited by: 0
Authors
Zhang, Wei [1]
Affiliation
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang, Peoples R China
Keywords
Machine learning; Heuristic search; Reinforcement learning; Deep neural network; Deep learning; Learning search
DOI
10.1007/s10489-020-02117-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When reinforcement learning with a deep neural network is applied to heuristic search, the search becomes a learning search. A learning search system has two key components: (1) a deep neural network with sufficient expressive power, used as a heuristic function approximator that estimates the distance from any state to a goal; and (2) a strategy that guides the agent's interaction with its environment so that it gathers more efficient simulated experience for updating the Q-value or V-value function of reinforcement learning. To date, neither component has been sufficiently discussed. For the first component, this study theoretically discusses the size of a deep neural network for approximating a product function of p piecewise multivariate linear functions. The existence of such a deep neural network with O(n + p) layers and O(dn + dnp + dp) neurons is proven, where d is the number of variables of the multivariate function being approximated, ε is the approximation error, and n = O(p + log_2(pd/ε)). For the second component, this study proposes a general propagational reinforcement-learning-based learning search method that improves the estimate h(.) according to newly observed distance information about the goals, propagates the improvement bidirectionally in the search tree, and consequently obtains a sequence of more accurate V-values for a sequence of states. Experiments on maze problems show that the method increases the convergence rate of reinforcement learning by a factor of 2.06 and reduces the number of learning episodes to one quarter of that required by nonpropagating methods.
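Illustrative note (not part of the published abstract): the idea of propagating newly observed goal-distance information along a sequence of states can be sketched as follows. This is a minimal, generic sketch and not the paper's algorithm. It assumes unit step costs, treats each stored value as an upper bound on the distance to the goal, and sweeps only backward from the goal along a single path, whereas the abstract describes bidirectional propagation in a search tree. The function name `backward_sweep` and the example maze coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]  # hypothetical maze cell (row, column)


def backward_sweep(
    values: Dict[State, float],
    path: List[State],
    step_cost: float = 1.0,
) -> Dict[State, float]:
    """Tighten distance estimates along a path whose last state is a goal.

    Assumption: each value is an upper bound on the distance to the goal,
    so an estimate is only ever lowered, never raised.
    """
    updated = dict(values)      # leave the caller's table untouched
    updated[path[-1]] = 0.0     # the goal is at distance zero from itself
    # Walk from the state just before the goal back to the start of the path.
    for i in range(len(path) - 2, -1, -1):
        candidate = updated[path[i + 1]] + step_cost
        current = updated.get(path[i], float("inf"))
        updated[path[i]] = min(current, candidate)
    return updated


# One observed episode that ends at the goal cell (2, 1).
path = [(0, 0), (0, 1), (1, 1), (2, 1)]
initial = {(0, 0): 10.0, (0, 1): 9.0, (1, 1): 7.0}
improved = backward_sweep(initial, path)
print(improved)
```

In this sketch a single observation of the goal corrects the estimate of every state on the path in one pass, instead of one state per episode, which is the kind of saving in learning episodes the abstract reports for propagating methods.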
Pages: 7990-8009
Page count: 20
Published in: Applied Intelligence, 2021, 51: 7990-8009