FINDING GEODESICS ON GRAPHS USING REINFORCEMENT LEARNING

Cited: 1
Authors
Kious, Daniel [1 ]
Mailler, Cecile [1 ]
Schapira, Bruno [2 ]
Affiliations
[1] Univ Bath, Dept Math Sci, Bath, Avon, England
[2] Aix Marseille Univ, CNRS, Marseille, France
Source
ANNALS OF APPLIED PROBABILITY | 2022, Vol. 32, No. 05
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Random walks on graphs; linear reinforcement; reinforcement learning; path formation; generalised Polya urns; RANDOM-WALK;
DOI
10.1214/21-AAP1777
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
It is well known in biology that ants are able to find shortest paths between their nest and the food by successive random explorations, without any means of communication other than the pheromones they leave behind them. This striking phenomenon has been observed experimentally and modelled by different mean-field reinforcement-learning models in the biology literature. In this paper, we introduce the first probabilistic reinforcement-learning model for this phenomenon. In this model, the ants explore a finite graph in which two nodes are distinguished as the nest and the source of food. The ants perform successive random walks on this graph, starting from the nest and stopping when they first reach the food; the transition probabilities of each random walk depend on the realizations of all previous walks through some dynamic weighting of the graph. We discuss different variants of this model based on different reinforcement rules and show that slight changes in the reinforcement rule can lead to drastically different outcomes. We prove that the ants indeed eventually find the shortest path(s) between their nest and the food in two variants of this model, when the underlying graph is, respectively, any series-parallel graph and a five-edge non-series-parallel losange graph. Both proofs rely on the electrical network method for random walks on weighted graphs and on Rubin's embedding in continuous time. The proof in the series-parallel case uses the recursive nature of this family of graphs, while the proof in the seemingly simpler losange case turns out to be quite intricate: it relies on a fine analysis of some stochastic approximation, and on various couplings with standard and generalised Pólya urns.
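The mechanism described in the abstract (successive random walks from nest to food, with each walk's transition probabilities driven by edge weights accumulated over all previous walks) can be sketched in a short simulation. Everything below is an illustrative assumption, not the paper's exact model: a toy series-parallel graph with a two-edge and a three-edge route, and the simplest linear rule in which each crossing of an edge adds 1 to its weight.

```python
import random

def random_walk(adj, w, nest, food, rng):
    """One walk from nest to food; from u, step to neighbour v with
    probability proportional to the current weight of edge {u, v}."""
    path, u = [], nest
    while u != food:
        nbrs = adj[u]
        v = rng.choices(nbrs, weights=[w[frozenset((u, n))] for n in nbrs])[0]
        path.append(frozenset((u, v)))
        u = v
    return path

def reinforce(adj, w, nest, food, n_walks, rng):
    """Run n_walks successive walks; each crossing of an edge adds 1 to
    its weight (a hypothetical linear-reinforcement rule for illustration)."""
    for _ in range(n_walks):
        for e in random_walk(adj, w, nest, food, rng):
            w[e] += 1.0
    return w

# Toy series-parallel graph: a 2-edge route nest-a-food in parallel
# with a 3-edge route nest-b-c-food; all weights start at 1.
adj = {"nest": ["a", "b"], "a": ["nest", "food"],
       "b": ["nest", "c"], "c": ["b", "food"], "food": ["a", "c"]}
edges = [("nest", "a"), ("a", "food"), ("nest", "b"), ("b", "c"), ("c", "food")]
w = {frozenset(e): 1.0 for e in edges}

rng = random.Random(0)
w = reinforce(adj, w, "nest", "food", 500, rng)
short = w[frozenset(("nest", "a"))] + w[frozenset(("a", "food"))]
long_ = sum(w[frozenset(e)] for e in [("nest", "b"), ("b", "c"), ("c", "food")])
print(f"weight on 2-edge route: {short:.0f}, on 3-edge route: {long_:.0f}")
```

In runs of this sketch the two-edge route typically ends up carrying most of the weight, in line with the shortest-path convergence the paper proves for series-parallel graphs; the paper also shows that seemingly minor changes to the reinforcement rule can change this outcome.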
Pages: 3889-3929
Number of pages: 41
Related papers
50 records in total
  • [21] Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs
    Kuyer, Lior
    Whiteson, Shimon
    Bakker, Bram
    Vlassis, Nikos
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS, 2008, 5211: 656+
  • [22] Reconfiguring Unbalanced Distribution Networks using Reinforcement Learning over Graphs
    Jacob, Roshni Anna
    Paul, Steve
    Li, Wenyuan
    Chowdhury, Souma
    Gel, Yulia R.
    Zhang, Jie
    2022 IEEE TEXAS POWER AND ENERGY CONFERENCE (TPEC), 2021, : 127 - 132
  • [23] A Method for Finding Multiple Subgoals for Reinforcement Learning
    Ogihara, Fuminori
    Murata, Junichi
    PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 16TH '11), 2011, : 804 - 807
  • [24] High-Speed Racing Reinforcement Learning Network: Learning the Environment Using Scene Graphs
    Shi, Jingjing
    Li, Ruiqin
    Yu, Daguo
    IEEE ACCESS, 2024, 12 : 116771 - 116785
  • [25] Cherrypick: Solving the Steiner Tree Problem in Graphs using Deep Reinforcement Learning
    Yan, Zong
    Du, Haizhou
    Zhang, Jiahao
    Li, Guoqing
    PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021, : 35 - 40
  • [26] Clique-based Cooperative Multiagent Reinforcement Learning Using Factor Graphs
    Zhen Zhang
    Dongbin Zhao
    IEEE/CAA Journal of Automatica Sinica, 2014, 1 (03) : 248 - 256
  • [28] Properties of geodesics in fuzzy graphs
    Bhutani, KR
    Mordeson, J
    Rosenfeld, A
    PROCEEDINGS OF THE 7TH JOINT CONFERENCE ON INFORMATION SCIENCES, 2003, : 206 - 209
  • [29] Finding the ground state of spin Hamiltonians with reinforcement learning
    Mills, Kyle
    Ronagh, Pooya
    Tamblyn, Isaac
    NATURE MACHINE INTELLIGENCE, 2020, 2 (09) : 509 - 517