FINDING GEODESICS ON GRAPHS USING REINFORCEMENT LEARNING

Cited: 1
Authors
Kious, Daniel [1 ]
Mailler, Cecile [1 ]
Schapira, Bruno [2 ]
Affiliations
[1] Univ Bath, Dept Math Sci, Bath, Avon, England
[2] Aix Marseille Univ, CNRS, Marseille, France
Source
ANNALS OF APPLIED PROBABILITY | 2022, Vol. 32, No. 05
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Random walks on graphs; linear reinforcement; reinforcement learning; path formation; generalised Polya urns; RANDOM-WALK;
DOI
10.1214/21-AAP1777
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
It is well known in biology that ants are able to find shortest paths between their nest and the food by successive random explorations, without any means of communication other than the pheromones they leave behind them. This striking phenomenon has been observed experimentally and modelled by different mean-field reinforcement-learning models in the biology literature. In this paper, we introduce the first probabilistic reinforcement-learning model for this phenomenon. In this model, the ants explore a finite graph in which two nodes are distinguished as the nest and the source of food. The ants perform successive random walks on this graph, starting from the nest and stopping when they first reach the food; the transition probabilities of each random walk depend on the realizations of all previous walks through some dynamic weighting of the graph. We discuss different variants of this model based on different reinforcement rules and show that slight changes in the reinforcement rule can lead to drastically different outcomes. We prove that the ants indeed eventually find the shortest path(s) between their nest and the food in two variants of this model, when the underlying graph is, respectively, any series-parallel graph and a five-edge non-series-parallel losange graph. Both proofs rely on the electrical network method for random walks on weighted graphs and on Rubin's embedding in continuous time. The proof in the series-parallel case uses the recursive nature of this family of graphs, while the proof in the seemingly simpler losange case turns out to be quite intricate: it relies on a fine analysis of some stochastic approximation, and on various couplings with standard and generalised Pólya urns.
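The mechanism described in the abstract (successive random walks from nest to food, with each walk's transition probabilities driven by edge weights accumulated over all previous walks) can be sketched in a short simulation. Everything below is an illustrative assumption, not the paper's exact model: a toy series-parallel graph with a two-edge and a three-edge route, and the simplest linear rule in which each crossing of an edge adds 1 to its weight.

```python
import random

def random_walk(adj, w, nest, food, rng):
    """One walk from nest to food; from u, step to neighbour v with
    probability proportional to the current weight of edge {u, v}."""
    path, u = [], nest
    while u != food:
        nbrs = adj[u]
        v = rng.choices(nbrs, weights=[w[frozenset((u, n))] for n in nbrs])[0]
        path.append(frozenset((u, v)))
        u = v
    return path

def reinforce(adj, w, nest, food, n_walks, rng):
    """Run n_walks successive walks; each crossing of an edge adds 1 to
    its weight (a hypothetical linear-reinforcement rule for illustration)."""
    for _ in range(n_walks):
        for e in random_walk(adj, w, nest, food, rng):
            w[e] += 1.0
    return w

# Toy series-parallel graph: a 2-edge route nest-a-food in parallel
# with a 3-edge route nest-b-c-food; all weights start at 1.
adj = {"nest": ["a", "b"], "a": ["nest", "food"],
       "b": ["nest", "c"], "c": ["b", "food"], "food": ["a", "c"]}
edges = [("nest", "a"), ("a", "food"), ("nest", "b"), ("b", "c"), ("c", "food")]
w = {frozenset(e): 1.0 for e in edges}

rng = random.Random(0)
w = reinforce(adj, w, "nest", "food", 500, rng)
short = w[frozenset(("nest", "a"))] + w[frozenset(("a", "food"))]
long_ = sum(w[frozenset(e)] for e in [("nest", "b"), ("b", "c"), ("c", "food")])
print(f"weight on 2-edge route: {short:.0f}, on 3-edge route: {long_:.0f}")
```

In runs of this sketch the two-edge route typically ends up carrying most of the weight, in line with the shortest-path convergence the paper proves for series-parallel graphs; the paper also shows that seemingly minor changes to the reinforcement rule can change this outcome.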
Pages: 3889-3929
Number of pages: 41
Related papers
50 records in total
  • [21] Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs
    Kuyer, Lior
    Whiteson, Shimon
    Bakker, Bram
    Vlassis, Nikos
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS, 2008, 5211: 656+
  • [22] Reconfiguring Unbalanced Distribution Networks using Reinforcement Learning over Graphs
    Jacob, Roshni Anna
    Paul, Steve
    Li, Wenyuan
    Chowdhury, Souma
    Gel, Yulia R.
    Zhang, Jie
    2022 IEEE TEXAS POWER AND ENERGY CONFERENCE (TPEC), 2021, : 127 - 132
  • [23] A Method for Finding Multiple Subgoals for Reinforcement Learning
    Ogihara, Fuminori
    Murata, Junichi
    PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 16TH '11), 2011, : 804 - 807
  • [24] High-Speed Racing Reinforcement Learning Network: Learning the Environment Using Scene Graphs
    Shi, Jingjing
    Li, Ruiqin
    Yu, Daguo
    IEEE ACCESS, 2024, 12 : 116771 - 116785
  • [25] Cherrypick: Solving the Steiner Tree Problem in Graphs using Deep Reinforcement Learning
    Yan, Zong
    Du, Haizhou
    Zhang, Jiahao
    Li, Guoqing
    PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021, : 35 - 40
  • [26] Clique-based Cooperative Multiagent Reinforcement Learning Using Factor Graphs
    Zhen Zhang
    Dongbin Zhao
    IEEE/CAA Journal of Automatica Sinica, 2014, 1 (03) : 248 - 256
  • [28] Properties of geodesics in fuzzy graphs
    Bhutani, KR
    Mordeson, J
    Rosenfeld, A
    PROCEEDINGS OF THE 7TH JOINT CONFERENCE ON INFORMATION SCIENCES, 2003, : 206 - 209
  • [29] Finding the ground state of spin Hamiltonians with reinforcement learning
    Mills, Kyle
    Ronagh, Pooya
    Tamblyn, Isaac
    NATURE MACHINE INTELLIGENCE, 2020, 2 (09) : 509 - 517