FINDING GEODESICS ON GRAPHS USING REINFORCEMENT LEARNING

Cited by: 1

Authors:

Kious, Daniel [1]

Mailler, Cecile [1]

Schapira, Bruno [2]

Affiliations:

[1] Univ Bath, Dept Math Sci, Bath, Avon, England

[2] Aix Marseille Univ, CNRS, Marseille, France

Source:

ANNALS OF APPLIED PROBABILITY | 2022 / Vol. 32 / No. 05

Funding:

Engineering and Physical Sciences Research Council (UK);

Keywords:

Random walks on graphs; linear reinforcement; reinforcement learning; path formation; generalised Polya urns; RANDOM-WALK;

DOI:

10.1214/21-AAP1777

Chinese Library Classification (CLC):

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Subject classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

It is well known in biology that ants are able to find shortest paths between their nest and the food by successive random explorations, without any means of communication other than the pheromones they leave behind them. This striking phenomenon has been observed experimentally and modelled by different mean-field reinforcement-learning models in the biology literature. In this paper, we introduce the first probabilistic reinforcement-learning model for this phenomenon. In this model, the ants explore a finite graph in which two nodes are distinguished as the nest and the source of food. The ants perform successive random walks on this graph, starting from the nest and stopping when they first reach the food; the transition probabilities of each random walk depend on the realizations of all previous walks through some dynamic weighting of the graph. We discuss different variants of this model based on different reinforcement rules and show that slight changes in this reinforcement rule can lead to drastically different outcomes. We prove that the ants indeed eventually find the shortest path(s) between their nest and the food in two variants of this model, when the underlying graph is, respectively, any series-parallel graph and a five-edge non-series-parallel losange graph. Both proofs rely on the electrical network method for random walks on weighted graphs and on Rubin's embedding in continuous time. The proof in the series-parallel case uses the recursive nature of this family of graphs, while the proof in the seemingly simpler losange case turns out to be quite intricate: it relies on a fine analysis of some stochastic approximation, and on various couplings with standard and generalised Polya urns.
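The dynamics described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' exact model: the abstract does not specify the reinforcement rule, so the code assumes that every edge starts with weight 1 and that each ant adds 1 to every edge of the loop-erased version of its walk. The function names (`weighted_walk`, `loop_erase`, `simulate`) and the five-edge example graph (a diamond with a chord) are hypothetical and chosen for illustration.

```python
import random


def weighted_walk(adj, weights, nest, food, rng):
    """One ant: random walk from nest, stopped when it first reaches food.

    Each step picks a neighbour with probability proportional to the
    current weight of the connecting edge.
    """
    path = [nest]
    while path[-1] != food:
        here = path[-1]
        nbrs = adj[here]
        w = [weights[frozenset((here, v))] for v in nbrs]
        path.append(rng.choices(nbrs, weights=w, k=1)[0])
    return path


def loop_erase(path):
    """Erase loops in chronological order, leaving a simple path."""
    out, pos = [], {}
    for v in path:
        if v in pos:
            # v was visited before: cut the loop that just closed.
            cut = pos[v]
            for u in out[cut + 1:]:
                del pos[u]
            out = out[:cut + 1]
        else:
            pos[v] = len(out)
            out.append(v)
    return out


def simulate(edges, nest, food, n_ants, seed=0):
    """Run n_ants successive walks and return the final edge weights.

    ASSUMPTION: initial weight 1 per edge, +1 on each edge of the
    loop-erased walk. The source only says that several rules exist.
    """
    rng = random.Random(seed)
    adj, weights = {}, {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
        weights[frozenset((u, v))] = 1.0
    for _ in range(n_ants):
        trace = loop_erase(weighted_walk(adj, weights, nest, food, rng))
        for u, v in zip(trace, trace[1:]):
            weights[frozenset((u, v))] += 1.0
    return weights


# Hypothetical five-edge example: two length-2 routes plus a chord a-b.
EDGES = [("nest", "a"), ("nest", "b"), ("a", "b"), ("a", "food"), ("b", "food")]

if __name__ == "__main__":
    final = simulate(EDGES, "nest", "food", n_ants=200)
    for edge, w in sorted(final.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(edge), w)
```

Printing the final weights for increasing `n_ants` gives an informal feel for how reinforcement concentrates on some edges. A finite simulation of this kind proves nothing about the limit, and a different reinforcement rule may behave very differently, as the abstract itself stresses.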

Pages: 3889-3929

Page count: 41
