FINDING GEODESICS ON GRAPHS USING REINFORCEMENT LEARNING

Cited by: 1
Authors
Kious, Daniel [1]
Mailler, Cecile [1]
Schapira, Bruno [2]
Affiliations
[1] Univ Bath, Dept Math Sci, Bath, Avon, England
[2] Aix Marseille Univ, CNRS, Marseille, France
Source
ANNALS OF APPLIED PROBABILITY | 2022 / Vol. 32 / No. 5
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Random walks on graphs; linear reinforcement; reinforcement learning; path formation; generalised Polya urns; RANDOM-WALK;
DOI
10.1214/21-AAP1777
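Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code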
020208 ; 070103 ; 0714 ;
Abstract
It is well known in biology that ants are able to find shortest paths between their nest and the food by successive random explorations, without any means of communication other than the pheromones they leave behind them. This striking phenomenon has been observed experimentally and modelled by different mean-field reinforcement-learning models in the biology literature. In this paper, we introduce the first probabilistic reinforcement-learning model for this phenomenon. In this model, the ants explore a finite graph in which two nodes are distinguished as the nest and the source of food. The ants perform successive random walks on this graph, starting from the nest and stopping when they first reach the food; the transition probabilities of each random walk depend on the realizations of all previous walks through some dynamic weighting of the graph. We discuss different variants of this model based on different reinforcement rules and show that slight changes in this reinforcement rule can lead to drastically different outcomes. We prove that the ants indeed eventually find the shortest path(s) between their nest and the food in two variants of this model and when the underlying graph is, respectively, any series-parallel graph and a five-edge non-series-parallel losange graph. Both proofs rely on the electrical network method for random walks on weighted graphs and on Rubin's embedding in continuous time. The proof in the series-parallel cases uses the recursive nature of this family of graphs, while the proof in the seemingly simpler losange case turns out to be quite intricate: it relies on a fine analysis of some stochastic approximation, and on various couplings with standard and generalised Polya urns.
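The process described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code, and the abstract does not spell out the reinforcement rules, so the rule used here is an assumption: every edge starts with weight 1, each ant walks from the nest to the food with step probabilities proportional to the current edge weights, and the edges of the loop-erased trajectory then gain weight 1. The function names (`ant_walk`, `loop_erase`, `simulate`) and the three-node example graph are hypothetical, and the sketch handles simple graphs only (no parallel edges).

```python
import random


def ant_walk(weights, adj, nest, food, rng):
    """Walk from nest until food is first reached.

    Each step picks a neighbour with probability proportional
    to the current weight of the connecting edge.
    """
    path = [nest]
    node = nest
    while node != food:
        nbrs = adj[node]
        w = [weights[frozenset((node, v))] for v in nbrs]
        node = rng.choices(nbrs, weights=w)[0]
        path.append(node)
    return path


def loop_erase(path):
    """Erase loops in chronological order, leaving a simple path."""
    out = []
    pos = {}  # node -> index in out
    for v in path:
        if v in pos:
            # Revisit: cut everything after the first visit of v.
            del out[pos[v] + 1:]
            pos = {u: i for i, u in enumerate(out)}
        else:
            pos[v] = len(out)
            out.append(v)
    return out


def simulate(adj, nest, food, n_ants, seed=0):
    """Run n_ants successive walks; return the final edge weights.

    Assumed rule (not taken from the abstract): edges on the
    loop-erased trajectory of each ant gain weight 1.
    """
    rng = random.Random(seed)
    weights = {frozenset((u, v)): 1.0 for u in adj for v in adj[u]}
    for _ in range(n_ants):
        simple = loop_erase(ant_walk(weights, adj, nest, food, rng))
        for u, v in zip(simple, simple[1:]):
            weights[frozenset((u, v))] += 1.0
    return weights


if __name__ == "__main__":
    # Hypothetical example: a direct nest-food edge in parallel
    # with a two-edge detour through node "a".
    adj = {
        "nest": ["a", "food"],
        "a": ["nest", "food"],
        "food": ["nest", "a"],
    }
    final = simulate(adj, "nest", "food", n_ants=5000, seed=1)
    total = sum(final.values())
    for edge, w in final.items():
        print(sorted(edge), round(w / total, 3))
```

Printing the normalised weights for increasing `n_ants` gives an informal view of whether weight concentrates on the direct edge; this is only an experiment under the assumed rule and says nothing about the paper's proofs.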
Pages: 3889-3929
Page count: 41