Strategy synthesis for zero-sum neuro-symbolic concurrent stochastic games

Cited by: 0

Authors:

Yan, Rui ^{[1]}

Santos, Gabriel ^{[1]}

Norman, Gethin ^{[1,2]}

Parker, David ^{[1]}

Kwiatkowska, Marta ^{[1]}

Affiliations:

[1] Univ Oxford, Dept Comp Sci, Oxford OX1 2JD, England

[2] Univ Glasgow, Sch Comp Sci, Glasgow G12 8QQ, Scotland

Source:

INFORMATION AND COMPUTATION | 2024 / Vol. 300

Funding:

EU Horizon 2020;

Keywords:

Stochastic games; Neuro-symbolic systems; Value iteration; Policy iteration; Borel state spaces; POLICY ITERATION; MARKOV GAMES;

DOI:

10.1016/j.ic.2024.105193

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise two probabilistic finite-state agents interacting in a shared continuous-state environment. Each agent observes the environment using a neural perception mechanism, which converts inputs such as images into symbolic percepts, and makes decisions symbolically. We focus on the class of NS-CSGs with Borel state spaces and prove the existence and measurability of the value function for zero-sum discounted cumulative rewards under piecewise-constant restrictions. To compute values and synthesise strategies, we first introduce a Borel measurable piecewise-constant (B-PWC) representation of value functions and propose a B-PWC value iteration. Second, we introduce two novel representations for the value functions and strategies, and propose a minimax-action-free policy iteration based on alternating player choices. (c) 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
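
For orientation, the value iteration mentioned in the abstract can be read against the standard minimax (Shapley) operator for zero-sum discounted stochastic games. The block below is a generic sketch, not the paper's own definition: the symbols S, A_1, A_2, r, \delta and \beta are assumed notation introduced here for illustration, and the B-PWC representation described in the abstract is not modelled.

```latex
% Assumed notation (not taken from the record): state space S, finite action
% sets A_1(s), A_2(s), reward r, transition kernel \delta, discount \beta \in (0,1).
% \mathcal{P}(A) denotes the probability distributions over a finite set A.
\[
  (TV)(s) \;=\;
  \max_{u_1 \in \mathcal{P}(A_1(s))}\;
  \min_{u_2 \in \mathcal{P}(A_2(s))}\;
  \sum_{a_1 \in A_1(s)} \sum_{a_2 \in A_2(s)}
  u_1(a_1)\, u_2(a_2)
  \Big[\, r(s,a_1,a_2) + \beta \int_{S} V(s')\, \delta(s,a_1,a_2)(\mathrm{d}s') \Big]
\]
% Value iteration repeats V_{t+1} = T V_t. Since T is a \beta-contraction in the
% sup norm, the iterates converge to the unique fixed point V^* = T V^*:
\[
  \lVert V_t - V^* \rVert_\infty \;\le\; \beta^{\,t}\, \lVert V_0 - V^* \rVert_\infty .
\]
```

Under these assumptions, each application of T solves one matrix game per state, which is the step a minimax-action-free policy iteration would aim to avoid.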


Pages: 28

