Strategy synthesis for zero-sum neuro-symbolic concurrent stochastic games

Cited by: 0
Authors
Yan, Rui [1]
Santos, Gabriel [1]
Norman, Gethin [1,2]
Parker, David [1]
Kwiatkowska, Marta [1]
Affiliations
[1] Univ Oxford, Dept Comp Sci, Oxford OX1 2JD, England
[2] Univ Glasgow, Sch Comp Sci, Glasgow G12 8QQ, Scotland
Funding
European Union Horizon 2020
Keywords
Stochastic games; Neuro-symbolic systems; Value iteration; Policy iteration; Borel state spaces; POLICY ITERATION; MARKOV GAMES
DOI
10.1016/j.ic.2024.105193
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise two probabilistic finite-state agents interacting in a shared continuous-state environment. Each agent observes the environment using a neural perception mechanism, which converts inputs such as images into symbolic percepts, and makes decisions symbolically. We focus on the class of NS-CSGs with Borel state spaces and prove the existence and measurability of the value function for zero-sum discounted cumulative rewards under piecewise-constant restrictions. To compute values and synthesise strategies, we first introduce a Borel measurable piecewise-constant (B-PWC) representation of value functions and propose a B-PWC value iteration. Second, we introduce two novel representations for the value functions and strategies, and propose a minimax-action-free policy iteration based on alternating player choices. © 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 28