Optimal Control of Logically Constrained Partially Observable and Multiagent Markov Decision Processes

Cited by: 0
Authors
Kalagarla, Krishna C. [1 ,2 ]
Kartik, Dhruva [1 ,3 ]
Shen, Dongming [1 ,4 ]
Jain, Rahul [1 ]
Nayyar, Ashutosh [1 ]
Nuzzo, Pierluigi [1 ]
Affiliations
[1] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
[2] Univ New Mexico, Elect & Comp Engn Dept, Albuquerque, NM 87106 USA
[3] Amazon, Seattle, WA 98121 USA
[4] MIT Sloan Sch Management, Cambridge, MA USA
Keywords
Logic; Planning; Robots; Optimal control; Markov decision processes; Task analysis; Stochastic processes; Markov decision processes (MDPs); multiagent systems; partially observable Markov decision processes (POMDPs); stochastic optimal control; temporal logic;
DOI
10.1109/TAC.2024.3422213
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
Autonomous systems often have logical constraints arising, for example, from safety, operational, or regulatory requirements. Such constraints can be expressed using temporal logic specifications. The system state is often partially observable. Moreover, it could encompass a team of multiple agents with a common objective but disparate information structures and constraints. In this article, we first introduce an optimal control theory for partially observable Markov decision processes with finite linear temporal logic constraints. We provide a structured methodology for synthesizing policies that maximize a cumulative reward while ensuring that the probability of satisfying a temporal logic constraint is sufficiently high. Our approach comes with guarantees on approximate reward optimality and constraint satisfaction. We then build on this approach to design an optimal control framework for logically constrained multiagent settings with information asymmetry. We illustrate the effectiveness of our approach by implementing it on several case studies.
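The policy-synthesis approach the abstract describes is typically built on a product construction: the decision process is composed with a finite automaton for the temporal-logic formula, so that constraint satisfaction reduces to reaching or staying in accepting automaton states. A minimal Python sketch of that product step for a toy safety constraint ("never see label b"); all state names, labels, and probabilities here are illustrative assumptions, not taken from the paper:

```python
# Toy MDP: state -> action -> {next_state: probability}.
mdp = {
    "s0": {"a": {"s0": 0.5, "s1": 0.5}},
    "s1": {"a": {"s0": 1.0}},
}
# Atomic-proposition labels attached to each state.
labels = {"s0": frozenset(), "s1": frozenset({"b"})}

def dfa_step(q, lab):
    """DFA for the safety formula 'always not b':
    q0 is accepting; seeing label b moves to the rejecting sink q_rej."""
    if q == "q_rej" or "b" in lab:
        return "q_rej"
    return "q0"

def product_step(state, action):
    """One transition of the product MDP over (MDP state, DFA state) pairs.
    The constraint holds along a run iff the DFA component stays accepting."""
    s, q = state
    return {(s2, dfa_step(q, labels[s2])): p
            for s2, p in mdp[s][action].items()}
```

Planning (e.g., value iteration over beliefs in the partially observable case) then runs on this product, trading reward against the probability of remaining in accepting DFA states, which is the trade-off the paper's guarantees address.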
Pages: 263 - 277
Page count: 15
Related Papers
50 items in total
  • [21] Transition Entropy in Partially Observable Markov Decision Processes
    Melo, Francisco S.
    Ribeiro, Isabel
    INTELLIGENT AUTONOMOUS SYSTEMS 9, 2006, : 282 - +
  • [22] Partially observable Markov decision processes with reward information
    Cao, XR
    Guo, XP
    2004 43RD IEEE CONFERENCE ON DECISION AND CONTROL (CDC), VOLS 1-5, 2004, : 4393 - 4398
  • [23] Partially Observable Markov Decision Processes in Robotics: A Survey
    Lauri, Mikko
    Hsu, David
    Pajarinen, Joni
    IEEE TRANSACTIONS ON ROBOTICS, 2023, 39 (01) : 21 - 40
  • [24] A primer on partially observable Markov decision processes (POMDPs)
    Chades, Iadine
    Pascal, Luz V.
    Nicol, Sam
    Fletcher, Cameron S.
    Ferrer-Mestres, Jonathan
    METHODS IN ECOLOGY AND EVOLUTION, 2021, 12 (11): 2058 - 2072
  • [25] Partially observable Markov decision processes with imprecise parameters
    Itoh, Hideaki
    Nakamura, Kiyohiko
    ARTIFICIAL INTELLIGENCE, 2007, 171 (8-9) : 453 - 490
  • [26] Minimal Disclosure in Partially Observable Markov Decision Processes
    Bertrand, Nathalie
    Genest, Blaise
    IARCS ANNUAL CONFERENCE ON FOUNDATIONS OF SOFTWARE TECHNOLOGY AND THEORETICAL COMPUTER SCIENCE (FSTTCS 2011), 2011, 13 : 411 - 422
  • [27] Nonapproximability results for partially observable Markov decision processes
    Lusena, Cristopher
    Goldsmith, Judy
    Mundhenk, Martin
    1600, Morgan Kaufmann Publishers (14):
  • [28] Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes
    Sharma, Rajneesh
    Spaan, Matthijs T. J.
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 1422 - 1429
  • [29] Control limits for two-state partially observable Markov decision processes
    Grosfeld-Nir, Abraham
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2007, 182 (01) : 300 - 304
  • [30] An optimal policy for partially observable Markov decision processes with non-independent monitors
    Jin, Lu
    Mashita, Tomoaki
    Suzuki, Kazuyuki
    JOURNAL OF QUALITY IN MAINTENANCE ENGINEERING, 2005, 11 (03) : 228 - +