Ordinal Decision Models for Markov Decision Processes

Cited by: 7
Authors
Weng, Paul [1]
Affiliation
[1] UPMC, LIP6, Paris, France
DOI
10.3233/978-1-61499-098-7-828
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Setting the values of rewards in Markov decision processes (MDPs) can be a difficult task. In this paper, we consider two ordinal decision models for MDPs in which only an order over rewards is known. The first model, recently proposed for MDPs [23], defines preferences with respect to a reference point. The second model, which can be viewed as the dual of the first, is based on quantiles. Using the first decision model, we give a new interpretation of rewards in standard MDPs, which sheds light on the preference system that standard MDPs rely on. The second model, based on quantile optimization, is a new approach for MDPs with ordinal rewards. Although quantile-based optimality is state-dependent, we prove that an optimal stationary deterministic policy exists for a given initial state. Finally, we propose solution methods based on linear programming for optimizing quantiles.
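To make the quantile criterion concrete, the sketch below compares two policies by the lower quantile of an ordinal reward distribution. It is an illustration only, not the paper's linear programming method: it assumes the distribution over final rewards induced by each policy is already given, and the names `lower_quantile`, `scale`, `policy_a` and `policy_b` are hypothetical.

```python
def lower_quantile(rewards, probs, tau):
    """Return the smallest reward whose cumulative probability reaches tau.

    rewards: ordinal labels listed from worst to best (only the order matters)
    probs:   probability of each label under a fixed policy (assumed given)
    tau:     quantile level in (0, 1]
    """
    if not 0 < tau <= 1:
        raise ValueError("tau must lie in (0, 1]")
    if len(rewards) != len(probs):
        raise ValueError("rewards and probs must have the same length")
    cumulative = 0.0
    for reward, p in zip(rewards, probs):
        cumulative += p
        # Small tolerance guards against floating-point rounding in the sum.
        if cumulative >= tau - 1e-12:
            return reward
    return rewards[-1]


# Hypothetical ordinal scale and reward distributions for two policies.
scale = ["bad", "fair", "good"]
policy_a = [0.6, 0.0, 0.4]
policy_b = [0.3, 0.4, 0.3]

median_a = lower_quantile(scale, policy_a, 0.5)
median_b = lower_quantile(scale, policy_b, 0.5)

# Policies are compared through the rank of their quantile on the scale,
# so no numeric reward values are ever needed.
best = "b" if scale.index(median_b) > scale.index(median_a) else "a"
```

In this toy comparison no numeric value is attached to "bad", "fair" or "good"; the ranking of the two policies depends only on the order of the labels and the chosen level `tau`.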
Pages: 828-833
Number of pages: 6