Ordinal Decision Models for Markov Decision Processes

Cited by: 7

Authors:

Weng, Paul [1]

Affiliations:

[1] UPMC, LIP6, Paris, France

Source:

20TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2012) | 2012 / Vol. 242

DOI:

10.3233/978-1-61499-098-7-828

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Setting the values of rewards in Markov decision processes (MDPs) may be a difficult task. In this paper, we consider two ordinal decision models for MDPs where only an order is known over rewards. The first one, which has been proposed recently in MDPs [23], defines preferences with respect to a reference point. The second model, which can be viewed as the dual approach of the first one, is based on quantiles. Based on the first decision model, we give a new interpretation of rewards in standard MDPs, which sheds some interesting light on the preference system used in standard MDPs. The second model, based on quantile optimization, is a new approach in MDPs with ordinal rewards. Although quantile-based optimality is state-dependent, we prove that an optimal stationary deterministic policy exists for a given initial state. Finally, we propose solution methods based on linear programming for optimizing quantiles.

Pages: 828-833

Number of pages: 6
