A Special Case of Partially Observable Markov Decision Processes Problem by Event-Based Optimization

Times cited: 0

Authors:

Zhang, Junyu [1]

Affiliations:

[1] Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China

Source:

PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT) | 2016

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

In this paper, we study a class of partially observable Markov decision process (POMDP) problems using the event-based optimization approach proposed in [4]. A POMDP ([7] and [8]) generalizes the standard, completely observable Markov decision process (MDP) by allowing imperfect information about the state of the system. Policy iteration algorithms for POMDPs have proved impractical because they are very difficult to implement, so most work on POMDPs has used value iteration. A special case of the POMDP, however, can be reformulated as an MDP problem. We then use the sensitivity-based view to derive the corresponding average-reward difference formula. Based on this formula and the idea of event-based optimization, we estimate the aggregated potentials from a single sample path and develop policy iteration (PI) algorithms.
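As a rough illustration of the ingredients the abstract names, the sketch below works with a small, fully observable, ergodic MDP. It does not reproduce the paper's event-based algorithm, and it computes potentials analytically from the Poisson equation rather than estimating aggregated potentials from a sample path. It assumes the standard forms of the potential equation, (I - P) g + eta e = f, and of the average-reward difference formula, eta' - eta = pi' [(P' - P) g + (f' - f)]. The function names and the two-state example (`P_all`, `f_all`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def stationary(P):
    """Stationary distribution pi (pi P = pi, sum(pi) = 1); ergodic chain assumed."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def potentials(P, f):
    """Return (pi, g), where g solves (I - P + e pi) g = f.

    This g satisfies the Poisson equation (I - P) g + eta e = f with eta = pi f.
    """
    n = P.shape[0]
    pi = stationary(P)
    g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f)
    return pi, g

def reward_difference(P, f, P2, f2):
    """Average-reward difference eta' - eta = pi' [(P' - P) g + (f' - f)]."""
    _, g = potentials(P, f)
    pi2 = stationary(P2)
    return float(pi2 @ ((P2 - P) @ g + (f2 - f)))

def policy_iteration(P_all, f_all, max_iter=100):
    """Average-reward policy iteration; P_all[a, i, j], f_all[i, a]."""
    n = f_all.shape[0]
    idx = np.arange(n)
    d = np.zeros(n, dtype=int)           # start from action 0 in every state
    for _ in range(max_iter):
        P, f = P_all[d, idx], f_all[idx, d]
        pi, g = potentials(P, f)          # policy evaluation
        q = f_all + np.einsum("aij,j->ia", P_all, g)
        d_new = d.copy()
        for i in range(n):                # switch only on strict improvement
            if q[i].max() > q[i, d[i]] + 1e-12:
                d_new[i] = int(q[i].argmax())
        if np.array_equal(d_new, d):
            break
        d = d_new
    return d, float(pi @ f)

# Hypothetical two-state, two-action example (all values are made up).
P_all = np.array([[[0.9, 0.1], [0.5, 0.5]],
                  [[0.2, 0.8], [0.3, 0.7]]])
f_all = np.array([[1.0, 0.0],
                  [2.0, 3.0]])
d_star, eta_star = policy_iteration(P_all, f_all)
```

The improvement step compares f(i, a) + sum_j P(j | i, a) g(j) across actions, which is the quantity the difference formula shows must increase for the average reward to increase.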


Pages: 1522-1526

Page count: 5
