A solving method of an MDP with a constraint by genetic algorithms

Cited by: 4
Authors
Hirayama, K [1]
Kawai, H
Affiliations
[1] Tottori Univ, Course Engn Social Dev, Tottori 680, Japan
[2] Tottori Univ, Dept Social Syst Engn, Tottori 680, Japan
Keywords
Markov decision processes; genetic algorithms; reward constraints; linear programming; pure and mixed policies
DOI
10.1016/S0895-7177(00)00084-4
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
We consider a discrete-time Markov decision process (MDP) with a finite state space, a finite action space, and two kinds of immediate rewards. The problem is to maximize the time-average reward generated by one reward stream, subject to the other reward being no smaller than a prescribed value. An MDP with a reward constraint can be solved by linear programming over the class of mixed policies. When we restrict ourselves to pure policies, however, the problem becomes a combinatorial one for which no solution method has been discovered. In this paper, we propose an approach based on genetic algorithms (GAs) that searches effectively for a near-optimal, and possibly optimal, pure stationary policy. A numerical example is given to examine the efficiency of the proposed approach. (C) 2000 Elsevier Science Ltd. All rights reserved.
Pages: 165-173
Page count: 9
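The abstract does not say how the GA encodes policies or handles the constraint, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each chromosome is a pure stationary policy (one action index per state), that every policy induces an ergodic chain so the time-average rewards can be computed from a stationary distribution, and that the constraint is enforced with a simple penalty term. The names `evaluate`, `fitness`, `genetic_search`, the `penalty` weight, and the small 3-state, 2-action MDP are all hypothetical.

```python
import random

def stationary_distribution(P, iters=500):
    """Power iteration on a row-stochastic matrix P (assumes an ergodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def evaluate(policy, P, r1, r2):
    """Time-average value of both reward streams under a pure stationary policy."""
    n = len(policy)
    chain = [P[s][policy[s]] for s in range(n)]  # transition matrix induced by the policy
    pi = stationary_distribution(chain)
    g1 = sum(pi[s] * r1[s][policy[s]] for s in range(n))
    g2 = sum(pi[s] * r2[s][policy[s]] for s in range(n))
    return g1, g2

def fitness(policy, P, r1, r2, bound, penalty=100.0):
    """Objective g1 minus a penalty for violating g2 >= bound (penalty weight is an assumption)."""
    g1, g2 = evaluate(policy, P, r1, r2)
    return g1 - penalty * max(0.0, bound - g2)

def genetic_search(P, r1, r2, bound, pop_size=20, generations=40,
                   mutation_rate=0.1, seed=0):
    """Elitist GA over pure policies with uniform crossover and per-gene mutation."""
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    score = lambda pol: fitness(pol, P, r1, r2, bound)
    pop = [[rng.randrange(n_actions) for _ in range(n_states)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[:pop_size // 2]              # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            child = [rng.randrange(n_actions) if rng.random() < mutation_rate else g
                     for g in child]                            # mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

# Hypothetical MDP: P[s][a] is the next-state distribution, r1/r2 are the two reward streams.
P = [
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]],
    [[0.3, 0.5, 0.2], [0.2, 0.2, 0.6]],
    [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]],
]
r1 = [[1.0, 4.0], [2.0, 5.0], [0.0, 6.0]]
r2 = [[3.0, 1.0], [2.0, 0.0], [4.0, 1.0]]
bound = 1.5  # prescribed lower bound on the second time-average reward

best = genetic_search(P, r1, r2, bound)
print("policy:", best, "average rewards (g1, g2):", evaluate(best, P, r1, r2))
```

With only 2^3 = 8 pure policies this toy instance could be enumerated directly; a GA becomes attractive when the number of pure policies grows exponentially with the number of states. A penalty term is just one way to handle the constraint; repair operators or feasibility-first ranking are alternatives.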