An online scalarization multi-objective reinforcement learning algorithm: TOPSIS Q-learning

Cited by: 4
Authors
Mirzanejad, Mohammad [1 ]
Ebrahimi, Morteza [1 ]
Vamplew, Peter [2 ]
Veisi, Hadi [1 ]
Affiliations
[1] Univ Tehran, Fac New Sci & Technol, Tehran, Iran
[2] Federat Univ Australia, Sch Engn Informat Technol & Phys Sci, Ballarat, Vic, Australia
Source
KNOWLEDGE ENGINEERING REVIEW | 2022, Vol. 37, Issue 4
Keywords
Decision making; E-learning; Learning algorithms
DOI
10.1017/S0269888921000163
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Conventional reinforcement learning focuses on problems with a single objective. However, many problems involve multiple objectives or criteria that may be independent, related, or contradictory. In such cases, multi-objective reinforcement learning is used to find a compromise among solutions that balances the objectives. TOPSIS is a multi-criteria decision-making method that selects the alternative with the minimum distance from the positive ideal solution and the maximum distance from the negative ideal solution, so it can be used effectively in the decision-making process to select the next action. This research presents a single-policy algorithm called TOPSIS Q-learning, with a focus on its performance in online mode. Unlike other single-policy methods, the first version of the algorithm does not require the user to specify the weights of the objectives. Because the user's preferences may not be completely definite, all weight preferences are combined as decision criteria and a solution is generated that considers all of them at once; the user can thereby model uncertainty and weight changes around their stated objective preferences. If the user wants to apply the algorithm only for a specific set of weights, the second version of the algorithm accomplishes that efficiently.
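The TOPSIS action-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustrative implementation of standard TOPSIS ranking applied to per-objective Q-values, not the authors' code; the function name `topsis_select` and the assumption that every objective is a benefit criterion (larger is better) are mine.

```python
import numpy as np

def topsis_select(q_values, weights):
    """Pick the action whose per-objective Q-values are TOPSIS-closest to the ideal.

    q_values: (n_actions, n_objectives) array of estimated Q-values
    weights:  (n_objectives,) importance weights for the objectives
    """
    q = np.asarray(q_values, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Vector-normalize each objective column (guard against all-zero columns).
    norms = np.linalg.norm(q, axis=0)
    norms[norms == 0] = 1.0
    v = (q / norms) * w

    # Positive/negative ideal solutions; all objectives treated as benefits here.
    pis = v.max(axis=0)
    nis = v.min(axis=0)

    # Euclidean distances of each action to the two ideal points.
    d_pos = np.linalg.norm(v - pis, axis=1)
    d_neg = np.linalg.norm(v - nis, axis=1)

    # Closeness coefficient: 1 means at the positive ideal, 0 at the negative.
    closeness = d_neg / (d_pos + d_neg + 1e-12)
    return int(np.argmax(closeness))
```

For example, with two objectives weighted equally, an action scoring well on both objectives is ranked above actions that excel on only one, which is the compromise behaviour the abstract describes.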
Pages: 29
Related Papers (50 total)
  • [41] Multi-objective Reinforcement Learning for Responsive Grids
    Julien Perez
    Cécile Germain-Renaud
    Balazs Kégl
    Charles Loomis
    Journal of Grid Computing, 2010, 8 : 473 - 492
  • [42] Pedestrian simulation as multi-objective reinforcement learning
    Ravichandran, Naresh Balaji
    Yang, Fangkai
    Peters, Christopher
    Lansner, Anders
    Herman, Pawel
    18TH ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS (IVA'18), 2018, : 307 - 312
  • [43] Continuous reinforcement learning to adapt multi-objective optimization online for robot motion
    Zhang, Kai
    McLeod, Sterling
    Lee, Minwoo
    Xiao, Jing
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2020, 17 (02)
  • [44] Fuzzy Q-Learning for generalization of reinforcement learning
    Berenji, HR
    FUZZ-IEEE '96 - PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, 1996, : 2208 - 2214
  • [45] Multi-objective fuzzy Q-learning to solve continuous state-action problems
    Asgharnia, Amirhossein
    Schwartz, Howard
    Atia, Mohamed
    NEUROCOMPUTING, 2023, 516 : 115 - 132
  • [46] Q-Learning Based Multi-objective Optimization Routing Strategy in UAVs Deterministic Network
    Zhou, Zou
    Chen, Longjie
    Hu, Yu
    Zheng, Fei
    Liang, Caisheng
    Li, Kelin
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND NETWORKS, VOL II, CENET 2023, 2024, 1126 : 399 - 408
  • [47] Multi-objective optimization of radiotherapy: distributed Q-learning and agent-based simulation
    Jalalimanesh, Ammar
    Haghighi, Hamidreza Shahabi
    Ahmadi, Abbas
    Hejazian, Hossein
    Soltani, Madjid
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2017, 29 (05) : 1071 - 1086
  • [48] Deep Reinforcement Learning with Double Q-Learning
    van Hasselt, Hado
    Guez, Arthur
    Silver, David
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2094 - 2100
  • [49] Reinforcement learning guidance law of Q-learning
    Zhang Q.
    Ao B.
    Zhang Q.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2020, 42 (02): : 414 - 419
  • [50] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193