An online scalarization multi-objective reinforcement learning algorithm: TOPSIS Q-learning

Cited by: 4

Authors:

Mirzanejad, Mohammad ^{[1]}

Ebrahimi, Morteza ^{[1]}

Vamplew, Peter ^{[2]}

Veisi, Hadi ^{[1]}

Affiliations:

[1] Univ Tehran, Fac New Sci & Technol, Tehran, Iran

[2] Federat Univ Australia, Sch Engn Informat Technol & Phys Sci, Ballarat, Vic, Australia

Source:

KNOWLEDGE ENGINEERING REVIEW | 2022 / Vol. 37 / Issue 04

Keywords:

Decision making - E-learning - Learning algorithms;

DOI:

10.1017/S0269888921000163

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Conventional reinforcement learning focuses on problems with a single objective. However, many problems have multiple objectives or criteria that may be independent, related, or contradictory. In such cases, multi-objective reinforcement learning is used to propose a compromise among the solutions that balances the objectives. TOPSIS is a multi-criteria decision method that selects the alternative with the minimum distance from the positive ideal solution and the maximum distance from the negative ideal solution, so it can be used effectively in the decision-making process to select the next action. In this research, a single-policy algorithm called TOPSIS Q-Learning is presented, with a focus on its performance in online mode. Unlike other single-policy methods, the first version of the algorithm does not require the user to specify the weights of the objectives. The user's preferences may not be completely definite, so all weight preferences are combined as decision criteria, and a solution is generated by considering all of these preferences at once. The user can thus model the uncertainty and the changes in objective weights around their specified preferences. If the user only wants to apply the algorithm to a specific set of weights, the second version of the algorithm accomplishes that efficiently.
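The TOPSIS selection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' TOPSIS Q-Learning algorithm: it shows only the standard TOPSIS closeness computation applied to per-action value vectors, and it assumes vector normalisation, objectives where higher is better, and user-supplied weights. The function names `topsis_scores` and `select_action` are hypothetical.

```python
import math


def topsis_scores(q_values, weights):
    """Return the TOPSIS closeness coefficient for each action.

    q_values: list of per-action vectors, one estimated value per objective
              (assumption: higher is better for every objective).
    weights:  one non-negative weight per objective (assumed given).
    """
    n_obj = len(weights)
    # Vector-normalise each objective column so that scales are comparable;
    # an all-zero column keeps a divisor of 1.0 to avoid division by zero.
    norms = [
        math.sqrt(sum(q[j] ** 2 for q in q_values)) or 1.0
        for j in range(n_obj)
    ]
    weighted = [
        [weights[j] * q[j] / norms[j] for j in range(n_obj)]
        for q in q_values
    ]
    # Positive and negative ideal points: best and worst value per objective.
    ideal = [max(row[j] for row in weighted) for j in range(n_obj)]
    anti = [min(row[j] for row in weighted) for j in range(n_obj)]

    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the positive ideal
        d_neg = math.dist(row, anti)   # distance to the negative ideal
        total = d_pos + d_neg
        # Closeness is 1 at the positive ideal and 0 at the negative ideal;
        # if all actions coincide, fall back to a neutral 0.5.
        scores.append(d_neg / total if total > 0 else 0.5)
    return scores


def select_action(q_values, weights):
    """Pick the index of the action with the highest closeness coefficient."""
    scores = topsis_scores(q_values, weights)
    return max(range(len(scores)), key=scores.__getitem__)
```

For example, with `q_values = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.8]]` and equal weights, `select_action` picks the third action: it is close to the positive ideal on both objectives, whereas each of the other two is at the worst value on one objective.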


Pages: 29

50 items in total

[31] A Multi-Objective Virtual Network Migration Algorithm Based on Reinforcement Learning
Wang, Desheng
Zhang, Weizhe
Han, Xiao
Lin, Junren
Tian, Yu-Chu
IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (02) : 2039 - 2056
[32] Multi-Objective Reinforcement Learning Algorithm and Its Improved Convergency Method
Zhao Jin
Zhang Huajun
2011 6TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2011, : 2438 - 2445
[33] Multi-objective vehicle following decision algorithm based on reinforcement learning
Dend X.-H.
Hou J.
Tan G.-H.
Wan B.-Y.
Cao T.-T.
Kongzhi yu Juece/Control and Decision, 2021, 36 (10) : 2497 - 2503
[34] Efficient Elitist Cooperative Evolutionary Algorithm for Multi-Objective Reinforcement Learning
Zhou, Dan
Du, Jiqing
Arai, Sachiyo
IEEE ACCESS, 2023, 11 : 43128 - 43139
[35] Multi-Agent Reinforcement Learning - An Exploration Using Q-Learning
Graham, Caoimhin
Bell, David
Luo, Zhihui
RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XXVI: INCORPORATING APPLICATIONS AND INNOVATIONS IN INTELLIGENT SYSTEMS XVII, 2010, : 293 - 298
[36] Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
Omura, Motoki
Osa, Takayuki
Mukuta, Yusuke
Harada, Tatsuya
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 13, 2024, : 14474 - 14481
[37] Evaluating Q-Learning Policies for Multi-objective Foraging Task in a Multi-agent Environment
Yogeswaran, M.
Ponnambalam, S. G.
INTELLIGENT ROBOTICS AND APPLICATIONS, PT II, 2010, 6425 : 587 - 598
[38] Multi-objective Reinforcement Learning for Responsive Grids
Perez, Julien
Germain-Renaud, Cecile
Kegl, Balazs
Loomis, Charles
JOURNAL OF GRID COMPUTING, 2010, 8 (03) : 473 - 492
[39] Special issue on multi-objective reinforcement learning
Drugan, Madalina
Wiering, Marco
Vamplew, Peter
Chetty, Madhu
NEUROCOMPUTING, 2017, 263 : 1 - 2
[40] A multi-objective deep reinforcement learning framework
Thanh Thi Nguyen
Ngoc Duy Nguyen
Vamplew, Peter
Nahavandi, Saeid
Dazeley, Richard
Lim, Chee Peng
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 96
