An online scalarization multi-objective reinforcement learning algorithm: TOPSIS Q-learning

Cited: 4
Authors
Mirzanejad, Mohammad [1 ]
Ebrahimi, Morteza [1 ]
Vamplew, Peter [2 ]
Veisi, Hadi [1 ]
Affiliations
[1] Univ Tehran, Fac New Sci & Technol, Tehran, Iran
[2] Federat Univ Australia, Sch Engn Informat Technol & Phys Sci, Ballarat, Vic, Australia
Source
KNOWLEDGE ENGINEERING REVIEW | 2022, Vol. 37, No. 4
Keywords
Decision making; E-learning; Learning algorithms
DOI
10.1017/S0269888921000163
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Conventional reinforcement learning focuses on problems with a single objective. However, many problems have multiple objectives or criteria that may be independent, related, or contradictory. In such cases, multi-objective reinforcement learning is used to find a compromise among solutions that balances the objectives. TOPSIS is a multi-criteria decision-making method that selects the alternative with the minimum distance from the positive ideal solution and the maximum distance from the negative ideal solution, so it can be used effectively in the decision-making process to select the next action. This research presents a single-policy algorithm called TOPSIS Q-learning, with a focus on its performance in the online setting. Unlike other single-policy methods, the first version of the algorithm does not require the user to specify the weights of the objectives. Because the user's preferences may not be completely definite, all weight preferences are combined as decision criteria and a solution is generated by considering them all at once; the user can thereby model uncertainty and weight changes around their stated objective preferences. If the user wants to apply the algorithm only for a specific set of weights, the second version of the algorithm accomplishes that efficiently.
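The abstract describes the core TOPSIS step: rank alternatives by their closeness to the positive ideal solution and distance from the negative ideal one. As an illustration only (not the paper's implementation), this selection rule can be sketched in Python; the function name `topsis_select`, the vector normalization, and the treatment of all objectives as benefits are assumptions:

```python
import numpy as np

def topsis_select(q_values, weights):
    """Pick the action whose objective vector is TOPSIS-closest to the ideal.

    q_values : (n_actions, n_objectives) array of per-objective Q-values
    weights  : (n_objectives,) preference weights, assumed non-negative
    """
    # Vector-normalize each objective column, guarding against all-zero columns.
    norms = np.linalg.norm(q_values, axis=0)
    norms[norms == 0] = 1.0
    v = (q_values / norms) * weights

    # Positive and negative ideal solutions (all objectives treated as benefits).
    ideal = v.max(axis=0)
    anti_ideal = v.min(axis=0)

    # Euclidean distances of each action to both ideals.
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti_ideal, axis=1)

    # Relative closeness: 1 means at the ideal, 0 at the negative ideal.
    closeness = d_neg / (d_pos + d_neg + 1e-12)
    return int(np.argmax(closeness))
```

With equal weights this rule favors the action that balances both objectives, while shifting the weights toward one objective shifts the chosen action accordingly.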
Pages: 29
Related papers
50 records in total
  • [1] Accelerated multi-objective task learning using modified Q-learning algorithm
    Rajamohan, Varun Prakash
    Jagatheesaperumal, Senthil Kumar
    INTERNATIONAL JOURNAL OF AD HOC AND UBIQUITOUS COMPUTING, 2024, 47 (01) : 28 - 37
  • [2] Multi-objective route recommendation method based on Q-learning algorithm
    Yu, Qingying
    Xiao, Zhenxing
    Yang, Feng
    Gong, Shan
    Shi, Gege
    Chen, Chuanming
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (04) : 7009 - 7025
  • [3] Cognitive networks QoS multi-objective strategy based on Q-learning algorithm
    Wang, B. (wangbowx@163.com), Advanced Institute of Convergence Information Technology, (07):
  • [4] Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning
    Horie, Naoto
    Matsui, Tohgoroh
    Moriyama, Koichi
    Mutoh, Atsuko
    Inuzuka, Nobuhiro
    ARTIFICIAL LIFE AND ROBOTICS, 2019, 24 (03) : 352 - 359
  • [6] A Multi-objective Reinforcement Learning Algorithm for JSSP
    Mendez-Hernandez, Beatriz M.
    Rodriguez-Bazan, Erick D.
    Martinez-Jimenez, Yailen
    Libin, Pieter
    Nowe, Ann
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: THEORETICAL NEURAL COMPUTATION, PT I, 2019, 11727 : 567 - 584
  • [7] Decomposition based Multi-Objective Evolutionary Algorithm in XCS for Multi-Objective Reinforcement Learning
    Cheng, Xiu
    Browne, Will N.
    Zhang, Mengjie
    2018 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2018, : 622 - 629
  • [8] A Novel Multi-Objective Deep Q-Network: Addressing Immediate and Delayed Rewards in Multi-Objective Q-Learning
    Zhang, Youming
    IEEE Access, 2024, 12 : 144932 - 144949
  • [9] Multi-objective virtual network embedding algorithm based on Q-learning and curiosity-driven
    He, Mengyang
    Zhuang, Lei
    Tian, Shuaikui
    Wang, Guoqing
    Zhang, Kunli
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2018,
  • [10] Decomposed Multi-objective Method Based on Q-Learning for Solving Multi-objective Combinatorial Optimization Problem
    Yang, Anju
    Liu, Yuan
    Zou, Juan
    Yang, Shengxiang
    BIO-INSPIRED COMPUTING: THEORIES AND APPLICATIONS, PT 1, BIC-TA 2023, 2024, 2061 : 59 - 73