Multi-objective fuzzy Q-learning to solve continuous state-action problems

Cited by: 4
Authors
Asgharnia, Amirhossein [1 ]
Schwartz, Howard [1 ]
Atia, Mohamed [1 ]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Reinforcement learning; Differential games; Q-learning; Multi-objective reinforcement learning;
DOI
10.1016/j.neucom.2022.10.035
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Many real-world problems are multi-objective; thus, the need for multi-objective learning and optimization algorithms is inevitable. Although multi-objective optimization algorithms are well studied, multi-objective learning algorithms have attracted less attention. In this paper, a fuzzy multi-objective reinforcement learning algorithm is proposed, which we refer to as the multi-objective fuzzy Q-learning (MOFQL) algorithm. The algorithm is implemented to solve a bi-objective reach-avoid game. The majority of multi-objective reinforcement learning algorithms proposed to date address problems in a discrete state-action domain; the MOFQL algorithm, however, can also handle problems in a continuous state-action domain. A fuzzy inference system (FIS) is implemented to estimate the value function for the bi-objective problem, and a temporal difference (TD) approach is used to update the fuzzy rules. The proposed method is a multi-policy multi-objective algorithm and can find the non-convex regions of the Pareto front. (c) 2022 Elsevier B.V. All rights reserved.
Pages: 115-132
Number of pages: 18
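
For readers who want a concrete picture of the kind of update the abstract describes, below is a minimal Python sketch of one multi-objective fuzzy Q-learning step. It assumes a zero-order Takagi-Sugeno FIS with Gaussian membership functions and a fixed linear scalarization for action selection; the class and parameter names are illustrative, and this is not the paper's MOFQL algorithm itself, which is multi-policy and can recover non-convex regions of the Pareto front that a fixed weighted sum cannot.

# Illustrative sketch only: vector-valued fuzzy Q-learning with a
# zero-order Takagi-Sugeno FIS (assumed) and linear scalarization
# for action selection (assumed); not the paper's MOFQL method.
import numpy as np

class MOFuzzyQ:
    def __init__(self, centers, sigma, n_actions, n_objectives,
                 alpha=0.1, gamma=0.95):
        self.centers = np.asarray(centers)   # (n_rules, state_dim) rule centers
        self.sigma = sigma                   # shared Gaussian width (assumption)
        self.alpha, self.gamma = alpha, gamma
        # One q-vector per (rule, action, objective) consequent.
        self.q = np.zeros((len(self.centers), n_actions, n_objectives))

    def firing(self, s):
        # Normalized rule firing strengths for state s.
        d2 = np.sum((self.centers - s) ** 2, axis=1)
        phi = np.exp(-d2 / (2.0 * self.sigma ** 2))
        return phi / phi.sum()

    def q_values(self, s):
        # Vector-valued Q(s, a): firing-weighted sum of rule consequents.
        phi = self.firing(s)
        return np.einsum("r,rak->ak", phi, self.q)  # (n_actions, n_objectives)

    def act(self, s, w, eps=0.1):
        # Epsilon-greedy on a weighted scalarization of the objective vector.
        if np.random.rand() < eps:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q_values(s) @ w))

    def update(self, s, a, r_vec, s_next, w):
        # TD(0) update of the fuzzy-rule consequents, one error per objective.
        phi = self.firing(s)
        a_next = int(np.argmax(self.q_values(s_next) @ w))  # greedy successor
        target = np.asarray(r_vec) + self.gamma * self.q_values(s_next)[a_next]
        td_error = target - self.q_values(s)[a]             # (n_objectives,)
        # Credit each rule in proportion to its firing strength.
        self.q[:, a, :] += self.alpha * np.outer(phi, td_error)

# Example setup (hypothetical): 2-D state, 3 actions, 2 objectives,
# 25 randomly placed rule centers.
# agent = MOFuzzyQ(centers=np.random.uniform(-1, 1, (25, 2)), sigma=0.4,
#                  n_actions=3, n_objectives=2)

The key design point the sketch illustrates is that each fuzzy rule's consequent stores one q-vector per action, the TD error is computed separately per objective, and the error is credited to the rules in proportion to their firing strengths, which is what lets a rule-based approximator cover a continuous state-action domain.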