Multi-objective fuzzy Q-learning to solve continuous state-action problems

Cited by: 4
Authors:
Asgharnia, Amirhossein [1]
Schwartz, Howard [1]
Atia, Mohamed [1]
Affiliations:
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON, Canada
Funding:
Natural Sciences and Engineering Research Council of Canada
Keywords:
Reinforcement learning; Differential games; Q-learning; Multi-objective reinforcement learning
DOI:
10.1016/j.neucom.2022.10.035
CLC number:
TP18 [Artificial intelligence theory]
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
Many real-world problems are multi-objective, so multi-objective learning and optimization algorithms are indispensable. Although multi-objective optimization algorithms are well studied, multi-objective learning algorithms have attracted less attention. In this paper, a fuzzy multi-objective reinforcement learning algorithm is proposed, referred to as the multi-objective fuzzy Q-learning (MOFQL) algorithm. The algorithm is implemented to solve a bi-objective reach-avoid game. Most proposed multi-objective reinforcement learning algorithms address problems in the discrete state-action domain; the MOFQL algorithm, however, can also handle problems in a continuous state-action domain. A fuzzy inference system (FIS) is implemented to estimate the value function for the bi-objective problem, and a temporal difference (TD) approach is used to update the fuzzy rules. The proposed method is a multi-policy multi-objective algorithm and can find the non-convex regions of the Pareto front. (c) 2022 Elsevier B.V. All rights reserved.
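The abstract's core idea (an FIS value-function estimate per objective, updated by temporal difference) can be illustrated with a minimal sketch. All names, dimensions, and parameter values below are illustrative assumptions, not the authors' implementation: a 1-D state covered by triangular fuzzy sets, a small discrete action set, and one Q-vector (one entry per objective) stored in each rule consequent.

```python
import numpy as np

# Illustrative dimensions (assumptions, not from the paper):
# 5 fuzzy rules over a 1-D state in [0, 1], 3 actions, 2 objectives.
N_RULES, N_ACTIONS, N_OBJ = 5, 3, 2
centers = np.linspace(0.0, 1.0, N_RULES)   # triangular fuzzy-set centers
width = centers[1] - centers[0]
q = np.zeros((N_RULES, N_ACTIONS, N_OBJ))  # per-rule, per-action Q-vectors

def firing_strengths(s):
    """Triangular membership degrees, normalized to sum to 1."""
    phi = np.maximum(0.0, 1.0 - np.abs(s - centers) / width)
    return phi / phi.sum()

def q_values(s):
    """FIS output: firing-strength-weighted sum of rule consequents -> (N_ACTIONS, N_OBJ)."""
    return np.einsum("r,rao->ao", firing_strengths(s), q)

def td_update(s, a, rewards, s_next, weights, alpha=0.1, gamma=0.95):
    """One temporal-difference step on the fuzzy rule consequents.

    `weights` scalarizes the objectives only to pick the greedy next
    action; the Q-vector itself is updated component-wise, so each
    objective keeps its own value estimate.
    """
    phi = firing_strengths(s)
    a_next = np.argmax(q_values(s_next) @ weights)  # greedy w.r.t. scalarization
    target = np.asarray(rewards) + gamma * q_values(s_next)[a_next]
    delta = target - q_values(s)[a]                 # vector-valued TD error
    q[:, a, :] += alpha * np.outer(phi, delta)      # credit rules by firing strength
```

Sweeping `weights` across episodes traces different trade-offs between the objectives; note that a multi-policy method such as the paper's is claimed to recover non-convex Pareto regions as well, which a fixed linear scalarization alone cannot.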
Pages: 115 - 132
Page count: 18
Related papers
50 records in total
  • [41] Multimodal transportation routing optimization based on multi-objective Q-learning under time uncertainty
    Tie Zhang
    Jia Cheng
    Yanbiao Zou
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10: 3133 - 3152
  • [42] Multi-objective virtual network embedding algorithm based on Q-learning and curiosity-driven
    Mengyang He
    Lei Zhuang
    Shuaikui Tian
    Guoqing Wang
    Kunli Zhang
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2018
  • [43] A decomposition-based multi-objective evolutionary algorithm with Q-learning for adaptive operator selection
    Xue, Fei
    Chen, Yuezheng
    Wang, Peiwen
    Ye, Yunsen
    Dong, Jinda
    Dong, Tingting
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (14): : 21229 - 21283
  • [44] Q-Learning Based Multi-Objective Clustering Algorithm for Cognitive Radio Ad Hoc Networks
    Hossen, Md Arman
    Yoo, Sang-Jo
    IEEE ACCESS, 2019, 7 : 181959 - 181971
  • [45] Policy Sharing Using Aggregation Trees for Q-Learning in a Continuous State and Action Spaces
    Chen, Yu-Jen
    Jiang, Wei-Cheng
    Ju, Ming-Yi
    Hwang, Kao-Shing
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2020, 12 (03) : 474 - 485
  • [46] Expression of Continuous State and Action Spaces for Q-Learning Using Neural Networks and CMAC
    Yamada, Kazuaki
    JOURNAL OF ROBOTICS AND MECHATRONICS, 2012, 24 (02) : 330 - 339
  • [47] A particular simplex algorithm to solve fuzzy lexicographic multi-objective linear programming problems and their sensitivity analysis on the priority of the fuzzy objective functions
    Ezzati, Reza
    Khorram, Esmaile
    Enayati, Ramin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 26 (05) : 2333 - 2358
  • [48] A New Method to Solve Multi-Objective Linear Fractional Problems
    Borza, Mojtaba
    Rambely, Azmin Sham
    FUZZY INFORMATION AND ENGINEERING, 2021, 13 (03) : 323 - 334
  • [49] Alternative Techniques to Solve Hard Multi-Objective Optimization Problems
    Landa Becerra, Ricardo
    Coello Coello, Carlos A.
    Hernandez-Diaz, Alfredo G.
    Caballero, Rafael
    Molina, Julian
    GECCO 2007: GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, VOL 1 AND 2, 2007, : 757 - +
  • [50] Application of multi-objective particle swarm optimization to solve a fuzzy multi-objective reliability redundancy allocation problem
    Ebrahimipour, V.
    Sheikhalishahi, M.
    2011 IEEE INTERNATIONAL SYSTEMS CONFERENCE (SYSCON 2011), 2011, : 326 - 333