PMDRL: Pareto-front-based multi-objective deep reinforcement learning

Cited by: 0
Authors
Yang F. [1]
Huang H. [1]
Shi W. [1]
Ma Y. [1]
Feng Y. [1]
Cheng G. [1]
Liu Z. [1]
Affiliations
[1] College of Systems Engineering, National University of Defense Technology, Changsha
Keywords
Deep Q network; Grid world; Multi-objective reinforcement learning; Pareto optimality theory
DOI
10.1007/s12652-022-04232-x
Abstract
Most reinforcement learning research aims to optimize an agent's policy for a single objective. However, many real-world applications inherently involve multiple, possibly conflicting, objectives. As a generalization of standard reinforcement learning, multi-objective reinforcement learning addresses the need for trade-offs between competing objectives. Instead of using single-policy techniques, which rely on heuristic information such as reward shaping, we propose a novel reinforcement learning method that learns a policy without preference information. We argue that combining Pareto optimality theory with the deep Q network is a powerful way to avoid constructing a synthetic reward function. The method performs non-dominated sorting to obtain the Pareto front set, which is computed simultaneously without assuming any weighting function or linear procedure for selecting an action. We provide theoretical guarantees for the proposed method in the Grid World experiment. Experiments on multi-objective Cartpole demonstrate that our approach achieves better performance, faster convergence, relatively good stability, and more diverse solutions than the traditional multi-objective deep Q network. © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
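The idea of selecting actions from a non-dominated set, rather than from a weighted sum of objectives, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dominates`, `non_dominated_actions`, `select_action`), the epsilon-greedy exploration, and the uniform choice among front members are all assumptions made for illustration, and the Q-vectors are made-up numbers standing in for the per-objective outputs of a Q network.

```python
import random


def dominates(a, b):
    """True if vector a Pareto-dominates b (higher is better):
    a is at least as good on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_actions(q_values):
    """Indices of actions whose Q-vectors are not dominated by any other action.

    q_values: list of per-action vectors, one entry per objective.
    """
    return [
        i
        for i, q_i in enumerate(q_values)
        if not any(dominates(q_j, q_i) for q_j in q_values)
    ]


def select_action(q_values, epsilon=0.1, rng=random):
    """Hypothetical action rule: explore with probability epsilon,
    otherwise pick uniformly among the non-dominated actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return rng.choice(non_dominated_actions(q_values))


# Illustrative Q-vectors for four actions and two objectives (made-up values).
q = [[1.0, 5.0], [3.0, 3.0], [2.0, 2.0], [5.0, 1.0]]
front = non_dominated_actions(q)
action = select_action(q, epsilon=0.0)
```

Here the third action, `[2.0, 2.0]`, is dominated by `[3.0, 3.0]` and is excluded, while the other three represent different trade-offs and all remain candidates. No weights over the objectives are needed to reach that result.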
Pages: 12663-12672
Page count: 9