A prescriptive Dirichlet power allocation policy with deep reinforcement learning

Cited by: 12
Authors
Tian, Yuan [1]
Han, Minghao [2]
Kulkarni, Chetan [3]
Fink, Olga [4]
Affiliations
[1] Swiss Fed Inst Technol, Chair Intelligent Maintenance Syst, Zurich, Switzerland
[2] Harbin Inst Technol, Dept Control Sci & Engn, Harbin, Peoples R China
[3] NASA, Ames Res Ctr, Washington, DC 20546 USA
[4] Ecole Polytech Fed Lausanne, Lab Intelligent Maintenance & Operat Syst, Lausanne, Switzerland
Funding
Swiss National Science Foundation
Keywords
Reinforcement learning; Deep learning; Prescriptive operation; Multi-power source systems; RESOURCES ALLOCATION; MANAGEMENT; STRATEGY; SYSTEM; MODEL;
DOI
10.1016/j.ress.2022.108529
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Prescribing optimal operation based on the condition of the system, and thereby potentially prolonging its remaining useful lifetime, has tremendous potential in terms of actively managing the availability, maintenance, and costs of complex systems. Reinforcement learning (RL) algorithms are particularly suitable for this type of problem given their learning capabilities. A special case of a prescriptive operation is the power allocation task, which can be considered as a sequential allocation problem whereby the action space is bounded by a simplex constraint. A general continuous action-space solution of such sequential allocation problems has still remained an open research question for RL algorithms. In continuous action space, the standard Gaussian policy applied in reinforcement learning does not support simplex constraints, while the Gaussian-softmax policy introduces a bias during training. In this work, we propose the Dirichlet policy for continuous allocation tasks and analyze the bias and variance of its policy gradients. We demonstrate that the Dirichlet policy is bias-free and provides significantly faster convergence, better performance, and better robustness to hyperparameter changes as compared to the Gaussian-softmax policy. Moreover, we demonstrate the applicability of the proposed algorithm on a prescriptive operation case in which we propose the Dirichlet power allocation policy and evaluate its performance on a case study of a set of multiple lithium-ion (Li-I) battery systems. The experimental results demonstrate the potential to prescribe optimal operation, improving the efficiency and sustainability of multi-power source systems.
Pages: 10
Related papers
50 in total
  • [1] Deep Reinforcement Learning for Trajectory Design and Power Allocation in UAV Networks
    Zhao, Nan
    Cheng, Yiqiang
    Pei, Yiyang
    Liang, Ying-Chang
    Niyato, Dusit
    ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2020,
  • [2] Dynamic User Pairing and Power Allocation for NOMA with Deep Reinforcement Learning
    Jiang, Fan
    Gu, Zesheng
    Sun, Changyin
    Ma, Rongxin
    2021 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2021,
  • [3] Optimal Power Allocation for Rate Splitting Communications With Deep Reinforcement Learning
    Hieu, Nguyen Quang
    Hoang, Dinh Thai
    Niyato, Dusit
    Kim, Dong In
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2021, 10 (12) : 2820 - 2823
  • [4] Research on power allocation of integrated VLPC based on deep reinforcement learning
    Ma, Shuai
    Li, Bing
    Sheng, Haihong
    Gu, Rongyan
    Zhou, Hui
    Wang, Hongmei
    Wang, Yue
    Li, Shiyin
    Tongxin Xuebao/Journal on Communications, 2022, 43 (08) : 121 - 130
  • [5] Joint Power Allocation and Channel Assignment for NOMA With Deep Reinforcement Learning
    He, Chaofan
    Hu, Yang
    Chen, Yan
    Zeng, Bing
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2019, 37 (10) : 2200 - 2210
  • [6] Deep Reinforcement Learning Based Power Allocation for High Throughput Satellites
    Dai, Nuoyi
    Zhou, Di
    Sheng, Min
    Li, Jiandong
    2021 IEEE 94TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2021-FALL), 2021,
  • [7] Prescriptive Maintenance of Freight Vehicles using Deep Reinforcement Learning
    Tham, Chen-Khong
    Liu, Weihao
    Chattopadhyay, Rajarshi
    2023 IEEE 97TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-SPRING, 2023,
  • [8] A Deep Reinforcement Learning-Based Whittle Index Policy for Multibeam Allocation
    Hao, Yuhang
    Fu, Jing
    Wang, Zengfu
    Pan, Quan
    2024 27TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION, FUSION 2024, 2024,
  • [9] Deep Reinforcement Learning-based Power Control and Bandwidth Allocation Policy for Weighted Cost Minimization in Wireless Networks
    Ke, Hongchang
    Wang, Hui
    Sun, Hongbin
    APPLIED INTELLIGENCE, 2023, 53 (22) : 26885 - 26906