An Actor-Critic Reinforcement Learning Approach for Energy Harvesting Communications Systems

Cited by: 7
Authors
Masadeh, Ala'eddin [1]
Wang, Zhengdao [1]
Kamal, Ahmed E. [1]
Affiliations
[1] Iowa State Univ ISU, Ames, IA 50011 USA
Funding
National Science Foundation (USA);
Keywords
Energy harvesting; Markov decision process; actor-critic; reinforcement learning; neural networks;
DOI
10.1109/icccn.2019.8846912
Chinese Library Classification
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Energy harvesting communications systems are able to provide high-quality communications services using green energy sources. This paper presents an autonomous energy harvesting communications system that is able to adapt to any environment and to optimize its behavior with experience so as to maximize the valuable received data. The considered system is a point-to-point energy harvesting communications system consisting of a source and a destination, working in an unknown and uncertain environment. The source is an energy harvesting node capable of harvesting solar energy and storing it in a finite-capacity battery. Energy can be harvested, stored, and used in amounts drawn from continuous ranges of values. Channel gains can take any value within a continuous range. Since exact information about future channel gains and harvested energy is unavailable, an architecture based on actor-critic reinforcement learning is proposed to learn a close-to-optimal transmission power allocation policy. The actor uses a stochastic parameterized policy to select an action in each state. The policy is modeled by a normal distribution with a parameterized mean and standard deviation. The actor uses policy gradient to optimize the policy's parameters. The critic uses a three-layer neural network to approximate the action-value function and to evaluate the optimized policy. Simulation results evaluate the proposed actor-critic architecture and show its ability to improve its performance with experience.
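The actor described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear parameterization of the mean and of the log standard deviation, a hypothetical three-element state feature vector (bias, battery level, channel gain), and clipping of the sampled power to the available battery energy. The critic's action-value estimate is passed in as a plain number (`q_value`) instead of being computed by a neural network. All function and parameter names are illustrative.

```python
import math
import random


def features(battery, gain):
    # Hypothetical state features: bias term, battery level, channel gain.
    return [1.0, battery, gain]


def policy_params(theta_mu, theta_sigma, x):
    # Mean is linear in the features (assumption).
    mu = sum(w * xi for w, xi in zip(theta_mu, x))
    # exp keeps the standard deviation strictly positive (assumption).
    sigma = math.exp(sum(w * xi for w, xi in zip(theta_sigma, x)))
    return mu, sigma


def sample_power(theta_mu, theta_sigma, x, battery, rng):
    # Draw an action from the normal policy, then clip it to the
    # feasible range [0, battery]. Clipping is an assumption here.
    mu, sigma = policy_params(theta_mu, theta_sigma, x)
    action = rng.gauss(mu, sigma)
    power = min(max(action, 0.0), battery)
    return action, power


def grad_log_prob(theta_mu, theta_sigma, x, action):
    # Gradient of log N(action; mu, sigma) with respect to both
    # parameter vectors, using sigma = exp(theta_sigma . x).
    mu, sigma = policy_params(theta_mu, theta_sigma, x)
    g_mu = [(action - mu) / sigma ** 2 * xi for xi in x]
    g_sigma = [((action - mu) ** 2 / sigma ** 2 - 1.0) * xi for xi in x]
    return g_mu, g_sigma


def actor_step(theta_mu, theta_sigma, x, action, q_value, lr=0.01):
    # One policy-gradient ascent step, weighted by the critic's
    # action-value estimate q_value (supplied externally here).
    g_mu, g_sigma = grad_log_prob(theta_mu, theta_sigma, x, action)
    new_mu = [w + lr * q_value * g for w, g in zip(theta_mu, g_mu)]
    new_sigma = [w + lr * q_value * g for w, g in zip(theta_sigma, g_sigma)]
    return new_mu, new_sigma
```

Parameterizing the log of the standard deviation is one common way to keep it positive without a constrained optimizer. The abstract does not state which parameterization the paper uses.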
Pages: 6
Related papers
50 in total
  • [41] Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
    Zanette, Andrea
    Wainwright, Martin J.
    Brunskill, Emma
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [42] Swarm Reinforcement Learning Method Based on an Actor-Critic Method
    Iima, Hitoshi
    Kuroe, Yasuaki
    SIMULATED EVOLUTION AND LEARNING, 2010, 6457 : 279 - 288
  • [43] Manipulator Motion Planning based on Actor-Critic Reinforcement Learning
    Li, Qiang
    Nie, Jun
    Wang, Haixia
    Lu, Xiao
    Song, Shibin
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 4248 - 4254
  • [44] Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space
    Fan, Zhou
    Su, Rui
    Zhang, Weinan
    Yu, Yong
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2279 - 2285
  • [45] Actor-critic reinforcement learning for the feedback control of a swinging chain
    Dengler, C.
    Lohmann, B.
    IFAC PAPERSONLINE, 2018, 51 (13): : 378 - 383
  • [46] Power Allocation in HetNets with Hybrid Energy Supply Using Actor-Critic Reinforcement Learning
    Wei, Yifei
    Zhang, Zhiqiang
    Yu, F. Richard
    Han, Zhu
    GLOBECOM 2017 - 2017 IEEE GLOBAL COMMUNICATIONS CONFERENCE, 2017,
  • [47] A Prioritized objective actor-critic method for deep reinforcement learning
    Ngoc Duy Nguyen
    Thanh Thi Nguyen
    Peter Vamplew
    Richard Dazeley
    Saeid Nahavandi
    Neural Computing and Applications, 2021, 33 : 10335 - 10349
  • [48] A Prioritized objective actor-critic method for deep reinforcement learning
    Nguyen, Ngoc Duy
    Nguyen, Thanh Thi
    Vamplew, Peter
    Dazeley, Richard
    Nahavandi, Saeid
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (16): : 10335 - 10349
  • [49] Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation
    Zhou, Ruida
    Liu, Tao
    Cheng, Min
    Kalathil, Dileep
    Kumar, P. R.
    Tian, Chao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [50] Evaluating Correctness of Reinforcement Learning based on Actor-Critic Algorithm
    Kim, Youngjae
    Hussain, Manzoor
    Suh, Jae-Won
    Hong, Jang-Eui
    2022 THIRTEENTH INTERNATIONAL CONFERENCE ON UBIQUITOUS AND FUTURE NETWORKS (ICUFN), 2022, : 320 - 325