Deep reinforcement learning-based beam Hopping algorithm in multibeam satellite systems

Cited by: 40
Authors
Hu, Xin [1]
Liu, Shuaijun [2]
Wang, Yipeng [1]
Xu, Lexi [3]
Zhang, Yuchen [1]
Wang, Cheng [1]
Wang, Weidong [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
[2] Chinese Acad Sci, Inst Software, Beijing 100190, Peoples R China
[3] China United Network Commun Corp, Network Technol Res Inst, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Markov processes; learning (artificial intelligence); optimisation; antenna radiation patterns; satellite antennas; telecommunication computing; telecommunication traffic; radiofrequency interference; multibeam satellite systems; transmission delay; long-term resource utilisation; BH illumination plan optimisation problem; partially observable Markov decision process; BHIP problem; DRL-BHIP algorithm; antenna radiation pattern; deep reinforcement learning-based beam Hopping algorithm; interbeam interference; ModCod constraints; traffic spatial feature; traffic temporal feature; artificial intelligence method; RESOURCE-ALLOCATION; MANAGEMENT; POWER;
DOI
10.1049/iet-com.2018.5774
Chinese Library Classification
TM [Electrical technology]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
Beam hopping (BH) is the key technology to improve the system throughput and decrease the transmission delay in multibeam satellite systems. The objective of this study is to find a policy to maximise the expected long-term resource utilisation. The BH illumination plan (BHIP) optimisation problem aimed at minimising the transmission delay is formulated and modelled as a partially observable Markov decision process. To tackle the issue of unknown dynamics and prohibitive computation, an artificial intelligence method named deep reinforcement learning (DRL) is first proposed to solve the BHIP problem in multibeam satellite systems. The proposed DRL-BHIP algorithm considers a series of realistic conditions, including the traffic demands in spatial distribution and temporal variation, ModCod constraints, antenna radiation pattern and inter-beam interference. The state reformulation concept is adopted to characterise the traffic spatial and temporal features. Simulation results show that the proposed DRL-BHIP algorithm can decrease the transmission delay and improve the system throughput compared with existing algorithms.
Pages: 2485-2491
Number of pages: 7