Model-Based Reinforcement Learning and Neural-Network-Based Policy Compression for Spacecraft Rendezvous on Resource-Constrained Embedded Systems

Cited by: 5
Authors
Yang, Zhibin [1]
Xing, Linquan [1]
Gu, Zonghua [2]
Xiao, Yingmin [1]
Zhou, Yong [1]
Huang, Zhiqiu [1]
Xue, Lei [3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Sch Comp Sci & Technol, Nanjing 210016, Peoples R China
[2] Umea Univ, Dept Appl Phys & Elect, S-90187 Umea, Sweden
[3] Shanghai Aerosp Elect Technol Inst, Shanghai 201100, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Space vehicles; Artificial neural networks; Mathematical models; Vehicle dynamics; Reinforcement learning; Predictive models; Computational modeling; Formal verification; Markov decision process (MDP); model-based reinforcement learning; spacecraft rendezvous guidance; DYNAMICS
DOI
10.1109/TII.2022.3192085
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Autonomous spacecraft rendezvous is challenging in increasingly complex space missions. In this article, we present MBRL4SRG, a model-based reinforcement learning approach for spacecraft rendezvous guidance. We build a Markov decision process (MDP) model based on the Clohessy-Wiltshire equations of spacecraft dynamics and solve it with dynamic programming to generate a decision table that serves as the optimal agent policy. Because the onboard computing system of a spacecraft is constrained in both memory size and processing speed, we train a neural network (NN) as a compact and efficient function approximator for the decision table. The NN outputs are formally verified with the verification tool ReluVal, and the verification results show that the robustness of the NN is maintained. Experimental results indicate that MBRL4SRG achieves lower computational overhead than the conventional proportional-integral-derivative (PID) algorithm, and that it offers higher trustworthiness and better computational efficiency during training than model-free reinforcement learning algorithms.
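The decision-table step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes only the decoupled out-of-plane Clohessy-Wiltshire equation z'' = -n^2 z + a, a coarse hypothetical state grid, three hypothetical thrust levels, an Euler discretization, and an illustrative cost function. Every constant and function name below is an assumption chosen for the example.

```python
import numpy as np

# All constants are assumptions for illustration, not values from the paper.
N_ORBIT = 0.0011                         # mean motion, rad/s
DT = 10.0                                # decision interval, s
ACTIONS = np.array([-0.05, 0.0, 0.05])   # thrust accelerations, m/s^2
Z_GRID = np.linspace(-100.0, 100.0, 41)  # relative position bins, m
V_GRID = np.linspace(-2.0, 2.0, 9)       # relative velocity bins, m/s
GAMMA = 0.95                             # discount factor


def nearest(grid, x):
    """Index of the grid point nearest to x, clipped to the grid range."""
    step = grid[1] - grid[0]
    idx = np.rint((x - grid[0]) / step).astype(int)
    return np.clip(idx, 0, len(grid) - 1)


def build_decision_table(sweeps=300):
    """Value iteration on the gridded MDP; returns the best action index per state."""
    Z, V = np.meshgrid(Z_GRID, V_GRID, indexing="ij")
    successors, rewards = [], []
    for a in ACTIONS:
        # One Euler step of the out-of-plane CW equation z'' = -n^2 z + a.
        z2 = Z + DT * V
        v2 = V + DT * (-N_ORBIT**2 * Z + a)
        successors.append((nearest(Z_GRID, z2), nearest(V_GRID, v2)))
        # Illustrative cost: normalized distance, speed, and a fuel penalty.
        rewards.append(-((z2 / 100.0) ** 2 + (v2 / 2.0) ** 2 + 0.05 * (a != 0.0)))
    value = np.zeros_like(Z)
    for _ in range(sweeps):
        # Bellman backup: one Q-value array per action, then maximize.
        q = np.stack([r + GAMMA * value[i, j]
                      for r, (i, j) in zip(rewards, successors)])
        value = q.max(axis=0)
    return q.argmax(axis=0)


def lookup(table, z, v):
    """Thrust acceleration the table prescribes for a continuous state (z, v)."""
    return ACTIONS[table[nearest(Z_GRID, z), nearest(V_GRID, v)]]
```

Calling `lookup(build_decision_table(), 100.0, 0.0)` returns the thrust chosen for a chaser 100 m from the target at rest. In the article's pipeline a table of this kind is what the neural network is trained to approximate; a full three-axis table grows quickly with grid resolution, which is the memory pressure the abstract refers to.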
Pages: 1107-1116
Page count: 10