Model-Based Reinforcement Learning and Neural-Network-Based Policy Compression for Spacecraft Rendezvous on Resource-Constrained Embedded Systems

Cited by: 5

Authors:

Yang, Zhibin [1]

Xing, Linquan [1]

Gu, Zonghua [2]

Xiao, Yingmin [1]

Zhou, Yong [1]

Huang, Zhiqiu [1]

Xue, Lei [3]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Sch Comp Sci & Technol, Nanjing 210016, Peoples R China

[2] Umea Univ, Dept Appl Phys & Elect, S-90187 Umea, Sweden

[3] Shanghai Aerosp Elect Technol Inst, Shanghai 201100, Peoples R China

Source:

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS | 2023 / Vol. 19 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Space vehicles; Artificial neural networks; Mathematical models; Vehicle dynamics; Reinforcement learning; Predictive models; Computational modeling; Formal verification; Markov decision process (MDP); model-based reinforcement learning; spacecraft rendezvous guidance; DYNAMICS;

DOI:

10.1109/TII.2022.3192085

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Autonomous spacecraft rendezvous is very challenging in increasingly complex space missions. In this article, we present our approach, model-based reinforcement learning for spacecraft rendezvous guidance (MBRL4SRG). We build a Markov decision process model based on the Clohessy-Wiltshire equation of spacecraft dynamics, solve it with dynamic programming, and generate the decision table as the optimal agent policy. Since the onboard computing system of a spacecraft is resource constrained in terms of both memory size and processing speed, we train a neural network (NN) as a compact and efficient function approximation to the tabular representation of the decision table. The NN outputs are formally verified using the verification tool ReluVal, and the verification results show that the robustness of the NN is maintained. Experimental results indicate that MBRL4SRG achieves lower computational overhead than the conventional proportional-integral-derivative algorithm and has higher trustworthiness and better computational efficiency during training than the model-free reinforcement learning algorithms.
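
The pipeline summarized in the abstract (a Markov decision process derived from Clohessy-Wiltshire dynamics, solved by dynamic programming into a decision table) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only the decoupled out-of-plane axis of the Clohessy-Wiltshire equations, z'' = -n^2 z + u, a coarse integer grid, an Euler step, and a quadratic cost with a fuel penalty. Every constant and function name below is hypothetical and in nondimensional units, and the NN compression and ReluVal verification steps are not shown.

```python
import itertools

# All values are illustrative assumptions, not taken from the paper.
N = 0.5                       # mean motion n (nondimensional)
DT = 1.0                      # time step
GAMMA = 0.95                  # discount factor
FUEL = 0.1                    # penalty per unit of thrust
Z_GRID = list(range(-5, 6))   # relative position cells
V_GRID = list(range(-2, 3))   # relative velocity cells
ACTIONS = (-1.0, 0.0, 1.0)    # discrete thrust accelerations


def clip_round(x, lo, hi):
    """Snap a continuous value to the nearest integer cell inside [lo, hi]."""
    return max(lo, min(hi, int(round(x))))


def step(z, v, u):
    """One Euler step of z'' = -n^2 z + u, snapped back onto the grid."""
    z_next = clip_round(z + v * DT, Z_GRID[0], Z_GRID[-1])
    v_next = clip_round(v + (-N * N * z + u) * DT, V_GRID[0], V_GRID[-1])
    return z_next, v_next


def cost(z, v, u):
    """Stage cost: squared distance from the target state plus fuel use."""
    return z * z + v * v + FUEL * abs(u)


def solve(sweeps=200):
    """Value iteration; returns the value function and the decision table."""
    states = list(itertools.product(Z_GRID, V_GRID))
    value = {s: 0.0 for s in states}
    for _ in range(sweeps):
        value = {
            (z, v): min(cost(z, v, u) + GAMMA * value[step(z, v, u)]
                        for u in ACTIONS)
            for (z, v) in states
        }
    # Greedy action per state with respect to the converged values.
    table = {
        (z, v): min(ACTIONS,
                    key=lambda u: cost(z, v, u) + GAMMA * value[step(z, v, u)])
        for (z, v) in states
    }
    return value, table


value, table = solve()
print(len(table), "table entries; action at the target state:", table[(0, 0)])
```

In this toy setting the table has one entry per grid cell, so its size grows with the product of the grid resolutions along every state dimension. That growth is the memory pressure that motivates replacing the table with a compact function approximator, as the abstract describes.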

Pages: 1107-1116

Number of pages: 10
