Efficient Penetration Testing Path Planning Based on Reinforcement Learning with Episodic Memory

Cited by: 0

Authors:

Zhou, Ziqiao [1]

Zhou, Tianyang [1]

Xu, Jinghao [2]

Zhu, Junhu [1]

Affiliations:

[1] Natl Engn Technol Res Ctr Digital Switching Syst, Henan Key Lab Informat Secur, Zhengzhou 450000, Peoples R China

[2] Informat Engn Univ, Sch Cryptog Engn, Zhengzhou 450000, Peoples R China

Source:

CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2024 / Vol. 140 / No. 03

Keywords:

Intelligent penetration testing; penetration testing path planning; reinforcement learning; episodic memory; exploration strategy;

DOI:

10.32604/cmes.2023.028553

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Intelligent penetration testing is of great significance for improving the security of information systems, and its critical issue is the planning of penetration test paths. Because attackers can rarely obtain complete network information in realistic network scenarios, Reinforcement Learning (RL) is a promising way to discover the optimal penetration path under incomplete information about the target network. Existing RL-based methods are challenged by the sizeable discrete action space, which makes convergence difficult. Moreover, most methods still rely on expert knowledge. To address these issues, this paper proposes a penetration path planning method based on reinforcement learning with episodic memory. First, the penetration testing problem is formally described in terms of reinforcement learning. To speed up training without specific prior knowledge, the proposed algorithm introduces, for the first time, episodic memory to store advantageous strategies that have already been experienced. Furthermore, the method offers an exploration strategy based on episodic memory to guide the agents in learning. The design makes full use of historical experience to reduce blind exploration and improve planning efficiency. Finally, comparison experiments are carried out against existing RL-based methods. The results show that the proposed method has better convergence performance and that the running time is reduced by more than 20%.
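
The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch of the general idea of episodic memory guiding exploration, not the authors' method. It assumes a tabular memory that keeps the best discounted return seen for each state-action pair and an epsilon-style rule that reuses the remembered best action. The names `EpisodicMemory`, `choose_action`, `gamma`, `epsilon`, and the state and action labels are all hypothetical.

```python
import random
from collections import defaultdict


class EpisodicMemory:
    """Hypothetical table: best discounted return seen per (state, action)."""

    def __init__(self):
        self.best_return = defaultdict(dict)

    def update(self, episode, gamma=0.9):
        # episode: list of (state, action, reward) tuples in time order.
        # Walk backwards so g is the discounted return from each step onward.
        g = 0.0
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            prev = self.best_return[state].get(action, float("-inf"))
            self.best_return[state][action] = max(prev, g)

    def best_action(self, state):
        # Returns None when the state has never been visited.
        actions = self.best_return.get(state)
        if not actions:
            return None
        return max(actions, key=actions.get)


def choose_action(memory, state, actions, epsilon=0.1, rng=random):
    """Reuse the remembered best action; otherwise explore at random."""
    remembered = memory.best_action(state)
    if remembered is None or rng.random() < epsilon:
        return rng.choice(actions)
    return remembered


memory = EpisodicMemory()
# Assumed toy episodes: (state, action, reward).
memory.update([("s0", "scan", 0.0), ("s1", "exploit", 10.0)])
memory.update([("s0", "exploit", -1.0)])
action = choose_action(memory, "s0", ["scan", "exploit"], epsilon=0.0)
```

With `epsilon=0.0` the agent always replays the best remembered action for a known state, while unseen states still fall back to random exploration; a larger `epsilon` trades some of that reuse for exploration.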

Pages: 2613-2634

Page count: 22
