Probabilistic Automata-Based Method for Enhancing Performance of Deep Reinforcement Learning Systems

Authors:

Min Yang [1]

Guanjun Liu [2,1]

Ziyuan Zhou [1]

Jiacun Wang [2,3]

Affiliations:

[1] Department of Computer Science, Tongji University

[2] IEEE

[3] Computer Science and Software Engineering Department, Monmouth University, West Long

Source:

IEEE/CAA Journal of Automatica Sinica | 2024, Vol. 11, No. 11

Abstract:

Deep reinforcement learning (DRL) has demonstrated significant potential in industrial manufacturing domains such as workshop scheduling and energy system management. However, because of the model's inherent uncertainty, rigorous validation is required before it is applied to real-world tasks. Specific tests may reveal inadequacies in the performance of pre-trained DRL models, while the "black-box" nature of DRL makes model behavior difficult to test. We propose a novel performance improvement framework based on probabilistic automata, which aims to proactively identify and correct critical vulnerabilities of DRL systems so that the performance of DRL models in real tasks can be improved with minimal model modifications. First, a probabilistic automaton is constructed from the historical trajectories of the DRL system by abstracting states into probabilistic decision-making units (PDMUs), and a reverse breadth-first search (BFS) method is used to identify the key PDMU-action pairs that have the greatest impact on adverse outcomes. This process relies only on the state-action sequence and the final result of each trajectory. Then, under each key PDMU, we search for the new action that has the greatest impact on favorable results. Finally, the key PDMU, the undesirable action, and the new action are encapsulated as monitors that guide the DRL system toward more favorable results through real-time monitoring and correction. Evaluations in two standard reinforcement learning environments and three actual job scheduling scenarios confirmed the effectiveness of the method, providing certain guarantees for the deployment of DRL models in real-world applications.
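
The monitoring idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the grid-based `abstract` function, the `min_visits` threshold, and all function names are hypothetical, and the key pair is chosen by plain empirical failure frequency rather than by the reverse BFS over a probabilistic automaton that the abstract describes. Like the described process, it uses only the state-action sequence and the final result of each trajectory.

```python
from collections import defaultdict


def abstract(state, cell=1.0):
    """Map a continuous state (tuple of floats) to a discrete grid cell.

    Hypothetical stand-in for the paper's state abstraction into PDMUs.
    """
    return tuple(int(x // cell) for x in state)


def build_stats(trajectories, cell=1.0):
    """Count, per (abstract state, action) pair, the number of trajectories
    that visit it and the number of those trajectories that fail."""
    visits = defaultdict(int)
    failures = defaultdict(int)
    for steps, success in trajectories:
        # Count each pair at most once per trajectory.
        seen = {(abstract(s, cell), a) for s, a in steps}
        for pair in seen:
            visits[pair] += 1
            if not success:
                failures[pair] += 1
    return visits, failures


def make_monitor(visits, failures, min_visits=2):
    """Return (key_unit, bad_action, new_action), or None if no correction
    can be proposed.

    Assumption: "greatest impact on adverse outcomes" is approximated by the
    highest empirical failure rate; the paper uses a reverse BFS instead.
    """
    rate = {p: failures[p] / visits[p] for p in visits if visits[p] >= min_visits}
    if not rate:
        return None
    key_unit, bad_action = max(rate, key=rate.get)
    # Candidate replacements: other actions observed under the same unit.
    alternatives = {a: r for (u, a), r in rate.items()
                    if u == key_unit and a != bad_action}
    if not alternatives:
        return None
    new_action = min(alternatives, key=alternatives.get)
    return key_unit, bad_action, new_action


def monitored_action(monitor, state, action, cell=1.0):
    """Replace the policy's action only when the monitor's condition matches."""
    if monitor is None:
        return action
    key_unit, bad_action, new_action = monitor
    if abstract(state, cell) == key_unit and action == bad_action:
        return new_action
    return action


# Toy data: each trajectory is (list of (state, action) steps, success flag).
trajectories = [
    ([((0.2,), 0), ((1.5,), 1)], False),
    ([((0.4,), 0), ((1.1,), 1)], False),
    ([((0.3,), 0), ((1.7,), 0)], True),
    ([((0.9,), 0), ((1.2,), 0)], True),
]
visits, failures = build_stats(trajectories)
monitor = make_monitor(visits, failures)
print(monitor)                                 # ((1,), 1, 0)
print(monitored_action(monitor, (1.3,), 1))    # 0: the bad action is corrected
print(monitored_action(monitor, (0.3,), 1))    # 1: outside the key unit, unchanged
```

In this toy data, action 1 in cell `(1,)` appears only in failed trajectories, so the monitor replaces it with action 0, which was observed only in successful ones. The wrapped policy is left untouched everywhere else, which mirrors the abstract's goal of improving results with minimal model modifications.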

Pages: 2327-2339

Page count: 13
