Collaborative multi-agents in dynamic industrial internet of things using deep reinforcement learning

Cited by: 3

Authors:

Raza, Ali [1]

Shah, Munam Ali [1]

Khattak, Hasan Ali [2]

Maple, Carsten [3]

Al-Turjman, Fadi [4]

Rauf, Hafiz Tayyab [5]

Affiliations:

[1] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan

[2] Natl Univ Sci & Technol NUST, Sch Elect Engn & Comp Sci, Islamabad 44500, Pakistan

[3] Univ Warwick, WMG, Secur Cyber Syst Res Grp, Coventry CV4 7AL, W Midlands, England

[4] Near East Univ, Res Ctr AI & IoT, Artificial Intelligence Dept, Mersin 10, Nicosia, Turkey

[5] Univ Bradford, Fac Engn & Informat, Dept Comp Sci, Bradford BD7 1AZ, W Yorkshire, England

Source:

ENVIRONMENT DEVELOPMENT AND SUSTAINABILITY | 2022 / Vol. 24 / No. 07

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Deep reinforcement learning; Multi-agents; Behavior cloning; Dynamic environment; Scalability; OBSTACLE AVOIDANCE; ENVIRONMENT; NAVIGATION; SYSTEMS;

DOI:

10.1007/s10668-021-01836-9

Chinese Library Classification (CLC) number:

X [Environmental science, safety science];

Subject classification code:

08 ; 0830 ;

Abstract:

Sustainable cities are envisioned to take economic and industrial steps toward reducing pollution. Many real-world applications, such as autonomous vehicles, transportation, traffic signals, and industrial automation, can now be trained using deep reinforcement learning (DRL) techniques. These applications are designed to take advantage of DRL to improve monitoring and measurement in the industrial internet of things for automation identification systems. The complexity of these environments makes multi-agent systems more appropriate than a single agent. However, in non-stationary environments, multi-agent systems can suffer from an increased number of observations, which limits the scalability of algorithms. This study proposes a model to tackle the scalability problem of DRL algorithms in the transportation domain. The proposed model uses a partition-based approach that keeps each agent within its own working area, which reduces both the complexity of the learning environment and the number of observations per agent. The model uses generative adversarial imitation learning (GAIL) and behavior cloning, combined with a proximal policy optimization (PPO) algorithm, to train multiple agents in a dynamic environment. We compare PPO, soft actor-critic (SAC), and our model in terms of reward gathering. Our simulation results show that our model outperforms SAC and PPO in cumulative reward gathering and dramatically improves the training of multiple agents.


Pages: 9481-9499

Page count: 19

