Deep Reinforcement Learning based Task Scheduling in Mobile Blockchain for IoT Applications

Cited by: 13

Authors:

Gao, Yang [1]

Wu, Wenjun [1]

Nan, Haixiang [1]

Sun, Yang [1]

Si, Pengbo [1]

Affiliation:

[1] Beijing University of Technology, Faculty of Information Technology, Beijing, People's Republic of China

Source:

ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC) | 2020

Funding:

National Natural Science Foundation of China; Beijing Natural Science Foundation

Keywords:

COMMUNICATION;

DOI:

10.1109/icc40277.2020.9148888

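Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract: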

The Internet of Things (IoT) has developed rapidly in recent years. Blockchain has attracted considerable attention in both academia and industry as a way to address the security problems of some IoT applications. In this paper, we consider a mobile blockchain supporting IoT applications, in which mobile edge computing (MEC) is deployed at the small-cell base station (SBS) to supplement the computation capability of IoT devices. To encourage the SBS to participate in the mobile blockchain network, we consider its long-term revenue. The task scheduling problem, which maximizes the long-term mining reward and minimizes the resource cost of the SBS, is formulated as a Markov decision process (MDP). To obtain an efficient and intelligent strategy, we propose a deep reinforcement learning (DRL) based solution, the policy gradient based computing tasks scheduling (PG-CTS) algorithm. The policy, which maps the system state to the task scheduling decision, is represented by a deep neural network. Episodic simulations are built, and the REINFORCE algorithm with baseline is used to train the policy network. In the training results, PG-CTS performs about 10% better than the second-best method, greedy. The generalization ability of PG-CTS is proved theoretically, and the testing results show that PG-CTS outperforms the other three strategies (greedy, first-in-first-out (FIFO), and random) in different environments.
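
The abstract names REINFORCE with baseline as the training method but gives no implementation detail. The following is a minimal sketch of that technique, not the PG-CTS algorithm itself. Everything specific in it is an assumption made for illustration: the three load levels, the two actions (defer or schedule), the reward table standing in for "mining reward minus resource cost", the uniformly random load arrivals, the per-step running-mean baseline, and the tabular softmax policy, which replaces the deep neural network used in the paper.

```python
import math
import random

# Hypothetical toy setting, NOT the paper's system model:
# state = SBS load level (0 low, 1 medium, 2 high); action 0 = defer, 1 = schedule.
# The reward stands in for "mining reward minus resource cost".
REWARD = {(0, 0): 0.0, (0, 1): 1.0,
          (1, 0): 0.0, (1, 1): 0.5,
          (2, 0): 0.0, (2, 1): -1.0}
N_STATES, N_ACTIONS, HORIZON = 3, 2, 10


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng):
    """Sample one episode; load levels arrive uniformly at random (assumption)."""
    trajectory = []
    for _ in range(HORIZON):
        s = rng.randrange(N_STATES)
        p_schedule = softmax(theta[s])[1]
        a = 1 if rng.random() < p_schedule else 0
        trajectory.append((s, a, REWARD[(s, a)]))
    return trajectory


def train(episodes=3000, lr=0.05, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # tabular policy logits
    baseline = [0.0] * HORIZON  # running mean of the return-to-go at each step
    for ep in range(1, episodes + 1):
        traj = run_episode(theta, rng)
        # Undiscounted return-to-go G_t for every step of the episode.
        returns, g = [0.0] * HORIZON, 0.0
        for t in reversed(range(HORIZON)):
            g += traj[t][2]
            returns[t] = g
        # The baseline comes from earlier episodes only, so it does not
        # depend on the actions being scored in this episode.
        advantages = [returns[t] - baseline[t] for t in range(HORIZON)]
        # Accumulate the policy gradient at the parameters used for sampling.
        grad = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
        for t, (s, a, _) in enumerate(traj):
            probs = softmax(theta[s])
            for k in range(N_ACTIONS):
                # Derivative of log pi(a|s) w.r.t. theta[s][k] for a softmax policy.
                indicator = 1.0 if k == a else 0.0
                grad[s][k] += (indicator - probs[k]) * advantages[t]
        for s in range(N_STATES):
            for k in range(N_ACTIONS):
                theta[s][k] += lr * grad[s][k]  # gradient ascent on expected return
        for t in range(HORIZON):
            baseline[t] += (returns[t] - baseline[t]) / ep  # update running mean
    return theta


theta = train()
policy = [softmax(row) for row in theta]  # policy[s][a] = probability of action a
```

In this toy setting the learned policy should favour scheduling at low load and deferring at high load, since those are the actions with the higher immediate reward in the assumed table. Subtracting the baseline leaves the expected gradient unchanged while reducing its variance, which is the usual motivation for the "with baseline" variant.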


Pages: 7

