Sleeping Multi-Armed Bandit Learning for Fast Uplink Grant Allocation in Machine Type Communications

Cited by: 24
Authors
Ali, Samad [1 ]
Ferdowsi, Aidin [2 ]
Saad, Walid [2 ,3 ]
Rajatheva, Nandana [1 ]
Haapola, Jussi [1 ]
Affiliations
[1] Univ Oulu, Ctr Wireless Commun CWC, Oulu 90570, Finland
[2] Virginia Tech, Bradley Dept Elect & Comp Engn, Wireless VT, Blacksburg, VA 24061 USA
[3] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 130701, South Korea
Funding
US National Science Foundation; Academy of Finland
Keywords
Uplink; Resource management; Prediction algorithms; Delays; Quality of service; Wireless communication; Machine type communications; scheduling; fast uplink grant; multi-armed bandits; Internet of Things; RESOURCE-ALLOCATION; RANDOM-ACCESS; M2M COMMUNICATIONS; LTE; INTERNET;
DOI
10.1109/TCOMM.2020.2989338
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Codes
0808; 0809
Abstract
Scheduling fast uplink grant transmissions for machine type communications (MTCs) is one of the main challenges of future wireless systems. In this paper, a novel fast uplink grant scheduling method based on the theory of multi-armed bandits (MABs) is proposed. First, a single quality-of-service metric is defined as a combination of the value of the data packets, the maximum tolerable access delay, and the data rate. Since the base station (BS) cannot know these metrics for all machine type devices (MTDs) in advance, and the set of active MTDs changes over time, the problem is modeled as a sleeping MAB with stochastic availability and a stochastic reward function. In particular, since the BS's knowledge of the set of active MTDs at each time step is only probabilistic, a novel probabilistic sleeping MAB algorithm is proposed to maximize the defined metric. A regret analysis is presented, and the effect of the source traffic prediction error on the performance of the proposed sleeping MAB algorithm is investigated. Moreover, to enable fast uplink allocation for multiple MTDs at each time step, a novel method based on the concept of best-arm ordering in the MAB setting is proposed. Simulation results show that the proposed framework yields a three-fold reduction in latency compared to a maximum-probability scheduling policy, because it prioritizes the scheduling of MTDs with stricter latency requirements. Moreover, by properly balancing the exploration-versus-exploitation tradeoff, the proposed algorithm selects the most important MTDs more often through exploitation, while exploration selects sub-optimal MTDs, which increases fairness in the system and also improves the estimates of their rewards.
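To make the scheduling idea concrete, the following is a minimal Python sketch of a sleeping UCB-style bandit scheduler in the spirit described above: at each step only a predicted-active subset of MTDs (the awake arms) is available, and the BS grants the uplink to the available MTD with the highest upper-confidence index on an empirically estimated composite QoS reward. This is an illustrative sketch, not the paper's exact algorithm; the class name, the UCB exploration bonus, and the synthetic reward model are assumptions introduced here for illustration only.

import math
import random

class SleepingUCBScheduler:
    """Illustrative sleeping-bandit scheduler (not the authors' exact method)."""

    def __init__(self, num_mtds):
        self.counts = [0] * num_mtds    # times each MTD has been scheduled
        self.means = [0.0] * num_mtds   # empirical mean composite QoS reward per MTD
        self.t = 0                      # global time step

    def select(self, available):
        """Pick one MTD index from the currently available (awake) set."""
        self.t += 1
        # First schedule any available MTD that has never been tried (pure exploration).
        for i in available:
            if self.counts[i] == 0:
                return i
        # Otherwise pick the available MTD with the largest UCB index.
        def ucb(i):
            bonus = math.sqrt(2.0 * math.log(self.t) / self.counts[i])
            return self.means[i] + bonus
        return max(available, key=ucb)

    def update(self, mtd, reward):
        """Incorporate the observed composite QoS reward for the scheduled MTD."""
        self.counts[mtd] += 1
        self.means[mtd] += (reward - self.means[mtd]) / self.counts[mtd]

# Toy usage: 10 MTDs, a random active subset at each step, and a synthetic
# reward standing in for the combined packet-value/delay/rate metric.
if __name__ == "__main__":
    random.seed(0)
    sched = SleepingUCBScheduler(num_mtds=10)
    true_value = [random.random() for _ in range(10)]   # hidden per-MTD QoS level (made up)
    for _ in range(1000):
        active = [i for i in range(10) if random.random() < 0.5]
        if not active:
            continue
        chosen = sched.select(active)
        reward = true_value[chosen] + random.gauss(0.0, 0.1)
        sched.update(chosen, reward)
    print("Scheduling counts per MTD:", sched.counts)

In this sketch the exploration bonus shrinks as an MTD is scheduled more often, so high-reward MTDs are exploited while rarely scheduled ones are still revisited, mirroring the fairness effect noted in the abstract.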
Pages: 5072 - 5086
Number of pages: 15
Related Papers
50 records in total (first 10 shown)
  • [1] Sleeping Multi-Armed Bandits for Fast Uplink Grant Allocation in Machine Type Communications
    Ali, Samad
    Ferdowsi, Aidin
    Saad, Walid
    Rajatheva, Nandana
    [J]. 2018 IEEE GLOBECOM WORKSHOPS (GC WKSHPS), 2018,
  • [2] Multi-Armed Bandit Framework for Resource Allocation in Uplink NOMA Networks
    Benamor, Amani
    Habachi, Oussama
    Kammoun, Ines
    Cances, Jean-Pierre
    [J]. 2023 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC, 2023,
  • [3] MULTI-ARMED BANDIT ALLOCATION INDEXES
    JONES, PW
    [J]. JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 1989, 40 (12) : 1158 - 1159
  • [4] Fast Uplink Grant for Machine Type Communications: Challenges and Opportunities
    Ali, Samad
    Rajatheva, Nandana
    Saad, Walid
    [J]. IEEE COMMUNICATIONS MAGAZINE, 2019, 57 (03) : 97 - 103
  • [5] Efficient Resource Allocation in Fast-Uplink Grant for Machine-Type Communications With NOMA
    El Tanab, Manal
    Hamouda, Walaa
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (18): 18113 - 18129
  • [6] Multi-user lax communications: a multi-armed bandit approach
    Avner, Orly
    Mannor, Shie
    [J]. IEEE INFOCOM 2016 - THE 35TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS, 2016,
  • [7] DYNAMIC ALLOCATION INDEX FOR THE DISCOUNTED MULTI-ARMED BANDIT PROBLEM
    GITTINS, JC
    JONES, DM
    [J]. BIOMETRIKA, 1979, 66 (03) : 561 - 565
  • [8] Adaptive Active Learning as a Multi-armed Bandit Problem
    Czarnecki, Wojciech M.
    Podolak, Igor T.
    [J]. 21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 989 - 990
  • [9] Multi-armed Bandit Algorithms for Adaptive Learning: A Survey
    Mui, John
    Lin, Fuhua
    Dewan, M. Ali Akber
    [J]. ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2021), PT II, 2021, 12749 : 273 - 278
  • [10] Mechanisms with learning for stochastic multi-armed bandit problems
    Shweta Jain
    Satyanath Bhat
    Ganesh Ghalme
    Divya Padmanabhan
    Y. Narahari
    [J]. Indian Journal of Pure and Applied Mathematics, 2016, 47 : 229 - 272