Combinatorial Multi-armed Bandits for Resource Allocation

Cited by: 1
Authors
Zuo, Jinhang [1]
Joe-Wong, Carlee [1]
Affiliations
[1] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
Keywords
Multi-armed Bandits; Resource Allocation; Optimization
DOI
10.1109/CISS50987.2021.9400228
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We study the sequential resource allocation problem where a decision maker repeatedly allocates budgets between resources. Motivating examples include allocating limited computing time or wireless spectrum bands to multiple users (i.e., resources). At each timestep, the decision maker should distribute its available budgets among different resources to maximize the expected reward, or equivalently to minimize the cumulative regret. In doing so, the decision maker should learn the value of the resources allocated for each user from feedback on each user's received reward. For example, users may send messages of different urgency over wireless spectrum bands; the reward generated by allocating spectrum to a user then depends on the message's urgency. We assume each user's reward follows a random process that is initially unknown. We design combinatorial multi-armed bandit algorithms to solve this problem with discrete or continuous budgets. We prove the proposed algorithms achieve logarithmic regrets under semi-bandit feedback.
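The abstract does not spell out the proposed algorithms, so the following is only a minimal sketch of a generic UCB-style combinatorial bandit for the discrete-budget case, not the authors' method. It assumes Bernoulli rewards in [0, 1], one unknown mean per (user, budget level) pair, and semi-bandit feedback (each user's reward is observed separately). The names `allocate`, `run_cucb`, and `true_means`, and the exploration constant 1.5, are hypothetical choices made for illustration.

```python
import math
import random


def allocate(index, budget):
    """Pick one budget level per user, with levels summing to at most `budget`,
    so that the total index value is maximal (knapsack-style dynamic program)."""
    n_users = len(index)
    # best[k][b]: best total index over the first k users using at most b units.
    best = [[0.0] * (budget + 1) for _ in range(n_users + 1)]
    choice = [[0] * (budget + 1) for _ in range(n_users)]
    for k in range(n_users):
        for b in range(budget + 1):
            best_val, best_a = -math.inf, 0
            for a in range(b + 1):
                val = best[k][b - a] + index[k][a]
                if val > best_val:
                    best_val, best_a = val, a
            best[k + 1][b] = best_val
            choice[k][b] = best_a
    # Backtrack to recover the chosen level for each user.
    alloc, b = [0] * n_users, budget
    for k in reversed(range(n_users)):
        alloc[k] = choice[k][b]
        b -= alloc[k]
    return alloc


def run_cucb(true_means, budget, horizon, seed=0):
    """Simulate the sketch. true_means[k][a] is the (assumed) mean reward of
    user k when given a budget units; it is hidden from the learner."""
    rng = random.Random(seed)
    n_users = len(true_means)
    counts = [[0] * (budget + 1) for _ in range(n_users)]
    means = [[0.0] * (budget + 1) for _ in range(n_users)]
    total_reward = 0.0
    for t in range(1, horizon + 1):
        # Optimistic index: empirical mean plus confidence bonus, capped at 1.
        # Unplayed (user, level) pairs get the maximal value 1.0.
        index = [
            [
                1.0 if counts[k][a] == 0
                else min(1.0, means[k][a]
                         + math.sqrt(1.5 * math.log(t) / counts[k][a]))
                for a in range(budget + 1)
            ]
            for k in range(n_users)
        ]
        alloc = allocate(index, budget)
        # Semi-bandit feedback: observe every user's reward individually.
        for k, a in enumerate(alloc):
            reward = 1.0 if rng.random() < true_means[k][a] else 0.0
            counts[k][a] += 1
            means[k][a] += (reward - means[k][a]) / counts[k][a]
            total_reward += reward
    return counts, total_reward
```

As a usage example, `run_cucb([[0.0, 0.3, 0.4], [0.0, 0.5, 0.9]], budget=2, horizon=200)` simulates two users sharing two budget units. The dynamic program stands in for the offline allocation step; a continuous-budget variant would need a discretization or a different optimization step, which is not sketched here.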
Pages: 4