Multilevel Constrained Bandits: A Hierarchical Upper Confidence Bound Approach with Safety Guarantees

Cited by: 0
Author
Baheri, Ali [1]
Affiliation
[1] Rochester Inst Technol, Dept Mech Engn, Rochester, NY 14623 USA
Keywords
multi-armed bandit; constrained optimization; decision making under uncertainty
DOI
10.3390/math13010149
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
The multi-armed bandit (MAB) problem is a foundational model for sequential decision-making under uncertainty. While MAB has proven valuable in applications such as clinical trials and online advertising, traditional formulations have limitations; specifically, they struggle to handle three key real-world scenarios: (1) when decisions must follow a hierarchical structure (as in autonomous systems where high-level strategy guides low-level actions); (2) when there are constraints at multiple levels of decision-making (such as both system-wide and component-level resource limits); and (3) when available actions depend on previous choices or context. To address these challenges, we introduce the hierarchical constrained bandits (HCB) framework, which extends contextual bandits to incorporate both hierarchical decisions and multilevel constraints. We propose the HC-UCB (hierarchical constrained upper confidence bound) algorithm to solve the HCB problem. The algorithm uses confidence bounds within a hierarchical setting to balance exploration and exploitation while respecting constraints at all levels. Our theoretical analysis establishes that HC-UCB achieves sublinear regret, guarantees constraint satisfaction at all hierarchical levels, and is near-optimal in terms of achievable performance. Simple experimental results demonstrate the algorithm's effectiveness in balancing reward maximization with constraint satisfaction.
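The abstract does not spell out HC-UCB itself, so the following is only a minimal sketch of the general idea it describes: confidence bounds applied at two levels of a hierarchy, with a cost constraint checked at each level. Everything here is an assumption made for illustration, not the paper's algorithm: the names `Stats`, `select`, and `run`, the UCB1-style radius, the Bernoulli rewards and costs, and the optimistic feasibility test (a node counts as feasible when its lower confidence bound on cost is within the limit). In particular, this sketch carries none of the regret or constraint-satisfaction guarantees claimed for HC-UCB.

```python
import math
import random


class Stats:
    """Running totals for one node (a high-level option or a low-level action)."""

    def __init__(self):
        self.n = 0
        self.reward_sum = 0.0
        self.cost_sum = 0.0

    def update(self, reward, cost):
        self.n += 1
        self.reward_sum += reward
        self.cost_sum += cost


def radius(n, t):
    # UCB1-style confidence radius (an assumed choice, not taken from the paper).
    return math.sqrt(2.0 * math.log(t) / n)


def select(stats, t, cost_limit):
    """Return the index with the highest reward UCB among plausibly feasible nodes."""
    best, best_ucb = None, -math.inf
    for i, s in enumerate(stats):
        if s.n == 0:
            return i  # try every node once before comparing indices
        r = radius(s.n, t)
        cost_lcb = s.cost_sum / s.n - r      # optimistic estimate of the cost
        reward_ucb = s.reward_sum / s.n + r  # optimistic estimate of the reward
        if cost_lcb <= cost_limit and reward_ucb > best_ucb:
            best, best_ucb = i, reward_ucb
    if best is None:
        # Nothing looks feasible: fall back to the lowest observed mean cost.
        best = min(range(len(stats)), key=lambda i: stats[i].cost_sum / stats[i].n)
    return best


def run(groups, high_limit, low_limit, horizon, seed=0):
    """Simulate the sketch; groups[g][a] = (mean_reward, mean_cost), both in [0, 1]."""
    rng = random.Random(seed)
    high = [Stats() for _ in groups]
    low = [[Stats() for _ in g] for g in groups]
    picks = []
    for t in range(1, horizon + 1):
        g = select(high, t, high_limit)   # high-level decision
        a = select(low[g], t, low_limit)  # low-level decision within option g
        mean_r, mean_c = groups[g][a]
        reward = float(rng.random() < mean_r)  # Bernoulli reward (assumption)
        cost = float(rng.random() < mean_c)    # Bernoulli cost (assumption)
        high[g].update(reward, cost)
        low[g][a].update(reward, cost)
        picks.append((g, a))
    return picks, high, low
```

A call such as `run([[(0.9, 0.9), (0.5, 0.1)], [(0.4, 0.1)]], 0.5, 0.5, 200)` returns the sequence of (option, action) choices together with the per-node statistics. Swapping the optimistic cost test for a pessimistic one (upper confidence bound on cost) would trade exploration for safety, but would also require an action known in advance to be feasible.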
Pages: 20