Matching in Multi-arm Bandit with Collision

Cited by: 0
Authors
Zhang, Yirui [1]
Wang, Siwei [2]
Fang, Zhixuan [1, 3]
Affiliations
[1] Tsinghua Univ, IIIS, Beijing, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
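Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract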
In this paper, we consider matching in the multi-agent multi-armed bandit problem: while agents prefer arms with higher expected reward, arms also have preferences over agents. In this setting, agents pulling the same arm may collide, which leads to a reward of zero. For this problem, we design a communication protocol that uses deliberate collisions to transmit information among agents, and propose a layer-based algorithm that helps establish optimal stable matching between agents and arms. With this communication protocol, our algorithm achieves a state-of-the-art O(log T) regret in the decentralized matching market and outperforms existing baselines in experiments.
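The idea of using deliberate collisions as a signalling channel can be illustrated with a minimal sketch. This is not the authors' protocol, which the abstract does not specify. It assumes two agents, a receiver that stays on one agreed arm, and that each agent can observe whether it collided in a round. All names (`pull_round`, `send_bits`, `encode`, `decode`) are hypothetical.

```python
from typing import Dict, List


def pull_round(choices: List[int]) -> List[bool]:
    """For each agent, report whether it collided in this round.

    Assumption: an agent collides when at least one other agent
    pulled the same arm in the same round.
    """
    counts: Dict[int, int] = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1
    return [counts[arm] > 1 for arm in choices]


def encode(value: int, width: int) -> List[int]:
    """Write a non-negative integer as `width` bits, most significant first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def decode(bits: List[int]) -> int:
    """Inverse of `encode`."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def send_bits(bits: List[int], sender_arm: int, receiver_arm: int) -> List[int]:
    """Transmit one bit per round from a sender to a receiver.

    The receiver always pulls `receiver_arm`. To send 1 the sender pulls
    the receiver's arm, causing a collision; to send 0 it stays on its
    own arm. Assumes sender_arm != receiver_arm.
    """
    received = []
    for bit in bits:
        sender_choice = receiver_arm if bit == 1 else sender_arm
        collided = pull_round([sender_choice, receiver_arm])
        # The receiver reads the bit from its own collision indicator.
        received.append(1 if collided[1] else 0)
    return received


if __name__ == "__main__":
    message = encode(5, 4)
    print(decode(send_bits(message, sender_arm=0, receiver_arm=1)))
```

In this sketch each transmitted 1 costs both agents a round of reward, which is why collision-based schemes generally try to keep the number of communicated bits small.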
Pages: 12