A Multi-armed Bandit Algorithm Available in Stationary or Non-stationary Environments Using Self-organizing Maps

Citations: 1
Authors
Manome, Nobuhito [1 ,2 ]
Shinohara, Shuji [2 ]
Suzuki, Kouta [1 ,2 ]
Tomonaga, Kosuke [1 ,2 ]
Mitsuyoshi, Shunji [2 ]
Affiliations
[1] SoftBank Robot Grp Corp, Tokyo, Japan
[2] Univ Tokyo, Tokyo, Japan
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: THEORETICAL NEURAL COMPUTATION, PT I | 2019, Vol. 11727
Keywords
Multi-armed bandit problem; Self-organizing maps; Sequential decision making;
DOI
10.1007/978-3-030-30487-4_41
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Communication robots designed to satisfy the users they face must choose rapidly among a multitude of potential actions. In practice, however, user requests often change while a robot is still determining the most appropriate action, making it difficult to settle on a suitable course of action. This issue has been formalized as the "multi-armed bandit (MAB) problem." The MAB problem concerns an environment with multiple levers (arms), where pulling an arm yields a reward with a certain probability; the task is to decide which levers to pull so as to maximize the cumulative reward. To address this problem, we propose a new MAB algorithm based on self-organizing maps that adapts to both stationary and non-stationary environments. In this paper, numerous experiments were conducted on a stochastic MAB problem in both stationary and non-stationary environments. The results show that, compared with the existing UCB1, UCB1-Tuned, and Thompson Sampling algorithms, the proposed algorithm achieves equivalent or better performance in stationary environments with many arms, and consistently strong performance in a non-stationary environment.
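The abstract benchmarks against UCB1, UCB1-Tuned, and Thompson Sampling. As context for those baselines, here is a minimal sketch of UCB1 (not the authors' SOM-based algorithm) on a Bernoulli bandit; the arm probabilities and the `ucb1` helper are illustrative assumptions, not from the paper:

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """UCB1 baseline for the stochastic multi-armed bandit.

    `pull(i)` returns a reward in [0, 1] for arm i. After pulling
    every arm once, each round plays the arm maximizing the sample
    mean plus the exploration bonus sqrt(2 ln t / n_i).
    """
    counts = [0] * n_arms      # pulls per arm
    sums = [0.0] * n_arms      # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:        # initialization: try every arm once
            arm = t - 1
        else:
            arm = max(range(n_arms),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts

# Illustrative Bernoulli test bed: arm 2 is best (p = 0.9).
random.seed(0)
probs = [0.2, 0.5, 0.9]
total, counts = ucb1(lambda i: 1.0 if random.random() < probs[i] else 0.0,
                     n_arms=3, horizon=2000)
```

In a stationary setting UCB1 concentrates its pulls on the best arm; it is precisely this kind of baseline that the paper's non-stationary experiments stress, since the exploration bonus assumes fixed reward probabilities.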
Pages: 529-540 (12 pages)
Related Papers
50 records
  • [41] EVOLUTIONARY MULTI-AGENT SYSTEMS IN NON-STATIONARY ENVIRONMENTS
    Kisiel-Dorohinicki, Marek
    COMPUTER SCIENCE-AGH, 2013, 14 (04): 563-575
  • [42] Detecting Anomalies by using Self-Organizing Maps in Industrial Environments
    Hormann, Ricardo
    Fischer, Eric
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS SECURITY AND PRIVACY (ICISSP), 2019: 336-344
  • [43] Combining stationary wavelet transform and self-organizing maps for brain MR image segmentation
    Demirhan, Ayse
    Gueler, Inan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2011, 24 (02): 358-367
  • [44] A Bayesian Multi-Armed Bandit Algorithm for Dynamic End-to-End Routing in SDN-Based Networks with Piecewise-Stationary Rewards
    Santana, Pedro
    Moura, Jose
    ALGORITHMS, 2023, 16 (05)
  • [45] A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments
    Abdelfattah, Sherif
    Kasmarik, Kathryn
    Hu, Jiankun
    ADAPTIVE BEHAVIOR, 2020, 28 (04): 273-292
  • [46] Efficient wireless network selection by using multi-armed bandit algorithm for mobile terminals
    Oshima, Koji
    Onishi, Takuma
    Kim, Song-Ju
    Ma, Jing
    Hasegawa, Mikio
    IEICE NONLINEAR THEORY AND ITS APPLICATIONS, 2020, 11 (01): 68-77
  • [47] Secure Channel Selection Using Multi-Armed Bandit Algorithm in Cognitive Radio Network
    Endo, Masahiro
    Ohtsuki, Tomoaki
    Fujii, Takeo
    Takyu, Osamu
    2017 IEEE 85TH VEHICULAR TECHNOLOGY CONFERENCE (VTC SPRING), 2017
  • [48] A Channel Allocation Algorithm for Cognitive Radio Systems using Restless Multi-armed Bandit
    Lee, Hyuk
    Lee, Jungwoo
    2013 IEEE 78TH VEHICULAR TECHNOLOGY CONFERENCE (VTC FALL), 2013
  • [49] Self-localization in non-stationary environments using omni-directional vision
    Andreasson, Henrik
    Treptow, Andre
    Duckett, Tom
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2007, 55 (07): 541-551
  • [50] On some properties of the B-Cell algorithm in non-stationary environments
    Trojanowski, Krzysztof
    Wierzchon, Slawomir T.
    ADVANCES IN INFORMATION PROCESSING AND PROTECTION, 2007: 35-44