A Satisficing Strategy with Variable Reference in the Multi-armed Bandit Problems

Cited by: 0
|
Authors
Kohno, Yu [1]
Takahashi, Tatsuji [2]
Affiliations
[1] Tokyo Denki Univ, Grad Sch Adv Sci & Technol, Hiki, Saitama 3500394, Japan
[2] Tokyo Denki Univ, Hiki, Saitama 3500394, Japan
Keywords
Symmetric reasoning; decision-making; N-armed bandit problem; speed-accuracy trade-off
DOI
10.1063/1.4912815
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
The loosely symmetric model (LS) is a subjective probability model derived from human cognitive characteristics. To put these characteristics to practical use, we developed a value function, the "loosely symmetric model with variable reference" (LS-VR), which extends LS to decision-making. An algorithm using LS-VR decides whether to explore by comparing against a reference value, so how the agent obtains that reference value from the environment is important. In this study, we proposed acquiring the reference value online using statistical knowledge. As a result, the new method outperformed a strong existing model in the multi-armed bandit problem, which is a kind of decision-making problem.
Pages: 4
Related papers
50 in total
  • [31] Scaling Multi-Armed Bandit Algorithms
    Fouche, Edouard
    Komiyama, Junpei
    Boehm, Klemens
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 1449 - 1459
  • [32] The budgeted multi-armed bandit problem
    Madani, O
    Lizotte, DJ
    Greiner, R
    LEARNING THEORY, PROCEEDINGS, 2004, 3120 : 643 - 645
  • [33] The Multi-Armed Bandit With Stochastic Plays
    Lesage-Landry, Antoine
    Taylor, Joshua A.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (07) : 2280 - 2286
  • [34] Multi-armed Bandit with Additional Observations
    Yun, Donggyu
    Proutiere, Alexandre
    Ahn, Sumyeong
    Shin, Jinwoo
    Yi, Yung
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2018, 2 (01)
  • [35] IMPROVING STRATEGIES FOR THE MULTI-ARMED BANDIT
    POHLENZ, S
    MARKOV PROCESS AND CONTROL THEORY, 1989, 54 : 158 - 163
  • [36] MULTI-ARMED BANDIT ALLOCATION INDEXES
    JONES, PW
    JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 1989, 40 (12) : 1158 - 1159
  • [37] THE MULTI-ARMED BANDIT PROBLEM WITH COVARIATES
    Perchet, Vianney
    Rigollet, Philippe
    ANNALS OF STATISTICS, 2013, 41 (02) : 693 - 721
  • [38] The Multi-fidelity Multi-armed Bandit
    Kandasamy, Kirthevasan
    Dasarathy, Gautam
    Schneider, Jeff
    Poczos, Barnabas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [39] Multi-armed Bandit with Additional Observations
    Yun D.
    Ahn S.
    Proutiere A.
    Shin J.
    Yi Y.
    2018, Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, United States (46) : 53 - 55
  • [40] A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem
    Madhushani, Udari
    Leonard, Naomi Ehrich
    2020 EUROPEAN CONTROL CONFERENCE (ECC 2020), 2020, : 1677 - 1682