ON INFORMATION ASYMMETRY IN ONLINE REINFORCEMENT LEARNING

Authors
Tampubolon, Ezra [1]
Ceribasi, Haris [1]
Boche, Holger [1,2]
Affiliations
[1] Tech Univ Munich, Lehrstuhl Theoret Informat Tech, Munich, Germany
[2] Munich Ctr Quantum Sci & Technol MCQST, Munich, Germany
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Information Asymmetry; Q-learning; Markov Game; Reinforcement Learning; Resource Allocation; Security
DOI
10.1109/ICASSP39728.2021.9413968
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this work, we study a system of two interacting non-cooperative Q-learning agents in which one agent has the privilege of observing the other's actions. We show that this information asymmetry can lead to a stable outcome of population learning, which does not generally occur in an environment of independent learners. Furthermore, we discuss the resulting post-learning policies, show that they are almost optimal in the sense of the underlying game, and provide numerical indications that the resulting policies are almost welfare-optimal.
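The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a stateless repeated 2x2 game instead of a Markov game, a hypothetical payoff table, epsilon-greedy exploration, and a discount factor of zero. The names `PAYOFF`, `epsilon_greedy`, and `train` are invented for illustration. The only feature taken from the abstract is the asymmetry itself: the informed agent conditions its Q-values on the other agent's observed action, while the uninformed agent learns over its own actions only.

```python
import random

# Hypothetical payoffs (an assumption, not from the paper):
# PAYOFF[a][b] = (reward of uninformed agent A, reward of informed agent B)
PAYOFF = [
    [(3.0, 2.0), (0.0, 0.0)],
    [(0.0, 0.0), (2.0, 3.0)],
]


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def train(steps=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Run two independent Q-learners with one-sided action observation."""
    rng = random.Random(seed)
    n_a, n_b = len(PAYOFF), len(PAYOFF[0])
    # Agent A sees nothing about B: one Q-value per own action.
    q_uninformed = [0.0] * n_a
    # Agent B observes A's action first: one Q-row per observed action.
    q_informed = [[0.0] * n_b for _ in range(n_a)]
    for _ in range(steps):
        a = epsilon_greedy(q_uninformed, epsilon, rng)
        b = epsilon_greedy(q_informed[a], epsilon, rng)  # B reacts to a
        r_a, r_b = PAYOFF[a][b]
        # Stateless Q-update (discount 0): move the estimate toward the reward.
        q_uninformed[a] += alpha * (r_a - q_uninformed[a])
        q_informed[a][b] += alpha * (r_b - q_informed[a][b])
    return q_uninformed, q_informed


if __name__ == "__main__":
    q_u, q_i = train()
    print("uninformed agent Q-values:", q_u)
    print("informed agent Q-table:", q_i)
```

In this toy game the informed agent's table tends toward a best response to each observed action, so the uninformed agent effectively faces a stationary problem. That is only an intuition for why observing actions might stabilise learning; the paper's analysis concerns the Markov game setting and is not reproduced here.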
Pages: 4955-4959
Number of pages: 5